I spend a lot of time reading the “thinking traces” of Large Language Models.
Thinking traces are the tokens (the fundamental unit of LLM output) that get expended before you ever see the result from an AI chatbot or system.
If you use ChatGPT 5.2 with “Thinking” turned on, for instance, you may have noticed that it takes a little while before it starts issuing its response. What it’s doing during that time is simulating how humans think out loud by, essentially, talking itself through a reasoning process.
What’s remarkable is that this actually works. When you have a model pretend to think things through, its ultimate output gets better.
I read the thinking traces because, in the LLM-powered apps and tools I’ve made, the thinking gives clues to why the model came to its final output. It’ll self-report why it did or didn’t do something, which can inspire better prompts, better harnesses and infrastructure, or better safety mechanisms.
And something you might notice when you study the thinking traces is that the models correct themselves, a lot. If you look at how Claude Code thinks through a plan to make a programming decision, it’ll often stop itself and say, “Wait,” and then go off in a better, more effective direction.
But it gets even more interesting: Self-correction is not necessarily default behaviour.
You can actually teach it to a model after it’s been trained, using something called supervised fine-tuning. That’s where you take a model that’s already completed its training process and give it a set of examples of the kind of output you’d like to see, and it learns to imitate those examples.
I learned about this last year, listening to an episode of This Week in Machine Learning & AI. The researcher guest was able to take an off-the-shelf, non-“reasoning” model and turn it into a reasoning model, in part by making it wait.
When the model went to emit its “stop token” (the special token that tells the software running the LLM to stop and wait for the user to respond), the researchers would suppress it, replace it with the word “wait,” and let the model keep outputting.
So, to the LLM, it appeared that it hadn’t finished its output and in fact had more ideas, perhaps ones that even contradicted what had come before.
Essentially, they taught the LLM to keep thinking even when it thought it was done, by getting it to say, “Wait,” and then keep going.
They discovered that you could do this to the model up to four times before the outputs would start to degrade. It’s as if there were a hidden, latent well of knowledge that had to be intentionally tapped.
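The control flow is simple enough to sketch. Here is a minimal, hypothetical Python version of that stop-token-suppression loop; `toy_decode` is a stand-in for a real model, and the names (`budget_force`, the `<eos>` marker) are mine for illustration, not the researchers’:

```python
def budget_force(decode, prompt, max_waits=4, stop="<eos>"):
    """Keep the model generating past its stop token, up to max_waits
    times, by replacing each stop attempt with the word 'Wait,'."""
    context = prompt
    waits = 0
    while True:
        context += decode(context)
        if not context.endswith(stop) or waits >= max_waits:
            return context.removesuffix(stop)
        # Suppress the stop token and inject a "reasoning spark" instead.
        context = context.removesuffix(stop) + " Wait,"
        waits += 1

# Toy stand-in for a real decoder: it answers hastily at first, then
# self-corrects once it has been nudged with "Wait,".
def toy_decode(context):
    if "Wait," not in context:
        return " 6 x 7 = 48." + "<eos>"
    return " I mis-multiplied; 6 x 7 is 42." + "<eos>"

result = budget_force(toy_decode, "What is 6 x 7?", max_waits=1)
print(result)
```

A real harness would resume actual model decoding from the amended context each time; the toy above only shows the shape of the loop, including the cap on forced continuations before the stop is finally accepted.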
Isn’t that interesting? Isn’t that inspiring? One research paper called these types of tokens “reasoning sparks.”
But, wait... I don’t think this is just about LLMs. I think it works on us, too.
It reminds me of the saying: “Genius hesitates.” Great, innovative, novel work is the result of thought, not just action.
This feels like another way to “prompt our own uniqueness,” by giving ourselves special phrases we can use to, effectively, trick ourselves into having better ideas.
Or you could think of this in terms of George Saunders’ “ritual banality avoidance,” where you deny yourself the easy sentence, the easy phrase, the easy content, and you push through to something grander, more profound, more useful.
Which isn’t how I’d describe most marketing content these days. When I scroll through LinkedIn or even watch ads from the Big Game, what I see is rushing. Panicking. Getting the thing done before the next interruption or reaction.
I think we can do better. I think marketers and entrepreneurs can do better.
Not by speeding up, or arming themselves with endless automations, but by slowing down. At least at the thinking stage.
By waiting yourself.
This is how it works:
The next time you write a LinkedIn post, or a newsletter, or a sales message, or some copy for your website, do this:
Write out the first version, or even just the first paragraph, and then write “Wait—” And keep writing.
Or, “But, that means...” and then the next thought you have.
Or, “Which tells us that...” and then make a new connection.
What this does is slow you down. It forces you to reject the easy, obvious, or rote answer and get down to things that only you know.
Now, I can hear some of you saying, “This might work on LLMs, but it doesn’t work on me,” to which I say, “Try it and then tell me.”
Because I believe that if you slow down, if you give yourself not just time but triggers to pause and consider, your content will get better.
It will get more interesting, more novel, more useful.
Because nobody needs more content merely done and shoved online.
They don’t need the empty slickness of AI slop.
They need the texture of humanity, of insight, of experience.
So don’t rush your content like it’s a task on the to-do list. Take your time like it’s important, like it matters, because it does.
It’s how you demonstrate your value at a distance.
It’s how you show your prospects that you’re not just another provider, another business, another bill.
You’re someone who knows what you’re talking about because you know what you’re thinking about.
But wait…
You know, it’s funny, thinking slows LLMs down. A lot. Those thinking tokens can chew up API credits, too, costing the people who use them more and more.
But, in my apps, I turn thinking tokens on. I’m willing to pay the cost of waiting.
Why? Because when something really matters, it matters to get it right.
And I think your marketing content matters.
Which means, sometimes, the best way to make content isn’t to do more, faster.
It’s to wait.
Until you get the right answer.
Kelford Inc. shows you the way to always knowing what to say.