“Thinking” is just an arbitrary process for generating additional prompt tokens. The vendors have realized that people suck at writing prompts, and that their models lack causal or state models of anything; the models are simply good at word substitution into a context that is similar enough to the prompt they’re given. So the solution to sucky prompt writing, and to selling people on the models’ capacity (think “full self driving”: it has never been full self driving, but it’s marketed that way to make people believe it is super capable), is to have the model itself look up better prompt templates within its training data, ones that tend to produce better-looking and better-sounding answers.
The thinking is not thinking. It’s fancier probabilistic lookup.
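To be concrete about the mechanism I’m describing, here is a minimal sketch of the loop. Nothing in it is any vendor’s real API: generate() is a hypothetical stand-in for an autoregressive model call, and the <think> tags are just an assumed prompt convention.

    # Hypothetical sketch of "thinking" as plain token generation.
    # `generate(context, stop)` is any next-token sampler that appends
    # text to `context` until it emits `stop`; it is a stand-in, not a
    # real library call.
    from typing import Callable

    def answer_with_thinking(user_prompt: str,
                             generate: Callable[[str, str], str]) -> str:
        # Step 1: the model writes itself a better prompt, drawing on
        # templates it has absorbed from its training data.
        context = f"Question: {user_prompt}\n<think>"
        thinking = generate(context, "</think>")

        # Step 2: the same next-token machinery runs again, now
        # conditioned on the original prompt *plus* the tokens it just
        # generated for itself.
        context = context + thinking + "</think>\nAnswer:"
        return generate(context, "\n\n")

Both calls are the same word-substitution engine; the only difference is that the second one gets to condition on extra tokens the first one pulled out of its training distribution.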