Yep:
https://openai.com/index/learning-to-reason-with-llms/
First interactive section. Make sure to click “show chain of thought.”
The cipher one is particularly interesting, as it’s intentionally difficult for the model.
Tokenization is famously bad for letter-level counting, which is why previous models can’t count the number of r’s in “strawberry”: the model sees multi-character tokens, not individual letters.
The cipher depends on two-letter pairs, so you can see how it trips over the tokenization around the xx at the end of the last word and then gradually corrects course.
It should help clarify how, behind the scenes, it goes about solving something like the example I posted earlier.
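For anyone who wants to check the cipher by hand, here is a minimal sketch of the decoding rule as it appears in the demo's worked example: split each word into two-letter pairs, average the alphabet positions of the two letters, and map the result back to a letter. The function name `decode_pairs` is made up for illustration, and the rule itself is inferred from the demo's example rather than from any official spec.

```python
def decode_pairs(ciphertext: str) -> str:
    """Decode a pair-average cipher (assumed rule, inferred from the demo).

    Each two-letter pair maps to the letter whose alphabet position
    (a=1 ... z=26) is the average of the pair's two positions.
    """
    decoded_words = []
    for word in ciphertext.lower().split():
        if len(word) % 2 != 0:
            raise ValueError(f"word {word!r} cannot be split into pairs")
        letters = []
        for i in range(0, len(word), 2):
            first = ord(word[i]) - ord("a") + 1
            second = ord(word[i + 1]) - ord("a") + 1
            # Integer average; the demo's pairs always sum to an even number.
            letters.append(chr((first + second) // 2 + ord("a") - 1))
        decoded_words.append("".join(letters))
    return " ".join(decoded_words)


# Worked example from the demo prompt: "oy" -> (15 + 25) / 2 = 20 -> "t"
print(decode_pairs("oyfjdnisdr rtqwainr acxz mynzbhhx"))  # think step by step
```

The point is that the rule is trivial once the input is split into individual characters. The hard part for the model is getting clean two-letter pairs out of text that it only sees as multi-character tokens.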
Actually, they are hiding the full CoT sequence outside of the demos.
What you are seeing there is a summary, and because the actual process is hidden, it’s not possible to see what actually transpired.
People are very unhappy about this aspect of the situation.
It also means that model context (which research has shown to be much more influential than previously thought) is now partly hidden, with exclusive access and control held by OAI.
There are a lot of things to focus on in that image, and “hur dur the stochastic model can’t count letters in this cherry-picked example” is the least of them.