
Investigate the Quiet-STaR paper and its thought scratchpad


wassname/quiet-star

 
 


Experiments with seeing the secret thoughts of LLMs

I had a play with Quiet-STaR

It has "private thoughts" that have never been tuned to human preferences. I was curious about its private thoughts.

  • occasionally there is a glimpse of duplicity
  • mostly they are garbled or unrelated, like regular CoT
  • I am curious what a larger model would show

The thoughts share these properties with normal chain-of-thought (CoT) reasoning:

  • the conclusion is sometimes not faithful to the reasoning
  • it's sometimes garbled (normal for a small 7B-parameter model)

I think all these differences could become more distinct in a larger model. And it's fascinating to see thoughts that have been trained to be effective, not to please humans. However, some of the thoughts contradict each other, so it would be even nicer if we could somehow make sure that they are actually used. This is an open research question.

Links:


Quiet-STaR

Code for Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking.

This project is implemented by simply patching the base Mistral implementation in Huggingface transformers using a new modeling_mistral.py and a new configuration_mistral.py and otherwise applying standard transformers features (e.g. the default Trainer). Our patches were applied to Huggingface's transformers version 4.37.0.dev0 under src/transformers/models/mistral/ -- we cannot guarantee that other changes to their implementation will not affect our implementation, so for reproducibility, we encourage using the same version.
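Since the patches target one specific transformers version, a quick check before running anything may save some confusion. The following is only a sketch: the helper names are made up for illustration, and the expected version string is simply the one quoted above.

```python
from importlib.metadata import PackageNotFoundError, version

# Version string quoted in the README above.
EXPECTED_VERSION = "4.37.0.dev0"


def matches_expected(installed, expected=EXPECTED_VERSION):
    """Return True only when the installed version string is exactly the expected one."""
    return installed == expected


def installed_transformers_version():
    """Return the installed transformers version, or None if it is not installed."""
    try:
        return version("transformers")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_transformers_version()
    if found is None:
        print("transformers is not installed")
    elif not matches_expected(found):
        print(f"warning: found transformers {found}, patches were written for {EXPECTED_VERSION}")
    else:
        print("transformers version matches the patched version")
```

An exact string comparison is deliberately strict here, because the README makes no claim about compatibility with neighbouring versions.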

One pitfall to be wary of: the model is not taught not to generate start and end thought tokens. Thus, when performing actual inference, it is necessary to mask these out.
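The masking step can be sketched in plain Python as below. This is an illustration of the idea, not the repository's code: the token ids are hypothetical placeholders (look up the real ones from the tokenizer), and the helper names are invented here. The idea is to set the logits of the start and end thought tokens to negative infinity before picking the next token, so they can never be emitted.

```python
import math

# Hypothetical ids for the start and end thought tokens (assumption for this sketch).
START_THOUGHT_ID = 2
END_THOUGHT_ID = 3


def mask_thought_tokens(logits, banned_ids):
    """Return a copy of `logits` with every banned token id set to -inf."""
    masked = list(logits)
    for token_id in banned_ids:
        if 0 <= token_id < len(masked):
            masked[token_id] = -math.inf
    return masked


def greedy_pick(logits):
    """Return the index of the largest logit (greedy decoding)."""
    return max(range(len(logits)), key=logits.__getitem__)


# Toy vocabulary of 4 tokens; the thought tokens have the highest raw scores.
raw_logits = [0.1, 2.0, 5.0, 4.0]
safe_logits = mask_thought_tokens(raw_logits, {START_THOUGHT_ID, END_THOUGHT_ID})
next_token = greedy_pick(safe_logits)  # picks token 1, the best non-thought token
```

With stock transformers models, the `bad_words_ids` argument of `generate` serves a similar purpose; whether it behaves correctly with this patched Mistral implementation has not been checked here.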

We make an 8-thought-token ahead (including start and end tokens) model available via Huggingface.
