
Added more details to the discussion of optimizers/teleprompters. #951

Merged 13 commits into stanfordnlp:main on May 11, 2024

Conversation

rpgoldman (Contributor)

Add class diagrams for the teleprompters, to be included in the
documentation.

Expanded the discussion to try to clarify what the various optimizers do.

There should be no change here to the function of the code, just some
documentation to make it easier for the user/maintainer to follow.
Added this in the course of figuring out the teleprompters well enough
to figure out how to use them.
@rpgoldman (Contributor, Author)

This pull request needs a bit of screening. In particular,

  1. there are comments inline in 6-optimizers.md which indicate places where I was not sure that I was explaining correctly. Those should be corrected if necessary, and then the comments removed.
  2. There's a minor FIXME in optuna that could be taken care of (or, if it's wrong, could just be removed).
  3. There's another minor FIXME in bootstrap.py
  4. There's a "QUESTION:" in bootstrap.py that shows a place where I got puzzled. It might be that the variables could be given better names, or it might be that I'm just missing something.

Finally, I don't know how docusaurus works so the method I used to put an image into 6-bootstrap.md might have been wrong.

@arnavsinghvi11 (Collaborator) commented May 5, 2024

Hi @rpgoldman , thanks for the contributions to the documentation. These are much needed!

The diagram is great. Could you remove the other image formats and add just the png to the relevant documentation as is done for images like here:

I made a pass over the documentation and made some corrections. Also removed the in-line comments/questions and moved them here. Feel free to follow up on any if needed.

TBQH, I don't understand how Optuna does this. As far as I can tell it simply chooses best based on multiple evaluations, rather than a single one, and mention of "hyperparameters" seems to be a red herring.

Optuna is similar to BootstrapFewShotWithRandomSearch, but replaces the random search with Optuna's objective-driven optimization: it treats each candidate program's score as the objective to optimize and runs a set of trials. The compiled program it outputs mirrors the automatic selection of few-shot examples in the prompt.
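As a rough sketch of that trial loop (illustrative stdlib Python only, not DSPy's or Optuna's actual code; `score_candidate` is a hypothetical stand-in for evaluating a candidate program against the metric):

```python
import random

def optimize_over_trials(candidates, score_candidate, n_trials=20, seed=0):
    """Toy Optuna-style loop: each trial samples a candidate program
    (here uniformly; Optuna would use its sampler), scores it with the
    metric, and the best-scoring candidate seen wins."""
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(n_trials):
        candidate = rng.choice(candidates)
        score = score_candidate(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score

# Toy "program candidates": each is just a named few-shot demo subset here.
scores = {"demos_a": 0.4, "demos_b": 0.9, "demos_c": 0.6}
best, score = optimize_over_trials(list(scores), scores.get, n_trials=20)
```

The point being that the score of each candidate is the quantity optimized over trials, rather than any model hyperparameter.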

I'm not at all sure that this is right. I couldn't follow the KNN code in the repo, so I just assumed that dspy was trying to cover the space of possible examples by picking centers of different clusters.

The KNNFewShot optimizer essentially clusters the provided set of training examples and applies the fewshot optimization given this example clustering. Feel free to check out this reference notebook for how to use it!

"compiled_knn = knn_teleprompter.compile(BasicQABot(), trainset=trainset)"

Wouldn't it make sense to simply use LabeledFewShot with k set to use all of the demos?

This may lead to some overfitting, and BootstrapFewShot in fact covers this with max_labeled_demos, but it also provides bootstrapped examples from the model to offer more model-representative behavior in the compiled prompt. In the case of fewer examples, it may make more sense to use a larger model at compile time to get more accurate bootstrapped examples, and then use a smaller model at inference time with this learned behavior.

The following example says that "we want to "bootstrap" (i.e., self-generate) 8-shot examples of your program's steps." But won't it actually give 6 demonstrations, 3 taken from the examples (max_labeled_demos=3) and 3 self-generated (max_bootstrapped_demos=3)? Also, aren't the defaults of 16 labeled + 4 bootstrapped, for a total of 20-shot prompting, awfully high?

Fixed the typo. The defaults are just configurations used during experiments in the paper and may not be appropriate for all use cases, hence left configurable and as maximums.
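To make that arithmetic concrete under the additive reading discussed above (a hypothetical helper for illustration, not the actual bootstrap.py code): each demo source is capped independently, so the prompt carries at most the sum of the two caps.

```python
def mix_demos(bootstrapped, labeled, max_bootstrapped_demos, max_labeled_demos):
    """Hypothetical sketch: cap each demo source independently and
    concatenate, so the compiled prompt carries at most
    max_bootstrapped_demos + max_labeled_demos demonstrations."""
    return bootstrapped[:max_bootstrapped_demos] + labeled[:max_labeled_demos]

# 3 self-generated + 3 taken from the trainset -> 6 demonstrations total.
demos = mix_demos(["b1", "b2", "b3", "b4"], ["l1", "l2", "l3", "l4"], 3, 3)
```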

QUESTION: What is the meaning of self.validation and self.valset? Why is it that valset
overrides validation if it is supplied? What is the relationship between the valset
parameter to compile and the trainset parameter? I note that none of the examples in the
docs seem to use this parameter.

valset is for when you have a validation split from your trainset that you would like to optimize the program on; it is particularly useful in BootstrapFewShotWithRandomSearch for determining scores over a set of candidates. This is not to be confused with an "evalset"! It is left as an optional parameter: if the user doesn't supply a validation split, the optimization handles it with a randomized selection of train examples to bootstrap and validate on.
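A rough sketch of that fallback (illustrative only, with invented names; not the actual bootstrap.py logic): when no valset is supplied, randomly partition the trainset into a bootstrap split and a validation split.

```python
import random

def split_train_val(trainset, valset=None, val_fraction=0.5, seed=0):
    """If an explicit valset is given, use it as-is; otherwise shuffle
    the trainset and carve the validation split out of it."""
    if valset is not None:
        return list(trainset), list(valset)
    shuffled = list(trainset)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - val_fraction))
    return shuffled[:cut], shuffled[cut:]

examples = list(range(10))
bootstrap_split, validation_split = split_train_val(examples)
```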

Three inline review threads on docs/docs/building-blocks/6-optimizers.md (outdated, resolved).
@rpgoldman (Contributor, Author)

Hi @rpgoldman , thanks for the contributions to the documentation. These are much needed!

The diagram is great. Could you remove the other image formats and add just the png to the relevant documentation as is done for images like here:

Would it be OK to retain the .dot file, since that is the source of all the other formats? Would it help to put a comment into the Markdown file to explain how I generated the dot file? I've removed the pdf file for now.

I made a pass over the documentation and made some corrections. Also removed the in-line comments/questions and moved them here. Feel free to follow up on any if needed.

I marked two comments that weren't questions, just explanations, which I thought might be worth keeping.

TBQH, I don't understand how Optuna does this. As far as I can tell it simply chooses best based on multiple evaluations, rather than a single one, and mention of "hyperparameters" seems to be a red herring.

Optuna is similar to BootstrapFewShotWithRandomSearch, but replaces the random search with Optuna's objective-driven optimization: it treats each candidate program's score as the objective to optimize and runs a set of trials. The compiled program it outputs mirrors the automatic selection of few-shot examples in the prompt.

See my comment on the file -- even though Optuna is a hyperparameter optimizer, it doesn't look like dspy uses it for that purpose here. It looks like it's just optimizing the choice of examples, which isn't a hyperparameter.

I'm not at all sure that this is right. I couldn't follow the KNN code in the repo, so I just assumed that dspy was trying to cover the space of possible examples by picking centers of different clusters.

The KNNFewShot optimizer essentially clusters the provided set of training examples and applies the fewshot optimization given this example clustering. Feel free to check out this reference notebook for how to use it!

It wasn't clear to me what the purpose of the clustering was. That's what I was trying to explain -- does dspy use the clusters as I suggested, to make sure that the space is covered by choosing elements from different clusters, instead of choosing a bunch of examples from a single cluster?

QUESTION: What is the meaning of self.validation and self.valset? Why is it that valset
overrides validation if it is supplied? What is the relationship between the valset
parameter to compile and the trainset parameter? I note that none of the examples in the
docs seem to use this parameter.

valset is for when you have a validation split from your trainset that you would like to optimize the program on; it is particularly useful in BootstrapFewShotWithRandomSearch for determining scores over a set of candidates. This is not to be confused with an "evalset"! It is left as an optional parameter: if the user doesn't supply a validation split, the optimization handles it with a randomized selection of train examples to bootstrap and validate on.

One thing I still don't understand is why the term valset is used for the argument instead of devset. I will see about tweaking the docstring to clarify according to your explanation, but it might be helpful to say why this new term is introduced.

@rpgoldman (Contributor, Author)

P.S. I don't know what the Ruff fix is, I'm afraid. If there's a pointer somewhere that explains it, please let me know.

@rpgoldman (Contributor, Author)

The KNNFewShot optimizer essentially clusters the provided set of training examples and applies the fewshot optimization given this example clustering. Feel free to check out this reference notebook for how to use it!

"compiled_knn = knn_teleprompter.compile(BasicQABot(), trainset=trainset)"

This notebook refers to "kNN Few-Shot":

This notebook shows how KNN few-shot can be implemented...

I figure it would help to add a reference. Do you know if this article is what that refers to? If so, I could add that link to the notebook in this PR.

@arnavsinghvi11 (Collaborator)

Would it be OK to retain the .dot file, since that is the source of all the other formats? Would it help to put a comment into the Markdown file to explain how I generated the dot file? I've removed the pdf file for now.

Is it important for the documentation to keep the .dot file? I think it would be best to only include final product images on the repo, as done with other documentation, e.g. `![Dataset Loading Process in HotPotQA Class](./img/data-loading.png)`. (We'd like to avoid adding too many non-code-related files besides the hosted dspy-docs subtree.)

P.S. I don't know what the Ruff fix is

Running `ruff check . --fix-only` and pushing will fix it!

It wasn't clear to me what the purpose of the clustering was. That's what I was trying to explain -- does dspy use the clusters as I suggested, to make sure that the space is covered by choosing elements from different clusters, instead of choosing a bunch of examples from a single cluster?

Yes, DSPy uses the KNN technique to pick a diverse set of examples from different clusters and then optimizes using FewShot, with examples pre-optimized using KNN (making the bootstrapping process stronger). This is most useful when there's a lot of data over random spaces: KNN helps optimize the trainset used for BootstrapFewShot (related to #77). The notebook details this with an example of DSPy KNN few-shot.
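As a toy illustration of that selection step (stdlib only; not the repo's KNN code, which embeds examples with a vectorizer): given vector representations of the examples, pick the k training examples nearest to a query to serve as its few-shot demos.

```python
import math

def knn_select(query_vec, train, k=2):
    """train: list of (vector, example) pairs. Return the k examples
    whose vectors are closest to the query (Euclidean distance)."""
    ranked = sorted(train, key=lambda pair: math.dist(pair[0], query_vec))
    return [example for _, example in ranked[:k]]

# Two nearby training points and one far-away one.
train = [((0.0, 0.0), "a"), ((0.1, 0.0), "b"), ((5.0, 5.0), "c")]
picked = knn_select((0.0, 0.1), train, k=2)
```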

One thing I still don't understand is why the term valset is used for the argument instead of devset. I will see about tweaking the docstring to clarify according to your explanation, but it might be helpful to say why this new term is introduced.

I think this is also a bit semantics-related and can remain unchanged for now, unless there is a strong reason to change otherwise (and will likely need refactoring across the rest of the repo if so).

@rpgoldman (Contributor, Author)

Would it be OK to retain the .dot file, since that is the source of all the other formats? Would it help to put a comment into the Markdown file to explain how I generated the dot file? I've removed the pdf file for now.

Is it important for the documentation to keep the .dot file? I think it would be best to only include final product images on the repo, as done with other documentation, e.g. `![Dataset Loading Process in HotPotQA Class](./img/data-loading.png)`. (We'd like to avoid adding too many non-code-related files besides the hosted dspy-docs subtree.)

Done!

P.S. I don't know what the Ruff fix is

Running `ruff check . --fix-only` and pushing will fix it!

Done! I see now that it's a linter.

Add a comment to the markdown to explain how the class diagram was
generated, so that it can be updated as more teleprompters are added.
@rpgoldman (Contributor, Author)

I added a comment to the markdown to explain the process of generating the class hierarchy figure, so that it can be updated later.

- Add key ideas from Arnav's KNN explanation (in the issue) and
- Clarify that only MIPRO optimizes the demonstration set.
@rpgoldman (Contributor, Author)

One thing I still don't understand is why the term valset is used for the argument instead of devset. I will see about tweaking the docstring to clarify according to your explanation, but it might be helpful to say why this new term is introduced.

I think this is also a bit semantics-related and can remain unchanged for now, unless there is a strong reason to change otherwise (and will likely need refactoring across the rest of the repo if so).

I think it would be best to simply note this deviation from the otherwise-standard use of "devset" somewhere in the documentation. If one wanted to do more, I'd suggest introducing devset as an alternative parameter name and binding valset to the value of the devset parameter if supplied. In the best of all possible worlds, I'd try to make the usage consistent across the library, but this is only a minor point.
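Sketched out (a hypothetical signature for illustration, not the current compile code), that aliasing would look something like:

```python
def compile(student, *, trainset, valset=None, devset=None):
    """Hypothetical sketch: accept devset as an alias for valset,
    with an explicit valset taking precedence if both are given."""
    if valset is None:
        valset = devset
    # ... the rest of compile would proceed unchanged ...
    return {"trainset": trainset, "valset": valset}

result = compile(None, trainset=[1, 2], devset=[3])
```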

@rpgoldman (Contributor, Author)

If you are happy with what's there now, I think it's ok to merge.

@rpgoldman (Contributor, Author)

@arnavsinghvi11 Your explanation of KNN was very helpful; I pulled a couple of sentences into the Markdown.

@rpgoldman (Contributor, Author)

P.S. The pointer to the KNN notebook probably should go somewhere else, but I suggest keeping it here until there's a page for the KNN optimizer added to the Teleprompters/Optimizers section of the "Deep Dive."

@rpgoldman rpgoldman marked this pull request as ready for review May 7, 2024 03:10
@rpgoldman (Contributor, Author)

Sorry, forgot to clear the "Draft" flag.



## What DSPy Optimizers are currently available?

<!-- The following diagram was generated by: -->
Collaborator (inline review comment):

Please give yourself credit! :)

@arnavsinghvi11 (Collaborator)

Thanks @rpgoldman for this amazing PR on documentation. Left a small comment for you to give yourself credit for the PNG and should be ready to merge. (I left the comments you had for generating the PNG since it makes sense for that process, but lmk if you wanted to remove that before merging).

@arnavsinghvi11 arnavsinghvi11 merged commit bcf47c8 into stanfordnlp:main May 11, 2024
4 checks passed
@arnavsinghvi11 (Collaborator)

Thanks @rpgoldman !

arnavsinghvi11 added a commit that referenced this pull request Jul 12, 2024
Added more details to the discussion of optimizers/teleprompters.