[docs] LLM prompting guide #26274
Conversation
The documentation is not available anymore as the PR was closed or merged.
The first draft of the LLM prompting guide is ready for review; let me know if anything major is missing. cc @patrickvonplaten
Very nice! My main feedback is:

- Let's maybe not pass default parameters, to keep the pipeline call simple. E.g. I think we can remove all the `num_return_sequences=1` statements as well as the `eos_token_id=...` statements, as the model should have that set as the default (see here).
- For tasks that do pure classification (sentiment analysis) or NER, where there is arguably one and only one answer and the model only generates a few tokens, I think it'd be better to not set `do_sample=True` and instead leave the decoding greedy - I don't think we want to introduce any randomness there. Maybe a 1-2 liner explaining the difference could also help.
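To illustrate the greedy-vs-sampling distinction mentioned above, here is a toy sketch in plain Python (the distribution and labels are made up for illustration; this is not the actual `transformers` decoding code). Greedy decoding (`do_sample=False`) always picks the most probable token and is deterministic, while sampling (`do_sample=True`) draws from the distribution and can return different answers across runs:

```python
import random

# Hypothetical next-token distribution for a sentiment task
# (token -> probability); invented for illustration, not from a real model.
probs = {"positive": 0.6, "negative": 0.3, "neutral": 0.1}

def greedy(probs):
    # do_sample=False: always return the most likely token -> deterministic
    return max(probs, key=probs.get)

def sample(probs, rng):
    # do_sample=True: draw a token proportionally to its probability
    return rng.choices(list(probs), weights=list(probs.values()))[0]

rng = random.Random(0)
print(greedy(probs))  # always "positive"
print({sample(probs, rng) for _ in range(20)})  # typically several labels
```

This is why, for tasks with exactly one correct answer, greedy decoding avoids introducing randomness into the result.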
> This is a wrong answer, it should be 12. In this case, this can be due to the prompt being too basic, or due to the choice of model, after all we've picked the smallest version of Falcon. Reasoning is difficult for models of all sizes, but larger models are likely to perform better.
Nice I like this!
> * "Lead" the output in the right direction by writing the first word (or even begin the first sentence for the model).
> * Use advanced techniques like [Few-shot prompting](#few-shot-prompting) and [Chain-of-thought](#chain-of-thought).
> * Test your prompts with different models to assess their robustness.
> * Version and track the performance of your prompts.
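The quoted tips can be sketched with a small helper (illustrative only; the function name and task are invented) that combines few-shot prompting with "leading" the output by ending the prompt exactly where the answer should start:

```python
def few_shot_prompt(examples, query):
    """Build a few-shot prompt: each example demonstrates the task,
    and the prompt ends right where the model should write the answer."""
    blocks = [f"Text: {text}\nSentiment: {label}" for text, label in examples]
    # End with "Sentiment:" to "lead" the model straight into the answer.
    blocks.append(f"Text: {query}\nSentiment:")
    return "\n\n".join(blocks)

prompt = few_shot_prompt(
    [("I loved this movie!", "positive"), ("Terrible service.", "negative")],
    "The food was okay.",
)
print(prompt)
```

The two demonstrations show the model the expected format, and the trailing `Sentiment:` constrains the continuation to a single label.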
Nice tips!
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Looks great :)
Only left a few comments
Co-authored-by: Lysandre Debut <hi@lysand.re>
Feel free to merge when satisfied with it!

@LysandreJik I'm happy with it, but I think we should wait for @gante to review it once he's back from vacation.

Gently pinging @gante for a review :)
This is a great guide, with clear examples, great suggestions, and relevant caveats. Big thumbs up, thank you for writing this guide @MKhalusova 💛
* llm prompting guide
* updated code examples
* an attempt to fix the code example tests
* set seed in examples
* added a doctest comment
* added einops to the doc_test_job
* string formatting
* string formatting, again
* added the toc to slow_documentation_tests.txt
* minor list fix
* string formatting + pipe renamed
* Apply suggestions from code review
  Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* replaced max_length with max_new_tokens and updated the outputs to match
* minor formatting fix
* removed einops from circleci config
* Apply suggestions from code review
  Co-authored-by: Lysandre Debut <hi@lysand.re>
* removed einops and trust_remote_code parameter

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>
What does this PR do?
This PR addresses part 2.2 ("Prompting" ) of the issue #24575
It adds an LLM Prompting Guide to the docs that covers the following topics:
Let me know if there's anything missing that should be included.