Synthetic Preference Data Generation Using Nemotron-4 340B

The provided notebook demonstrates how to leverage Llama 3.1 405B Instruct and Nemotron-4 340B Reward through build.nvidia.com.
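As a quick orientation, the sketch below shows one way to call these models through build.nvidia.com's OpenAI-compatible API. The endpoint URL and model ID are assumptions for illustration; check build.nvidia.com for the exact values for your account and export your API key as `NVIDIA_API_KEY`.

```python
# Minimal sketch of querying build.nvidia.com via its OpenAI-compatible API.
# The base_url and model ID below are assumptions; verify them on build.nvidia.com.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",  # assumed endpoint
    api_key=os.environ["NVIDIA_API_KEY"],
)

# Generate a candidate response with Llama 3.1 405B Instruct.
completion = client.chat.completions.create(
    model="meta/llama-3.1-405b-instruct",  # assumed model ID
    messages=[{"role": "user", "content": "Explain synthetic preference data."}],
    temperature=0.7,
    max_tokens=512,
)
print(completion.choices[0].message.content)
```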

The notebook walks through the following pipeline:

[Pipeline diagram]

The pipeline is designed to create a preference dataset suitable for training a custom reward model using the SteerLM method. Because consecutive responses (e.g. sample 1 with 2, 3 with 4, etc.) share the same prompt, the dataset can also be used to build preference pairs for training an RLHF reward model or for DPO, using the helpfulness score. A small sketch of this pairing step follows below.
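The following sketch (not taken from the notebook) illustrates how consecutive scored samples could be turned into chosen/rejected pairs. It assumes each record is a dict with `prompt`, `response`, and a `helpfulness` score, and that samples 1 and 2, 3 and 4, etc. share the same prompt, as described above.

```python
# Pair consecutive samples that share a prompt into chosen/rejected records
# for DPO or RLHF reward-model training, ranked by the helpfulness score.
# Field names ("prompt", "response", "helpfulness") are assumptions for this sketch.
def build_preference_pairs(samples):
    pairs = []
    for a, b in zip(samples[::2], samples[1::2]):
        assert a["prompt"] == b["prompt"], "consecutive samples must share a prompt"
        chosen, rejected = (a, b) if a["helpfulness"] >= b["helpfulness"] else (b, a)
        pairs.append({
            "prompt": a["prompt"],
            "chosen": chosen["response"],
            "rejected": rejected["response"],
        })
    return pairs
```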