Update rlhf.md (#1178) [skip ci]
AlekseyKorshuk committed Jan 23, 2024
1 parent 59a31fe commit dc051b8
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions docs/rlhf.md
@@ -19,14 +19,14 @@ The various RL training methods are implemented in trl and wrapped via axolotl.

 #### DPO
 ```yaml
-rl: true
+rl: dpo
 datasets:
   - path: Intel/orca_dpo_pairs
     split: train
-    type: intel_apply_chatml
+    type: chatml.intel
   - path: argilla/ultrafeedback-binarized-preferences
     split: train
-    type: argilla_apply_chatml
+    type: chatml.argilla
 ```

 #### IPO