v0.10.0

@b-chu b-chu released this 02 Jul 13:31
742f340

🚀 LLM Foundry v0.10.0

New Features

Registry for ICL datasets (#1252)

ICL (in-context learning) datasets are now managed through a registry.
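
As a rough illustration of what a registry provides, the sketch below maps string names to dataset constructors so that a config can refer to a dataset by name. This is a generic pattern, not LLM Foundry's actual registry API: `ICL_DATASETS`, `register`, `build`, and `ToyMultipleChoiceDataset` are all hypothetical names introduced for this example.

```python
from typing import Callable, Dict

# Hypothetical registry: maps a string name to a dataset constructor.
ICL_DATASETS: Dict[str, Callable[..., object]] = {}


def register(name: str):
    """Return a decorator that records a dataset class under `name`."""
    def wrap(cls):
        if name in ICL_DATASETS:
            raise KeyError(f"{name!r} is already registered")
        ICL_DATASETS[name] = cls
        return cls
    return wrap


@register("toy_multiple_choice")
class ToyMultipleChoiceDataset:
    """Stand-in dataset class; a real one would load and tokenize data."""

    def __init__(self, path: str):
        self.path = path


def build(name: str, **kwargs):
    """Look up a registered dataset by name and construct it."""
    if name not in ICL_DATASETS:
        raise KeyError(f"unknown ICL dataset {name!r}; known: {sorted(ICL_DATASETS)}")
    return ICL_DATASETS[name](**kwargs)


dataset = build("toy_multiple_choice", path="data.jsonl")
```

With this pattern, supporting a custom dataset type only requires registering a new class; the code that builds datasets from a config does not change.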

Curriculum Learning Callback (#1256)

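You can now switch dataloaders during training, which enables curriculum learning. Each schedule entry pairs a duration in tokens with the train_loader to use for that stage; the first entry matches the top-level train_loader.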

train_loader:
  <dataloader parameters>
callback:
  curriculum_learning:
  - duration: <number>tok
    train_loader:  # matches top level train_loader
      <dataloader parameters>
  - duration: <number>tok
    train_loader:
      <dataloader parameters>
  - duration: <number>tok
    train_loader:
      <dataloader parameters>
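
To make the schedule semantics concrete, here is a minimal sketch of how a list of token durations could map to the active stage. It is not the callback's implementation. It assumes that durations are consecutive token budgets, that they are written as `<number>tok`, and that training stays on the last stage once the schedule is exhausted; `parse_tokens` and `active_stage` are hypothetical helpers.

```python
from typing import Dict, List


def parse_tokens(duration: str) -> int:
    """Parse a duration such as '1000tok' into a token count."""
    if not duration.endswith("tok"):
        raise ValueError(f"expected a '<number>tok' duration, got {duration!r}")
    return int(duration[: -len("tok")])


def active_stage(schedule: List[Dict[str, str]], tokens_seen: int) -> int:
    """Return the index of the schedule entry that covers `tokens_seen`."""
    boundary = 0
    for idx, entry in enumerate(schedule):
        boundary += parse_tokens(entry["duration"])
        if tokens_seen < boundary:
            return idx
    # Assumption: past the end of the schedule, keep using the last stage.
    return len(schedule) - 1


schedule = [
    {"duration": "100tok"},  # stage 0: matches the top-level train_loader
    {"duration": "200tok"},  # stage 1
    {"duration": "50tok"},   # stage 2
]

for tokens in (0, 99, 100, 299, 300):
    print(tokens, "->", active_stage(schedule, tokens))
```

In this reading, a dataloader switch happens whenever the cumulative token count crosses a stage boundary.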

[Experimental] Interweave Attention Layers (#1299)

You can now override the default block config for specific layers, for example to use a different sliding window size or to reuse a previous layer's KV cache.

model:
    ...
    (usual model configs)
    ...
    block_overrides:
        order:
        - name: default
        - order:
          - name: sliding_window_layer
          - name: sliding_window_layer_reuse
          - name: sliding_window_layer
          - repeat: 2
            name: sliding_window_layer_reuse
          - name: reuse_kv_layer
          repeat: 2
        overrides:
            sliding_window_layer:
                attn_config:
                    sliding_window_size: 1024
            sliding_window_layer_reuse:
                attn_config:
                    sliding_window_size: 1024
                    reuse_kv_layer_idx: -1 # Relative index of the layer whose kv cache to reuse
            reuse_kv_layer:
                attn_config:
                    reuse_kv_layer_idx: -6 # Relative index of the layer whose kv cache to reuse
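
The nested `order` list can be read as a recipe that expands into a flat sequence of per-layer block names. The sketch below shows one plausible expansion of the config above. It is an assumed reading, not LLM Foundry's code: `expand_order` is a hypothetical helper, and it assumes that an entry without `repeat` occurs once and that `repeat` on a nested `order` repeats the whole sub-sequence.

```python
from typing import Any, Dict, List


def expand_order(order: List[Dict[str, Any]]) -> List[str]:
    """Flatten a nested order spec into a list of block names, one per layer."""
    names: List[str] = []
    for item in order:
        repeat = item.get("repeat", 1)  # assumption: missing repeat means once
        if "order" in item:
            group = expand_order(item["order"])  # nested sub-sequence
        else:
            group = [item["name"]]
        names.extend(group * repeat)
    return names


# The same structure as the `order` key in the config above.
order = [
    {"name": "default"},
    {
        "order": [
            {"name": "sliding_window_layer"},
            {"name": "sliding_window_layer_reuse"},
            {"name": "sliding_window_layer"},
            {"repeat": 2, "name": "sliding_window_layer_reuse"},
            {"name": "reuse_kv_layer"},
        ],
        "repeat": 2,
    },
]

layers = expand_order(order)
for idx, name in enumerate(layers):
    print(idx, name)
```

Under these assumptions the example describes 13 layers: one default layer followed by two copies of a six-layer group. The first `reuse_kv_layer` lands at index 6, so its relative index of -6 points back at the default layer at index 0.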

Bug fixes

What's Changed

New Contributors

Full Changelog: v0.9.1...v0.10.0