
v0.7.3

@guenthermi released this 16 Mar 16:15 · 9ef750b

Release Note Finetuner 0.7.3

This release covers Finetuner version 0.7.3, including dependencies finetuner-api 0.5.4 and finetuner-core 0.12.9.

This release contains 4 new features, 1 refactoring, and 1 bug fix.

🆕 Features

Automatic batch size configuration (#691)

It can be complicated to find a good batch size for the finetuner.fit function. If you choose a value that's too small, fine-tuning may not be very effective; if you choose a value that's too large, the job may run out of memory and fail. To make this easier, Finetuner now sets the batch size automatically if you omit the batch_size parameter of finetuner.fit or set it to None: it chooses the largest batch size supported by the current CUDA device.
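
For example, a minimal sketch (the model name is the one used later in these notes; 'my-train-data' is a placeholder dataset name):

import finetuner

run = finetuner.fit(
    model='sentence-transformers/msmarco-distilbert-base-v3',
    train_data='my-train-data',  # placeholder dataset name
    batch_size=None,  # or omit the parameter entirely; the largest batch
                      # size that fits on the run's CUDA device is chosen
)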

Retrieving evaluation metrics (#687)

You no longer need to retrieve the logs of Finetuner runs or manually unpack fine-tuned models to get the evaluation metrics. Now, you can get the metrics directly from the Run.metrics function:

run = finetuner.fit(...)
metrics = run.metrics()

To print nicely formatted evaluation metrics to the console, use the Run.display_metrics() function. This will print tables showing evaluation metrics before and after fine-tuning:

[Image: metrics before and after fine-tuning]
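
Continuing from the run above, a minimal sketch of both calls:

metrics = run.metrics()  # evaluation metrics as a Python object, for further processing
run.display_metrics()    # nicely formatted before/after tables on the console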

Calculating example results (#687)

In addition to evaluation metrics, you may find it helpful to see actual query results. To make this easy, we have introduced a new gather_examples parameter to the evaluation callback. If this parameter is set to True, the evaluation callback also tracks the top-k results for a number of example queries sampled from the query dataset:

from finetuner.callback import EvaluationCallback

run = finetuner.fit(
    ...,
    callbacks=[
        EvaluationCallback(
            query_data='query-data-name',
            index_data='index-data-name',
            gather_examples=True,
        )
    ],
    ...
)

As with the evaluation metrics, you can retrieve the query results before and after fine-tuning with the Run.example_results function, or print them to the console using Run.display_examples:

[Image: example results before and after fine-tuning]
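
A minimal sketch, continuing from the run configured with the EvaluationCallback above (the exact structure of the returned example results is not shown here):

examples = run.example_results()  # top-k results per example query, before and after fine-tuning
run.display_examples()            # print the example results as tables on the console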

Similarity-based training and Cosine Similarity Loss

Finetuner now supports training with data that is not explicitly labeled, but that instead carries, for each pair of training items, a numerical similarity score between 0.0 (totally dissimilar) and 1.0 (identical). This extends the range of scenarios to which Finetuner can be applied.

For example, you can now use DocArray to prepare data pairs with scores like this:

from docarray import Document, DocumentArray

# each training pair is a Document with two chunks and a similarity score
d1 = Document(
    chunks=[
        Document(text='I am driving to Los Angeles'),
        Document(text='I am driving to Hollywood'),
    ],
    tags={'finetuner_score': 0.9},
)
d2 = Document(
    chunks=[
        Document(text='I am driving to Los Angeles'),
        Document(text='I am flying to New York'),
    ],
    tags={'finetuner_score': 0.3},
)
...
train_data = DocumentArray([d1, d2, ...])

Then, use CosineSimilarityLoss as the loss function in the finetuner.fit function:

finetuner.fit(
    model='sentence-transformers/msmarco-distilbert-base-v3',
    train_data=train_data,
    loss='CosineSimilarityLoss',
    ...
)

In the future, we will also support data with scores in CSV format.

⚙ Refactoring

Remove job limit

Previously, users could only run three jobs in parallel. This limit has been removed.

🐞 Bug Fixes

Logs become unavailable after some time

Previously, the logs of finished fine-tuning jobs were lost after some length of time. Now, logs remain available indefinitely.

🤟 Contributors

We would like to thank all contributors to this release: