Releases: openfoodfacts/robotoff-models

pytorch-ingredient-detection-1.0

16 Aug 15:02
8bba859

This ingredient detection model was trained on the ingredient detection dataset v1 using code in this version of the repository.
Training was tracked on Wandb.

More information on the experiments performed can be found in this document.
This release provides the following assets:

Training-related assets:

  • predictions.tar.gz: the model's predictions on the train and test datasets, in:
    • HTML format: easier to view
    • JSONL format: either the raw or the aggregated (post-processed) version
  • model-huggingface.tar.gz: the HuggingFace serialized model

Serving assets:

  • onnx.tar.gz: the model exported to ONNX format

keras-category-classifier-image-embeddings-3.0

14 Mar 11:42
8bba859

This category classification model was trained on the v4 Data For Good 2022 category dataset using code in this version of the off-category-classification repository.
Training was tracked on Wandb.

This release provides the following assets:

Dataset assets:

  • predict_categories_dataset_products.jsonl.gz: product selected fields.
  • predict_categories_dataset_images_ids.jsonl.gz: IDs of images associated with each product.
  • predict_categories_dataset_ocrs.jsonl.gz: extracted OCR texts for each product.
  • (train|test|val).txt: train, test and validation splits (list of barcodes).
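The dataset files above are gzip-compressed JSONL, and the splits are plain-text barcode lists. A minimal sketch of reading both with the standard library is shown here. It assumes one JSON object per line and one barcode per line in the split files; the helper names `iter_jsonl_gz` and `read_split` are hypothetical and not part of the release.

```python
import gzip
import json
from pathlib import Path
from typing import Iterator


def iter_jsonl_gz(path: Path) -> Iterator[dict]:
    """Yield one JSON object per non-empty line of a gzip-compressed JSONL file."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


def read_split(path: Path) -> set[str]:
    """Read a split file, assumed to hold one barcode per line."""
    lines = path.read_text(encoding="utf-8").splitlines()
    return {line.strip() for line in lines if line.strip()}
```

Streaming the JSONL file line by line avoids loading the whole dataset into memory, which may matter for the larger files.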

Training-related assets:

  • config.json providing the parameter configuration used during training.
  • categories.full.json.gz containing the category taxonomy version used in this model's training.
  • ingredients.full.json.gz containing the ingredient taxonomy version used in this model's training.
  • training.log: training logs.

Validation assets:

  • classification_report_(test|val).json is the classification report for test/val datasets.
  • threshold_report_0.99.json: category-specific thresholds required to reach a precision >= 0.99 on a merged validation + test set.
  • (test|val)_top_predictions.tsv: top-10 predictions on validation/test sets.
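To illustrate what a category-specific threshold for a precision target means, the sketch below finds, for a single category, the lowest score threshold at which predictions with a score at or above the threshold reach the target precision. This is an illustrative assumption about how such a threshold can be derived, not the procedure used to produce threshold_report_0.99.json; the function name and inputs are hypothetical.

```python
from typing import Optional, Sequence


def threshold_for_precision(
    scores: Sequence[float], labels: Sequence[bool], target: float = 0.99
) -> Optional[float]:
    """Return the lowest threshold such that predictions with
    score >= threshold reach the target precision, or None if none does."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    best = None
    true_positives = 0
    for rank, (score, is_positive) in enumerate(ranked, start=1):
        true_positives += int(is_positive)
        # Only evaluate at the last item of a run of tied scores, since a
        # threshold cannot separate items that share the same score.
        is_last_of_tie = rank == len(ranked) or ranked[rank][0] != score
        if is_last_of_tie and true_positives / rank >= target:
            best = score
    return best
```

A return value of None would correspond to a category for which no threshold reaches the target on the evaluated data.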

Serving assets:

  • saved_model.tar.gz containing the model saved in SavedModel format.

clip-vit-base-patch32

16 Dec 15:33
8bba859

ONNX export of CLIP-ViT base patch-32.
Exported with HuggingFace Transformers (v4.25.1) with the PyTorch backend, ONNX opset 17.

tf-universal-logo-detector-1.0

09 Sep 15:20

Universal logo detection model (detects generic logos), trained on 2019-12-13. The model detects the following objects:

  • brand (all brand logos)
  • label (remaining logos)

Training and validation data (TFRecords files) can be found in data.zip.
TensorFlow SavedModel files can be found in saved_model.tar.gz, and all checkpoints (intermediate and final) in checkpoints.tar.gz. TensorBoard event files are also attached.

The model was trained using the TensorFlow Object Detection API: https://github.com/tensorflow/models/tree/60bb50675ed7fab3afd05edab02a45acee57532a

Base model: Faster-RCNN ResNet-101 pretrained on the COCO dataset: http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_coco_2018_01_28.tar.gz

An ONNX export (model.onnx) using opset 13 is also attached.

tf-nutrition-table-1.0

09 Sep 14:53

Nutrition table detection model, trained on 2019-12-03. The model detects the following objects:

  • nutrition-table
  • nutrition-table-small
  • nutrition-table-small-energy
  • nutrition-table-text

Training and validation data (TFRecords files) can be found in data.zip. Before the train/validation split, the dataset was built by merging 3 annotated datasets on the annotation interface:

  • nutrition-table-1
  • Nutrition table (Sagar)
  • nutrition-table-2

TensorFlow SavedModel files can be found in saved_model.tar.gz.

The model was trained using the TensorFlow Object Detection API: https://github.com/tensorflow/models/tree/e3f8ea2227ef5ce67df04bd175e6c20711079d8f

An ONNX export (model.onnx) using opset 13 is also attached.

tf-nutriscore-1.0

16 Sep 06:27

Provides the configuration file, serialized model and training/validation data for the Nutri-Score object detection model. More specifically, it includes:

  • the label pbtxt file (labels.pbtxt)
  • the training configuration file (pipeline.config)
  • training (train.record) and validation (val.record) data
  • model checkpoint (model.ckpt-*)
  • the frozen inference graph (frozen_inference_graph.pb)
  • the saved model (in saved_model.tar.gz), for use in TensorFlow Serving

An ONNX export (model.onnx) using opset 13 is also attached.

keras-category-classifier-xx-2.0

30 Nov 14:39

This category classification model was trained on the 2021-09-15 multi-lingual dataset, using code in this repository.

This release provides the following assets:

Training-related assets:

  • config.json providing the parameter configuration used during training.
  • category_voc.json specifying the mapping between the model's outputs and the taxonomised categories.
  • category_taxonomy.json containing the category taxonomy version used in this model's training.
  • training_model.tar.gz containing the training model that can be used for further training of the model.

Validation assets:

  • classification_report_(test|val).json is the classification report for test/val datasets.
  • metrics_(test|val).json is the model's performance metrics for test/val datasets.

Serving assets:

  • serving_model.tar.gz containing the TF Serving-compatible model with an additional output layer that will convert the raw vocabulary indices to category strings.
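The conversion from raw vocabulary indices to category strings that the serving model performs can be sketched in plain Python. The example below assumes a vocabulary mapping of the form `{category: index}`; the actual layout of category_voc.json is not described here and may differ, and `top_categories` is a hypothetical helper, not part of the released model.

```python
from typing import Mapping, Sequence


def top_categories(
    probabilities: Sequence[float],
    category_to_index: Mapping[str, int],
    k: int = 3,
) -> list[tuple[str, float]]:
    """Return the k highest-scoring categories as (category, probability) pairs."""
    # Invert the assumed {category: index} vocabulary to look up names by index.
    index_to_category = {index: name for name, index in category_to_index.items()}
    ranked = sorted(
        range(len(probabilities)), key=lambda i: probabilities[i], reverse=True
    )
    return [(index_to_category[i], probabilities[i]) for i in ranked[:k]]
```

Bundling this lookup into the serving model, as this release does, means clients do not need their own copy of the vocabulary file.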

keras-category-classifier-xx-1.0

05 Dec 17:17

This category classification model was trained on the 2019-09-16 multilingual (xx) dataset, using the code contained in this repository.

Provides:

  • the configuration file (config.json)
  • assets (category_taxonomy.json, category_voc.json, ingredient_voc.json, product_name_voc.json)
  • the Keras HDF5 checkpoint (checkpoint.hdf5)
  • classification reports and metrics on the test and val sets, for the whole set or split by major language

category-predictor-xgfood-emlyon-1.0

category-predictor-ocr-lewagon-1.0

07 May 10:30

Model trained by students from the Le Wagon bootcamp in March 2021.
Based on a RidgeClassifier (sklearn) trained on text from OCRed images.
The output is a confidence score for each possible category, out of a short list of 38 categories.

For more details, see https://github.com/Laurel16/OpenFoodFactsCategorizer