Predict an image crop of nutrition tables #311

Open
raphael0202 opened this issue May 26, 2023 · 1 comment

@raphael0202
Contributor

This issue is meant to track the progress and previous work made on nutrition table detection and cropping.

Previous work

As part of Google Summer of Code 2018, a student (Sagar) trained an object detection model to detect nutrition tables on an annotated dataset of nutrition table images. This model was never integrated into Robotoff.
In December 2019, using Sagar's training dataset as a baseline, I ran a new annotation campaign to enrich the dataset and fix some errors, raising the number of samples to ~1k.
A new model was trained on this dataset using the Tensorflow Object Detection API: https://github.com/openfoodfacts/robotoff-models/releases/tag/tf-nutrition-table-1.0
This is the object detection model currently used in production. Until very recently, we didn't use its predictions at all. As of May 26, 2023, we detect nutrition images based on nutrient mentions ("salt", "energy", "saturated fat",...) and values ("15g", "255 kcal",...).
The object detection model is only used to predict an image crop, and only when the model confidence is very high (>= 0.9):
its predictions are not reliable enough to use a lower threshold. As a result, most nutrition_image predictions don't come with a predicted crop, which means we use the full image as the selected image. We would like to change this and switch to fully automated nutrition image selection & crop.

Proposal

I started implementing in Robotoff a simple algorithm to predict a crop from nutrient mentions and values: the idea was to select the minimal bounding box that includes all detected nutrient mentions/values. It improved results over uncropped images. However, we still had some issues:

  1. words that were ingredients but were detected as nutrient mentions ("sugar", "salt",...), or product weights detected as nutrient values ("25g"), were included in the crop
  2. recommended daily intake percentages of nutrients were not necessarily included in the crop, as they are not something we detect as nutrient mentions.
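The minimal-bounding-box heuristic can be sketched as follows (function and variable names are hypothetical, not Robotoff's actual code). A single false positive far from the table, such as a product weight, drags the crop over most of the image, which is exactly issue (1):

```python
def minimal_bounding_box(boxes):
    """Smallest (x_min, y_min, x_max, y_max) box enclosing all detected
    nutrient mention/value bounding boxes."""
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )

# Boxes of detected mentions ("salt", "energy", ...) and values ("15g", ...)
table_words = [(120, 80, 180, 100), (125, 110, 200, 130)]
print(minimal_bounding_box(table_words))  # tight crop around the table

# Adding one outlier ("25g" product weight, bottom-right of the pack)
# stretches the crop across most of the image:
print(minimal_bounding_box(table_words + [(400, 500, 450, 520)]))
```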

I started thinking about clustering approaches to handle outliers and solve issue (1), when I realized we could instead use a supervised machine learning model to detect the words that belong to the nutrition table, using as input:

  • the bounding box of the word
  • the word string content
  • whether it's a detected nutrient mention or value (optional)
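The per-word input above could look like the following sketch (the field names, schema, and value regex are assumptions for illustration, not Robotoff's actual code):

```python
import re

# Rough pattern for nutrient values such as "15g" or "255 kcal" (illustrative only).
VALUE_RE = re.compile(r"^\d+([.,]\d+)?\s?(g|mg|µg|kj|kcal|%)$", re.IGNORECASE)

def word_features(word, nutrient_mentions):
    """Build the input features for one OCR word.

    word: {"text": str, "bounding_box": (x_min, y_min, x_max, y_max)},
    with coordinates normalized to [0, 1].
    """
    x_min, y_min, x_max, y_max = word["bounding_box"]
    return {
        # 1. the bounding box of the word
        "x_center": (x_min + x_max) / 2,
        "y_center": (y_min + y_max) / 2,
        "width": x_max - x_min,
        "height": y_max - y_min,
        # 2. the word string content
        "text": word["text"],
        # 3. whether it's a detected nutrient mention or value (optional)
        "is_nutrient_mention": word["text"].lower() in nutrient_mentions,
        "is_nutrient_value": VALUE_RE.match(word["text"]) is not None,
    }
```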

Using the annotated dataset + the JSON OCRs, we can train a graph model that predicts whether each word is part of the nutrition table, based on the word content and its neighbors. The object detection model only uses the raw image as input and doesn't have access to the text content, which explains why it doesn't perform as well (it probably only relies on table shapes to detect nutrition tables).
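The issue doesn't pin down how the graph is built; one plausible option is to connect each word to its k spatially nearest neighbors and let the model aggregate features along those edges. A minimal sketch of that graph construction, with hypothetical names:

```python
import math

def knn_word_graph(words, k=4):
    """Connect each OCR word to its k spatially nearest neighbors.

    words: list of (text, (x_center, y_center)) tuples. Returns directed
    edges (i, j); a graph model would aggregate word features over these
    edges to classify each word as in-table / out-of-table.
    """
    edges = []
    for i, (_, (xi, yi)) in enumerate(words):
        by_distance = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (_, (xj, yj)) in enumerate(words)
            if j != i
        )
        edges.extend((i, j) for _, j in by_distance[:k])
    return edges

words = [("salt", (0.0, 0.0)), ("1g", (1.0, 0.0)), ("brand", (10.0, 10.0))]
print(knn_word_graph(words, k=1))
```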

I expect this model to perform much better than the object detector. It would also detect nutrition information displayed as plain text, which is something the object detector struggled with (unsurprisingly).

@ItshMoh

ItshMoh commented Aug 7, 2023

hey @raphael0202, can you guide me on where I should start with this problem? Could you share the dataset of images of nutrition tables?

@teolemon teolemon added the ✨ enhancement New feature or request label May 11, 2024