Use a multimodal LLM to extract the likely Product name and brand as JSON #341

Open
Tracked by #5009
teolemon opened this issue Apr 25, 2024 · 0 comments

Comments

@teolemon
Member

Problem

  • We get very few product names through the official app, because they are cumbersome to enter

Proposed solution

  • Use a multimodal LLM such as Gemini to extract the likely product name and brand as JSON, but only when either:
      • the user requests it (with a button?), or
      • no product name has been entered after 2 hours (in that case, create a detection and an insight to be applied)
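
The trigger conditions above could be sketched as a small gating function. This is only an illustration: the function name, its parameters, and the choice to measure the 2-hour delay from the time the product was submitted are assumptions, not something the issue specifies. It also assumes the two conditions are alternatives (either one is enough).

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: the 2-hour delay counts from when the product was first submitted.
NAME_INPUT_GRACE_PERIOD = timedelta(hours=2)


def should_extract_name(
    user_requested: bool,
    product_name: Optional[str],
    submitted_at: datetime,
    now: datetime,
) -> bool:
    """Decide whether to ask the multimodal LLM for a product name and brand.

    Hypothetical helper: returns True when the user explicitly asked for it,
    or when no product name exists once the grace period has elapsed.
    """
    if user_requested:
        return True
    # Treat None, empty, and whitespace-only names as "no name entered".
    has_name = bool(product_name and product_name.strip())
    return not has_name and (now - submitted_at) >= NAME_INPUT_GRACE_PERIOD
```

In the automatic case (no user request), the result would presumably feed a detection and insight rather than being written to the product directly, as the second condition suggests.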

Additional context

Mockups

image
Prompt: Can you provide the likely product name, and the brand as a JSON output?
Response: Yes, the product in the image is likely Pedigree Dentastix Chewy Chunks Small/Medium Chicken Flavor Dog Treats.

Here is the information in JSON format:

{
  "brand": "Pedigree",
  "product name": "Dentastix Chewy Chunks Small/Medium Chicken Flavor Dog Treats"
}
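
As the mockup shows, the model may wrap the JSON in explanatory prose, so the reply cannot be passed straight to a JSON parser. The following is a minimal sketch of pulling the object out of such a reply using only Python's standard library. The function name `parse_name_and_brand` and the normalized `product_name` output key are illustrative assumptions; the input keys `brand` and `product name` are taken from the mockup.

```python
import json
from typing import Optional


def parse_name_and_brand(reply: str) -> Optional[dict]:
    """Find the first JSON object in `reply` that has non-empty string
    values for "brand" and "product name"; return None if there is none."""
    decoder = json.JSONDecoder()
    start = reply.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores any trailing text after it.
            obj, _ = decoder.raw_decode(reply, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            brand = obj.get("brand")
            name = obj.get("product name")
            if (
                isinstance(brand, str) and brand.strip()
                and isinstance(name, str) and name.strip()
            ):
                return {"brand": brand.strip(), "product_name": name.strip()}
        # Try the next opening brace, if any.
        start = reply.find("{", start + 1)
    return None
```

Returning `None` instead of raising lets the caller skip creating an insight when the model's answer is unusable.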

Part of

Projects
Status: To discuss and validate