Model/class names not lined up + some classes are missing FDC data ("Egg Tart", "Fries", "Hamimelon") #39

Open
mrdbourke opened this issue Mar 18, 2022 · 3 comments

Comments

@mrdbourke
Owner

Some classes are missing FDC data and will have to be fixed later on.

Need a way to:

  • Know what classes a model has been trained on
  • Sync up model classes with food data (the data from the FDC)
  • Only publish models that have accompanying food data with them

This will solve the problem of someone taking a photo of something and no data being displayed for it.
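As a rough sketch of the "sync up" point above: one option is a reverse lookup from the predicted class name to its fdc_id that returns `None` when no food data exists, so the app can handle that case instead of showing nothing. The `lookup_fdc_id` helper, the lowercase matching, and the three-entry mapping are assumptions for illustration, not code from this repo.

```python
from typing import Optional

# Illustrative subset of the fdc_id -> class name mapping.
fdc_ids = {
    1750339: "Apple",
    169236: "Artichoke",
    171705: "Avocado",
}

# Invert once so a predicted class name can be looked up directly.
# Lowercasing is an assumption, to tolerate "apple" vs "Apple".
name_to_fdc_id = {name.lower(): fdc_id for fdc_id, name in fdc_ids.items()}


def lookup_fdc_id(predicted_class: str) -> Optional[int]:
    """Return the fdc_id for a predicted class, or None if there is no food data."""
    return name_to_fdc_id.get(predicted_class.strip().lower())


for label in ["Apple", "Hamimelon"]:
    fdc_id = lookup_fdc_id(label)
    if fdc_id is None:
        print(f"{label}: no food data available")
    else:
        print(f"{label}: fdc_id={fdc_id}")
```

Returning `None` rather than raising keeps the decision about what to display with the caller.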

Or...

  1. Create a model with X classes
  2. Make dummy FDC data for the classes that don't have it yet
  3. Display information for which classes have data and which classes don't
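The three steps above could look something like the following. This is only a sketch: `add_dummy_ids`, the starting value `111111`, and the short example lists are placeholders chosen for illustration, not something the project defines.

```python
def add_dummy_ids(class_names, fdc_ids, start=111111):
    """Return (combined id -> name mapping, dummy id -> name mapping).

    Classes that already appear in fdc_ids keep their real id; every other
    class gets the next free placeholder id counting up from `start`.
    """
    known_names = set(fdc_ids.values())
    combined = dict(fdc_ids)
    dummies = {}
    next_id = start
    for name in class_names:
        if name in known_names:
            continue
        # Skip ids that are already taken so real entries are never overwritten.
        while next_id in combined:
            next_id += 1
        combined[next_id] = name
        dummies[next_id] = name
        known_names.add(name)
        next_id += 1
    return combined, dummies


# Hypothetical inputs: classes the model was trained on, and the real FDC data so far.
model_classes = ["Apple", "Egg tart", "Avocado", "Hamimelon"]
real_fdc_ids = {1750339: "Apple", 171705: "Avocado"}

combined, dummies = add_dummy_ids(model_classes, real_fdc_ids)
for name in model_classes:
    status = "dummy data" if name in dummies.values() else "has FDC data"
    print(f"{name}: {status}")
```

Keeping the dummy entries in their own dictionary makes it easy to list which classes still need real data.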
@mrdbourke changed the title from 'Some classes are missing FDC data' to 'Some classes are missing FDC data ("Egg Tart", "Fries", "Hamimelon")' on Mar 18, 2022
@mrdbourke
Owner Author

These classes will have to be fixed up within the next iteration of the dataset...

I've put dummy fdc_id codes in for them for now (the actual codes come from the FDC database) - https://fdc.nal.usda.gov/

These codes are:

dummy_ids = {
    111111: 'Egg tart',   # not found in FDC database
    111112: 'Fries',      # duplicate class in the dataset (see 'French fries')
    111113: 'Hamimelon',  # not found in FDC database
}

The full fdc_id code list is here:

# Note: 'Egg tart', 'Fries' and 'Hamimelon' use dummy codes to prevent bugs for now (they will error at some point)
fdc_ids = {
    1750339: 'Apple',
    169236: 'Artichoke',
    171705: 'Avocado',
    1103307: 'BBQ sauce',
    749420: 'Bacon',
    167533: 'Bagel',
    1105314: 'Banana',
    746763: 'Beef',
    1104393: 'Beer',
    171711: 'Blueberries',
    325871: 'Bread',
    747447: 'Broccoli',
    790508: 'Butter',
    169975: 'Cabbage',
    167990: 'Candy',
    746770: 'Cantaloupe',
    746764: 'Carrot',
    328637: 'Cheese',
    171719: 'Cherry',
    173630: 'Chicken wings',
    1104406: 'Cocktail',
    170169: 'Coconut',
    1104137: 'Coffee',
    333008: 'Cookie',
    167537: 'Corn chips',
    170857: 'Cream',
    168409: 'Cucumber',
    172756: 'Doughnut',
    1101515: 'Dumpling',
    171287: 'Egg',
    111111: 'Egg tart',
    169228: 'Eggplant',
    333374: 'Fish',
    170698: 'French fries',
    111112: 'Fries',
    1104647: 'Garlic',
    173040: 'Grape',
    174673: 'Grapefruit',
    321611: 'Green beans',
    170006: 'Green onion',
    1102734: 'Guacamole',
    170693: 'Hamburger',
    111113: 'Hamimelon',
    169640: 'Honey',
    167575: 'Ice cream',
    1102667: 'Kiwi fruit',
    167746: 'Lemon',
    746769: 'Lettuce',
    168155: 'Lime',
    174208: 'Lobster',
    169910: 'Mango',
    171638: 'Meat ball',
    746782: 'Milk',
    172765: 'Muffin',
    1999629: 'Mushroom',
    168914: 'Noodles',
    323294: 'Nuts',
    169260: 'Okra',
    748608: 'Olive oil',
    169095: 'Olives',
    1104962: 'Onion',
    746771: 'Orange',
    2003597: 'Orange juice',
    175009: 'Pancake',
    169926: 'Papaya',
    168927: 'Pasta',
    1104913: 'Pastry',
    325430: 'Peach',
    746773: 'Pear',
    170108: 'Pepper',
    175020: 'Pie',
    169124: 'Pineapple',
    173292: 'Pizza',
    169949: 'Plum',
    169134: 'Pomegranate',
    167959: 'Popcorn',
    170026: 'Potato',
    1099155: 'Prawns',
    169064: 'Pretzel',
    168448: 'Pumpkin',
    169276: 'Radish',
    169977: 'Red cabbage',
    168930: 'Rice',
    1103408: 'Salad',
    746775: 'Salt',
    1103330: 'Sandwich',
    746779: 'Sausages',
    174852: 'Soft drink',
    1999632: 'Spinach',
    1102056: 'Spring rolls',
    746762: 'Steak',
    747448: 'Strawberries',
    1102350: 'Sushi',
    174144: 'Tea',
    1999634: 'Tomato',
    170054: 'Tomato sauce',
    175038: 'Waffle',
    167765: 'Watermelon',
    174837: 'Wine',
    169291: 'Zucchini'
}

@mrdbourke
Owner Author

Update: Removed "fries" and "pastry" and added back "chicken" and "squid".

IDs are now in line with the classes the model was trained on.

fdc_ids = {
    1750339: 'Apple',
    169236: 'Artichoke',
    171705: 'Avocado',
    1103307: 'BBQ sauce',
    749420: 'Bacon',
    167533: 'Bagel',
    1105314: 'Banana',
    746763: 'Beef',
    1104393: 'Beer',
    171711: 'Blueberries',
    325871: 'Bread',
    747447: 'Broccoli',
    790508: 'Butter',
    169975: 'Cabbage',
    167990: 'Candy',
    746770: 'Cantaloupe',
    746764: 'Carrot',
    328637: 'Cheese',
    171719: 'Cherry',
    111110: 'Chicken',
    173630: 'Chicken wings',
    1104406: 'Cocktail',
    170169: 'Coconut',
    1104137: 'Coffee',
    333008: 'Cookie',
    167537: 'Corn chips',
    170857: 'Cream',
    168409: 'Cucumber',
    172756: 'Doughnut',
    1101515: 'Dumpling',
    171287: 'Egg',
    111111: 'Egg tart',
    169228: 'Eggplant',
    333374: 'Fish',
    170698: 'French fries',
    1104647: 'Garlic',
    173040: 'Grape',
    174673: 'Grapefruit',
    321611: 'Green beans',
    170006: 'Green onion',
    1102734: 'Guacamole',
    170693: 'Hamburger',
    111113: 'Hamimelon',
    169640: 'Honey',
    167575: 'Ice cream',
    1102667: 'Kiwi fruit',
    167746: 'Lemon',
    746769: 'Lettuce',
    168155: 'Lime',
    174208: 'Lobster',
    169910: 'Mango',
    171638: 'Meat ball',
    746782: 'Milk',
    172765: 'Muffin',
    1999629: 'Mushroom',
    168914: 'Noodles',
    323294: 'Nuts',
    169260: 'Okra',
    748608: 'Olive oil',
    169095: 'Olives',
    1104962: 'Onion',
    746771: 'Orange',
    2003597: 'Orange juice',
    175009: 'Pancake',
    169926: 'Papaya',
    168927: 'Pasta',
    325430: 'Peach',
    746773: 'Pear',
    170108: 'Pepper',
    175020: 'Pie',
    169124: 'Pineapple',
    173292: 'Pizza',
    169949: 'Plum',
    169134: 'Pomegranate',
    167959: 'Popcorn',
    170026: 'Potato',
    1099155: 'Prawns',
    169064: 'Pretzel',
    168448: 'Pumpkin',
    169276: 'Radish',
    169977: 'Red cabbage',
    168930: 'Rice',
    1103408: 'Salad',
    746775: 'Salt',
    1103330: 'Sandwich',
    746779: 'Sausages',
    174852: 'Soft drink',
    1999632: 'Spinach',
    1102056: 'Spring rolls',
    746762: 'Steak',
    747448: 'Strawberries',
    111112: 'Squid',
    1102350: 'Sushi',
    174144: 'Tea',
    1999634: 'Tomato',
    170054: 'Tomato sauce',
    175038: 'Waffle',
    167765: 'Watermelon',
    174837: 'Wine',
    169291: 'Zucchini'
}

@mrdbourke changed the title from 'Some classes are missing FDC data ("Egg Tart", "Fries", "Hamimelon")' to 'Model/class names not lined up + some classes are missing FDC data ("Egg Tart", "Fries", "Hamimelon")' on Mar 22, 2022
@mrdbourke
Owner Author

This is still an issue, even with the latest commit - 88ef839

Need to put in some testing code to make sure the classes the model is trained on appear in the FDC IDs list and vice versa.

Or at least some way to line up the model classes along with the nutrient classes.

E.g.

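# Check that the model classes match the FDC ID classes before deploying
model_classes = list(range(1, 101))   # placeholder for the classes the model was trained on
fdc_id_classes = list(range(1, 101))  # placeholder for the classes in the FDC IDs list

if model_classes == fdc_id_classes:
    print("Classes match, OK to deploy")
else:
    raise ValueError("Model classes and FDC ID classes do not match")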
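A plain equality check only says that something differs, and it also fails when the two lists are merely in a different order. A possible refinement, sketched below, compares the names as sets and reports what is missing in each direction. `check_classes_match` is a hypothetical helper, and the small example inputs are made up to mirror the fries/pastry vs chicken/squid mix-up mentioned earlier in this thread.

```python
def check_classes_match(model_classes, fdc_ids):
    """Raise ValueError if model classes and FDC class names differ in either direction."""
    model_set = set(model_classes)
    fdc_set = set(fdc_ids.values())
    missing_food_data = sorted(model_set - fdc_set)   # trained on, but no FDC entry
    missing_from_model = sorted(fdc_set - model_set)  # FDC entry, but not trained on
    if missing_food_data or missing_from_model:
        raise ValueError(
            f"Class mismatch: no FDC entry for {missing_food_data}; "
            f"not in model: {missing_from_model}"
        )


# Illustrative inputs only: a model trained with 'Chicken' and 'Squid',
# checked against an older ID list that still has 'Fries' and 'Pastry'.
model_classes = ["Apple", "Chicken", "Squid"]
fdc_ids = {1750339: "Apple", 111112: "Fries", 1104913: "Pastry"}

try:
    check_classes_match(model_classes, fdc_ids)
except ValueError as err:
    print(err)
```

A check like this could run as a test or as a step before deployment, so a mismatch stops the release instead of surfacing in the app.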
