Dashboard creation for data exploration #72

mrdbourke · 2023-02-13T23:03:48Z

Create a dashboard to view and interact with different statistics about the data.

This could be built using Streamlit.

For example, input the annotations + results + more, then:

Show statistics about number of images in each class
Show examples of where model is failing/doing well
Be able to easily see which classes the model is performing well (or poorly) on
Perhaps be able to compare a model's per-class results across experiments?
- For example, select two experiments, show the lineage, then compare their performance on each class?

Should at all times be able to explore the data and see where the model is not performing well...
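
A rough sketch of the per-class comparison across two experiments, in case it helps pin down the idea. It assumes each experiment's results are a DataFrame with one row per image and the hypothetical columns `true_class` and `pred_class`; neither the column names nor the function names come from an existing FoodVision schema.

```python
import pandas as pd


def per_class_accuracy(results: pd.DataFrame) -> pd.Series:
    """Fraction of correct predictions for each true class.

    Assumes hypothetical columns `true_class` and `pred_class`.
    """
    correct = results["true_class"] == results["pred_class"]
    return correct.groupby(results["true_class"]).mean()


def compare_experiments(results_a: pd.DataFrame, results_b: pd.DataFrame) -> pd.DataFrame:
    """Per-class accuracy for two experiments, side by side.

    Rows are sorted by `delta` (B minus A), so the classes where
    experiment B regressed the most come first.
    """
    comparison = pd.DataFrame(
        {
            "experiment_a": per_class_accuracy(results_a),
            "experiment_b": per_class_accuracy(results_b),
        }
    )
    comparison["delta"] = comparison["experiment_b"] - comparison["experiment_a"]
    return comparison.sort_values("delta")
```

The resulting table could be handed to something like `st.dataframe` in the dashboard, with the two experiments chosen from a select box.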

mrdbourke · 2023-02-20T04:33:12Z

See an example here -- https://blog.streamlit.io/how-to-build-a-real-time-live-dashboard-with-streamlit/

shivan-s · 2023-02-20T04:39:45Z

How will the data transfer from said things to the dashboard? Is there an API?


mrdbourke · 2023-02-20T04:48:27Z

@shivan-s, right now it's a static CSV, but I'll be looking to load it via URL (e.g. straight from Google Storage or from Hugging Face Datasets etc.)

"""
Streamlit dashboard for exploring data from FoodVision.

Basing off this: https://blog.streamlit.io/how-to-build-a-real-time-live-dashboard-with-streamlit/ 
"""
import pandas as pd
import streamlit as st

# Import the data
# TODO: change this to a URL that gets live tracked?/updated etc
dataset_url = "annotations.csv"

# TODO: cache this so it's saved: https://docs.streamlit.io/library/advanced-features/caching 
def get_data() -> pd.DataFrame:
    """Get the data from a CSV file.
    """
    return pd.read_csv(dataset_url)

shivan-s · 2023-02-20T05:02:43Z

Cool stuff! I'd love to have a go at building this dashboard. Just need to know how the data is managed and what you want to see on the dashboard. Shivan


mrdbourke · 2023-02-20T05:51:58Z

Epic! Just emailed you a sample dataset/labels.

Basically want an extensive EDA of the annotations for now.

So I know which labels need more work.

Can add image displays/model results later on.

shivan-s · 2023-02-22T03:36:26Z

I have your dashboard up now.

I'm trying to think of ways to give you value in the EDA process.

What do you need to see and look into?

mrdbourke · 2023-02-22T03:41:11Z

Epic!

Basically I'd like several different ways to view label counts, for example, the dashboard should answer:

How many images are there per class?
How many images are there per specific label_source?
A quick way to view classes that have under 100 images with manual_upload as the label_source (we're trying to get all classes above 100 manually uploaded images)

These are some of the main things we're looking for.
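
The three questions above could be answered with a few pandas helpers along these lines. This is only a sketch: it assumes the annotations CSV has one row per image with `class_name` and `label_source` columns, and that manually uploaded images are marked with the value `manual_upload`. The function names are made up for illustration.

```python
import pandas as pd


def images_per_class(annotations: pd.DataFrame) -> pd.Series:
    """Number of images (rows) for each class, largest first."""
    return annotations["class_name"].value_counts()


def images_per_label_source(annotations: pd.DataFrame) -> pd.Series:
    """Number of images (rows) for each label_source, largest first."""
    return annotations["label_source"].value_counts()


def classes_below_manual_target(annotations: pd.DataFrame, target: int = 100) -> pd.Series:
    """Classes with fewer than `target` manually uploaded images.

    Classes with no manual uploads at all are included with a count of 0.
    """
    manual = annotations[annotations["label_source"] == "manual_upload"]
    counts = manual["class_name"].value_counts()
    # Reindex over every class so that classes without manual uploads show up as 0
    counts = counts.reindex(annotations["class_name"].unique(), fill_value=0)
    return counts[counts < target].sort_values()
```

Each helper returns a Series, so the output could go straight into `st.bar_chart` or `st.dataframe`.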

Perhaps a better looking way to view all the class names?

E.g. a map from label -> class_name ({0: "apple_red"}) so that all class names can be viewed easily
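
One possible way to build that map from the annotations, assuming integer `label` and string `class_name` columns (the helper name is hypothetical):

```python
import pandas as pd


def label_to_class_name(annotations: pd.DataFrame) -> dict:
    """Build a {label: class_name} dict, ordered by label.

    Assumes each label corresponds to exactly one class_name.
    """
    pairs = (
        annotations[["label", "class_name"]]
        .drop_duplicates()
        .sort_values("label")
    )
    return {int(label): name for label, name in zip(pairs["label"], pairs["class_name"])}
```

The dict could then be shown with `st.json`, or turned into a two-column table for easier scanning.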
