Commit
update dog vision v2 notebook
mrdbourke committed Apr 18, 2024
1 parent 33a3d93 commit 19b28f9
Showing 1 changed file with 27 additions and 24 deletions.
51 changes: 27 additions & 24 deletions section-4-unstructured-data-projects/end-to-end-dog-vision-v2.ipynb
@@ -5028,11 +5028,6 @@
"\n",
"You can then used those extracted features and further tailor them to your own use case.\n",
"\n",
"<img src=\"https://github.com/mrdbourke/zero-to-mastery-ml/blob/master/images/unstructured-data-our-dog-vision-model.png?raw=true\" width=750 alt=\"\n",
"A diagram illustrating a dog breed classification model using transfer learning. It shows input data comprising various dog images, which are fed into an EfficientNetB0 architecture, pretrained on ImageNet and kept frozen. The final layers are customized, featuring a linear classifier layer with an output corresponding to 120 classes (dog breeds), depicted by dog illustrations with a check mark indicating successful classification. The diagram cites sources for the EfficientNetB0 model and its implementation in TensorFlow's tf.keras.applications module.\"/>\n",
"\n",
"*Example of how we can take a pretrained model and customize it to our own use case. This kind of transfer learning workflow is often referred to as a feature extracting workflow. Note: In this image the EfficientNetB0 architecture is being demonstrated, however we're going to be using the EfficientNetV2B0 architecture which is slightly different. I've used the older architecture image from the research paper as a newer one wasn't available.*\n",
"\n",
"Let's create an instance of `base_model` without a top layer.\n"
]
},
@@ -5283,7 +5278,10 @@
"\n",
"However, a standard practice in transfer learning is to *freeze* the base layers of a model and only train the custom top layers to suit your problem.\n",
"\n",
"TK image - freeze base layers, train top layers\n",
"<img src=\"https://github.com/mrdbourke/zero-to-mastery-ml/blob/master/images/unstructured-data-our-dog-vision-model.png?raw=true\" width=750 alt=\"\n",
"A diagram illustrating a dog breed classification model using transfer learning. It shows input data comprising various dog images, which are fed into an EfficientNetB0 architecture, pretrained on ImageNet and kept frozen. The final layers are customized, featuring a linear classifier layer with an output corresponding to 120 classes (dog breeds), depicted by dog illustrations with a check mark indicating successful classification. The diagram cites sources for the EfficientNetB0 model and its implementation in TensorFlow's tf.keras.applications module.\"/>\n",
"\n",
"*Example of how we can take a pretrained model and customize it to our own use case. This kind of transfer learning workflow is often referred to as a feature extracting workflow as the base layers are frozen (not changed during training) and only the top layers are trained. Note: In this image the EfficientNetB0 architecture is being demonstrated, however we're going to be using the EfficientNetV2B0 architecture which is slightly different. I've used the older architecture image from the research paper as a newer one wasn't available.*\n",
"\n",
"In other words, keep the patterns an existing model has learned on a similar problem (if they're good) to form a base representation of an input sample and then manipulate that base representation to suit our needs.\n",
"\n",
@@ -8387,13 +8385,14 @@
"\n",
"Or the `australian_terrier`?\n",
"\n",
"You'll see on the original [Stanford Dogs Dataset](http://vision.stanford.edu/aditya86/ImageNetDogs/) website that the authors reported the accuracy per class of each of the dog breeds. Their best performing class, `african_hunting_dog` achieved close to 60% accuracy.\n",
"You'll see on the original [Stanford Dogs Dataset](http://vision.stanford.edu/aditya86/ImageNetDogs/) website that the authors reported the accuracy per class of each of the dog breeds. Their best performing class, `african_hunting_dog` achieved close to 60% accuracy (about ~58% if I'm reading the graph correctly).\n",
"\n",
"UPTOHERE\n",
"<img src=\"https://github.com/mrdbourke/zero-to-mastery-ml/blob/master/images/unstructured-data-stanford-dogs-dataset-results.png?raw=true\" width=750 alt=\"\n",
"An image displaying a chart from the Stanford Dogs Dataset Paper (2011). The chart shows a range of accuracies for different dog breeds with two sets of training data, 15 and 100 instances, represented in purple and red bars, respectively. There are annotations indicating a mean accuracy of 22% and a maximum accuracy of approximately 58%. A text box suggests that replicating or improving a research paper is a good practice for machine learning and AI, stating 'Our goal: Let's try and beat it!' The chart and annotations are surrounded by a dotted green line with arrows pointing to the key areas.\"/>\n",
"\n",
"TK - image of results from Stanford Dogs paper/dataset\n",
"*Results from the original Stanford Dogs Dataset paper (2011). Let's see if the model we trained performs better than it.*\n",
"\n",
"How about we try and replicate the same plot?\n",
"How about we try and replicate the same plot with our own results?\n",
"\n",
"First, let's create a DataFrame with information about our test predictions and test samples.\n",
"\n",
@@ -9460,7 +9459,7 @@
"id": "8YX2KDkUVXx5"
},
"source": [
"### TK - Finding the most wrong examples\n",
"### Finding the most wrong examples\n",
"\n",
"A great way to inspect your models errors is to find the examples where the prediction had a high probability but the prediction was wrong.\n",
"\n",
@@ -9882,7 +9881,8 @@
"\n",
"random_most_wrong_indexes = top_100_most_wrong.sample(n=10).index\n",
"\n",
"# TK - this is why we don't shuffle the test data\n",
"# Iterate through test results and plot them\n",
"# Note: This is why we don't shuffle the test data, so that it's in original order when we evaluate it.\n",
"fig, axes = plt.subplots(2, 5, figsize=(15, 7))\n",
"for i, ax in enumerate(axes.flatten()):\n",
" target_index = random_most_wrong_indexes[i]\n",
@@ -9927,7 +9927,7 @@
"id": "k9JS1CbnYBqO"
},
"source": [
"### TK - Create a confusion matrix\n",
"### Create a confusion matrix\n",
"\n",
"A confusion matrix helps to visualize which classes a predicted compared to which classes it should've predicted (truth vs. predictions).\n",
"\n",
@@ -9999,7 +9999,7 @@
"id": "dXOCxxOJPeDL"
},
"source": [
"## TK - 11. Save and load the best model\n",
"## 11. Save and load the best model\n",
"\n",
"We've covered a lot of ground from loading data to training and evaluating a model.\n",
"\n",
@@ -10127,7 +10127,7 @@
"id": "dkW_gkAmQUJB"
},
"source": [
"## TK - 12. Make predictions on custom images with the best model\n",
"## 12. Make predictions on custom images with the best model\n",
"\n",
"Now what fun would it be if we only made predictions on the test dataset?\n",
"\n",
@@ -10182,7 +10182,7 @@
}
],
"source": [
"# TK - load custom image(s)\n",
"# Download a set of custom images from GitHub and unzip them\n",
"!wget -nc https://github.com/mrdbourke/zero-to-mastery-ml/raw/master/images/dog-photos.zip\n",
"!unzip dog-photos.zip"
]
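The two shell commands can also be mirrored in plain Python using only the standard library (a sketch; the `-nc` "no clobber" flag becomes a simple existence check):

```python
import zipfile
from pathlib import Path
from urllib.request import urlretrieve

def download_and_unzip(url: str, extract_to: str = ".") -> None:
    """Download a zip archive and extract it, mirroring `!wget -nc` + `!unzip`."""
    zip_path = Path(url.split("/")[-1])
    if not zip_path.exists():  # like wget's -nc flag: skip if already downloaded
        urlretrieve(url, zip_path)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(extract_to)

# Example (not run here):
# download_and_unzip("https://github.com/mrdbourke/zero-to-mastery-ml/raw/master/images/dog-photos.zip")
```

This version works outside notebooks too, where the `!` shell magics aren't available.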
@@ -10800,7 +10800,9 @@
"\n",
"And then only letting our Dog Vision 🐶👁 model predict on the images that are of dogs.\n",
"\n",
"TK image - Nutrify using a food/not food model to only allow predictions on foods\n",
"<img src=\"https://github.com/mrdbourke/zero-to-mastery-ml/blob/master/images/unstructured-data-combining-models-for-deployment.png?raw=true\" width=750 alt=\"Two smartphones displaying apps that differentiate between food and non-food items. The left phone shows an app with a message 'No Food Detected' over an image of a computer mouse. The right phone displays a different app with a message 'Food Detected' above a photo of a breakfast plate with a Danish and a cup of coffee. Text emphasizes the need to combine models or workflows for accurate detection.\"/>\n",
"\n",
"*Example of combining multiple machine learning models to create a workflow. One model for detecting food (Food Not Food) and another model for identifying what food is in the image (FoodVision, similar to Dog Vision). If an app is designed to take photos of food, taking photos of objects that aren't food and having them identified as food can be a poor customer experience. Source: [Nutrify](https://nutrify.app/).*\n",
"\n",
"These are some of the workflows you'll have to think about when you eventually deploy your own machine learning models.\n",
"\n",
@@ -10817,7 +10819,7 @@
"id": "BXhsZtsUiF3J"
},
"source": [
"## TK - 13. Key Takeaways\n",
"## 13. Key Takeaways\n",
"\n",
"* **Data, data, data!** In any machine learning problem, getting a dataset and preparing it so that it is in a usable format will likely be the first and often most important step (hence why we spent so much time getting the data ready). It will also be an ongoing process, as although we've worked with thousands of dog images, our models could still be improved. And as we saw going from training with 10% of the data to 100% of the data, one of the best ways to improve a model is with more data. Explore your data early and often.\n",
"* **When starting out, use transfer learning where possible.** For most new problems, you should generally look to see if a pretrained model exists and see if you can adapt it to your use case. Ask yourself: What format is my data in? What are my ideal inputs and outputs? Is there a pretrained model for my use case?\n",
@@ -10831,7 +10833,7 @@
"id": "jy8dvVj5oEXF"
},
"source": [
"## TK - Extensions & Exercises\n",
"## Extensions & Exercises\n",
"\n",
"The following are a series of exercises and extensions which build on what we've covered throughout this module.\n",
"\n",
@@ -10845,15 +10847,16 @@
"4. For more advanced model training, you may want to look into the concept of \"Callbacks\", these are functions which run during the model training. TensorFlow and Keras have a series of built-in callbacks which can be helpful for training. Have a read of the [`tf.keras.callbacks.Callback`](https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/Callback) documentation and see which ones may be useful to you.\n",
"5. We touched on the concept of overfitting when we trained our model. This is when a model performs far better on the training set than on the test set. The concept of trying to prevent overfitting is known as regularization. Spend 20-minutes researching \"ways to prevent overfitting\" and write a list of 2-3 techniques and how they might come into play with our model training. Tip: One of the most common regularization techniques in computer vision is [data augmentation](https://www.tensorflow.org/tutorials/images/data_augmentation) (also see the brief example below).\n",
"6. One of the most important parts of machine learning is having good data. The next most important part is loading that data in a way that can used to train models as fast and efficiently as possible. For more on this, I'd highly recommend reading more about the [`tf.data` API](https://www.tensorflow.org/guide/data) (this API is TensorFlow focused, however, the concepts can be bridged to other dataloading needs) as well as reviewing the [`tf.data` best practices](https://www.tensorflow.org/guide/data_performance) (better performance with the `tf.data` API).\n",
"7. Right now our model works well, however, we have to write code to interact with it. You could turn it into a small machine learning app using [Gradio](https://www.gradio.app/) so people can upload their own images of dogs and see what the model predicts. See the [example for image classification with TensorFlow and Keras](https://www.gradio.app/guides/image-classification-in-tensorflow) for an idea of what you could build. TK - show example of Gradio app running with Dog Vision model\n",
"7. Right now our model works well, however, we have to write code to interact with it. You could turn it into a small machine learning app using [Gradio](https://www.gradio.app/) so people can upload their own images of dogs and see what the model predicts. See the [example for image classification with TensorFlow and Keras](https://www.gradio.app/guides/image-classification-in-tensorflow) for an idea of what you could build. \n",
"\n",
"UPTOHERE, create dog vision app\n",
"TK - show example of Gradio app running with Dog Vision model\n",
"\n",
"In this project we've only really scratched the surface of what's possible with TensorFlow/Keras and deep learning.\n",
"\n",
"For a more comprehensive overview of TensorFlow/Keras, see the following:\n",
"* [14-hour TensorFlow Tutorial on YouTube](https://youtu.be/tpCFfeUEGs8?si=HqWQ9BdLkV6YWgcF) (this is the first 14 hours of the ZTM TensorFlow course).\n",
"* [Zero to Mastery TensorFlow for Deep Learning course](https://dbourke.link/ZTMTFcourse) (a 50+ hour course diving into many applications of TensorFlow and deep learning).\n"
]
},
{
Expand All @@ -10862,7 +10865,7 @@
"id": "3K5fwoN4qE9n"
},
"source": [
"### TK - Extension example: data augmentation\n",
"### Extension example: data augmentation\n",
"\n",
"Data augmentation is a regularization technique to help prevent overfitting.\n",
"\n",