Model is too complex #21

nickmitchko · 2019-03-10T21:42:21Z

Hi,

Disclaimer: the suggestions below are an effort to reduce this model's size to where I can run it on my RPI3. My ML experience is solely in emotion detection.

I am looking over this code, because it's awesome, and looking at your model I believe it is too complex for the task at hand.

Too many convolutions: I see 23 convolutions and a respective number of pooling layers. This is too many; you are wasting too much space on useless data. Check the activation of your neurons as a percentage of the model size; you might be surprised how few are actually used.
Too many filters on convolutions: Many of the model convolutions have 512-1024 filters. This is way too many in my opinion -- past 32 or 64 filters is more or less useless at extracting meaningful data. I mean, just look at this example Sigma = 6 convolution.
Too large of an input picture: This is a harder problem to solve, but a more efficient model will arise when your input dimensions are smaller. I have not looked thoroughly through the code; however, you may want to explore recognizing the nozzle and carriage and taking a fixed-size bounding box around the estimated nozzle location. This would be done through classical computer vision as opposed to machine learning. Or just follow the GCODE around and estimate the printer head location / dimensions.
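
A minimal sketch of the fixed-crop idea from the last point, assuming the nozzle position is already available in pixel coordinates (how to obtain it, from classical CV or from GCODE, is not shown). The function names `crop_box` and `crop` are hypothetical and not part of this project's code.

```python
def crop_box(center_x, center_y, box_size, image_width, image_height):
    """Return (left, top, right, bottom) of a fixed-size square box around a
    point, shifted as needed so the box stays inside the image."""
    if box_size > image_width or box_size > image_height:
        raise ValueError("box_size must fit inside the image")
    half = box_size // 2
    # Clamp the top-left corner so the box never leaves the frame.
    left = min(max(center_x - half, 0), image_width - box_size)
    top = min(max(center_y - half, 0), image_height - box_size)
    return left, top, left + box_size, top + box_size


def crop(image_rows, box):
    """Crop a row-major image (a list of pixel rows) to the given box."""
    left, top, right, bottom = box
    return [row[left:right] for row in image_rows[top:bottom]]
```

For a 640x480 frame, `crop_box(320, 240, 128, 640, 480)` gives `(256, 176, 384, 304)`. Near the edges the box is shifted inward rather than shrunk, so the model would always see the same input size.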

If I have some free time this week I will look at the code and see if I can 'minify' the model and get it running on the Pi. That would be awesome and could be something you charge extra for as part of the service.

kennethjiang · 2019-03-10T23:46:25Z

Hi @nickmitchko, thank you for submitting this issue. It'll be awesome if you can simplify the model to a point that it can run on a Pi with comparable performance. We know that it'll get much better adoption if we can make it run on a Pi.

We are not too worried about not being able to charge $ if this model runs on a Pi. The first thing we want is to bring a good tool to the 3D printing community. Worst case is we have to find another source of income for ourselves, which is not the end of the world.

nickmitchko · 2019-03-11T01:23:27Z

Thanks for the quick reply. I'd love to help.

What sort of environment are you using to train the default model? Are you cloud hosted or on a dedicated machine? Also, do you have a dataset available for training? I was in the process of writing a simple search scraper and pulling some images from various providers; however, if there is an existing set available, that would be a better start. I saw some sample images from the octoprint plugin you have. Is that the training set?

Also, I'd like to produce models with different input sizes so the user can choose the performance vs. accuracy trade-off, e.g.:

64x64 - baseline cost n
128x128 - ~4n (four times the pixels, so roughly four times the computation)
... and so on
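
The scaling in the list above can be checked with a rough cost estimate. This is a sketch only, assuming a single stride-1, same-padded convolution layer with a 3x3 kernel; the helper names `conv_macs` and `conv_params` are hypothetical, and the layer sizes are illustrative rather than taken from the actual model.

```python
def conv_macs(height, width, in_channels, out_channels, kernel=3):
    """Multiply-accumulates for one stride-1, same-padded conv layer.

    Each of the height * width output positions computes out_channels
    dot products of length kernel * kernel * in_channels.
    """
    return height * width * kernel * kernel * in_channels * out_channels


def conv_params(in_channels, out_channels, kernel=3):
    """Weights plus biases for one conv layer (independent of input size)."""
    return kernel * kernel * in_channels * out_channels + out_channels


small = conv_macs(64, 64, 3, 32)
large = conv_macs(128, 128, 3, 32)
print(large / small)  # 4.0: doubling both sides quadruples the work

# Filter counts affect model size: compare a wide layer with a narrow one.
print(conv_params(512, 1024), conv_params(32, 64))
```

Compute grows with the input area, while the parameter count depends only on kernel size and channel counts, so shrinking the input speeds up inference without making the model file smaller.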

kennethjiang · 2019-03-11T03:36:36Z

We trained our current model on a GPU VM in the cloud.

For the training dataset, I'll share a link to the original timelapses behind the ones on https://app.thespaghettidetective.com/publictimelapses/ since those are the ones users have explicitly authorized us to share publicly. We do have some additional training data that users have shared with us but have not yet authorized us to share publicly, so we can't share it for now.

A scraper to pull publicly available images is a good idea. We are also trying to come up with a plan to get TSD's users to authorize us to use their anonymized data as training data. Hopefully we can come up with terms of use that most users feel comfortable with.

kennethjiang · 2019-03-11T13:47:57Z

The original timelapses that are authorized to share publicly can be found at: https://drive.google.com/drive/folders/1IpGAPbiHYDJFsTtlBEktLGq6niTvB_BJ?usp=sharing

nickmitchko · 2019-03-12T23:50:56Z

There's a lot there. I'll try to make some time this upcoming week to work on this :)

kennethjiang · 2019-08-22T12:26:20Z

@nickmitchko Do you still want to do anything about this issue? At the moment we don't see viable options to make the model simpler. If you haven't had a lot of luck on your end we can close this ticket.

nickmitchko · 2019-08-22T17:01:33Z

You can close this issue -- I was able to actually train a much smaller model with nolearn and then tensorflow libraries (easier to get working on the RPI) from those sample images you sent.

However, I didn't publish this code / trained model due to the nature of the business model you are providing. In the end the accuracy took a hit (about 92% training/validation and ~85% in practice using bad gcode). If you are interested, I can send you the architecture I used for the model.

(Edited for spelling)

kennethjiang · 2019-08-22T18:28:05Z

Sounds awesome! Would love to learn from what you did as well as to share our learnings/experiences. Please send an email to k@thespaghettidetective.com so that we can set up something to connect (maybe a video chat?) @nickmitchko

hongkongkiwi · 2019-09-19T11:38:20Z

@nickmitchko I think it's a bit unfair to open an issue, take up the developers' time and benefit from their work without contributing back.

Can you share your model with us? For myself, I would also love to run it on a Pi. It does not mean I won't help the developers; I think this project is excellent. But since you got a working solution with help from this project, I think it would be great to share back.

nickmitchko · 2019-10-02T21:14:37Z

Hi @hongkongkiwi,

I'm glad you're interested. I will talk to @kennethjiang about getting this working on the Pi and about the model I trained. I travel for a living, and with that and other things in my life I haven't had the time to follow up on this.

I'm currently traveling; once I get back home (next week), I can share what I've got.

voidbrain · 2020-02-21T00:13:25Z

@nickmitchko any news?

kennethjiang · 2020-02-29T02:48:36Z

Closing it due to inactivity

kennethjiang mentioned this issue Mar 25, 2019

Explain minimum requirements #38

Closed

kennethjiang closed this as completed Aug 22, 2019

kennethjiang reopened this Aug 22, 2019

kennethjiang closed this as completed Feb 29, 2020

hbrylkowski mentioned this issue Nov 8, 2020

Making dataset available #310

Closed

smartin015 mentioned this issue Mar 7, 2022

[Feature] Basic model training guide #589

Open

e-fominov mentioned this issue May 4, 2023

[Feature] Alternative runtime for AI model #787

Closed
