Model is too complex #21

Closed
nickmitchko opened this issue Mar 10, 2019 · 12 comments

@nickmitchko

Hi,

Disclaimer: the suggestions below are an effort to reduce this model's size to the point where I can run it on my RPI3. My ML experience is solely in emotion detection.

I am looking over this code, because it's awesome, and looking at your model I believe it is too complex for the task at hand.

  • Too many convolutions: I see 23 convolutions and a corresponding number of pooling layers. That is too many; a lot of space is being wasted on useless data. Check what percentage of your neurons actually activate relative to the model size. You might be surprised how few are used.
  • Too many filters on convolutions: Many of the model's convolutions have 512-1024 filters. This is way too many in my opinion -- past 32 or 64 filters is more or less useless for extracting meaningful data. I mean, just look at this example: Sigma = 6 convolution.
  • Too large of an input picture: This is a harder problem to solve, but a more efficient model becomes possible when your input dimensions are smaller. I have not looked thoroughly through the code; however, you may want to explore recognizing the nozzle and carriage and taking a fixed-size bounding box around the estimated nozzle location. This would be done through classical computer vision as opposed to machine learning. Or just follow the GCODE around and estimate the printer head location / dimensions.
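To put rough numbers on the filter-count and cropping points above, here is a minimal sketch. It assumes plain 3x3 convolutions and a 640x480 camera frame; neither is taken from the project's actual model or code, and the function names `conv_params` and `crop_box` are made up for illustration.

```python
def conv_params(in_channels, out_channels, kernel=3, bias=True):
    """Number of trainable parameters in one standard 2D convolution layer."""
    weights = kernel * kernel * in_channels * out_channels
    return weights + (out_channels if bias else 0)


def crop_box(center_x, center_y, size, frame_w, frame_h):
    """Fixed-size square crop around an estimated nozzle position.

    The box is shifted (not shrunk) so it always stays inside the frame.
    Returns (left, top, right, bottom) in pixels.
    """
    if size > frame_w or size > frame_h:
        raise ValueError("crop size must fit inside the frame")
    half = size // 2
    left = min(max(center_x - half, 0), frame_w - size)
    top = min(max(center_y - half, 0), frame_h - size)
    return left, top, left + size, top + size


if __name__ == "__main__":
    # Hypothetical layer shapes: 512 -> 1024 filters versus 32 -> 64 filters.
    big = conv_params(512, 1024)
    small = conv_params(32, 64)
    print(f"512->1024 filters: {big:,} parameters")
    print(f"32->64 filters:    {small:,} parameters")
    print(f"ratio: about {big / small:.0f}x")

    # Hypothetical 640x480 frame, nozzle estimated near the centre.
    print(crop_box(320, 240, 128, 640, 480))
```

Parameter count only measures model size. Whether accuracy survives the smaller filter counts still has to be checked by training.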

If I have some free time this week I will look at the code and see if I can 'minify' the model and get it running on the Pi. That would be awesome and could be something you charge extra for as part of the service.

@kennethjiang
Contributor

Hi @nickmitchko, thank you for submitting this issue. It'll be awesome if you can simplify the model to a point where it can run on a Pi with comparable performance. We know that it'll get much better adoption if we can make it run on a Pi.

We are not too worried about not being able to charge $ if this model runs on a Pi. The first thing we want is to bring a good tool to the 3D printing community. Worst case, we have to find another source of income for ourselves, which is not the end of the world.

@nickmitchko
Author

Thanks for the quick reply. I'd love to help.

What sort of environment are you using to train the default model? Are you cloud hosted or on a dedicated machine? Also, do you have a data set available for training? I was in the process of writing a simple search scraper and pulling some images from various providers; however, if there is an existing set available, that would be a better start. I saw some sample images from the OctoPrint plugin you have. Is that the training set?

Also, I'd like to produce models with different input sizes so the user can choose the performance vs. accuracy trade-off, e.g.:

  • 64x64 - baseline cost n
  • 128x128 - roughly 4n (four times the pixels, so about a quarter of the calculation speed)
  • ... and so on
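The scaling behind that list can be sketched as follows. This assumes convolutional layers dominate the cost, so the work grows with the number of input pixels. Fully connected layers and memory bandwidth are ignored, and `relative_cost` is a made-up helper name.

```python
def relative_cost(side, base_side=64):
    """Approximate convolution cost of a side x side input relative to the base size.

    Convolution work is proportional to the number of pixels, so the cost
    grows with the square of the side length.
    """
    return (side / base_side) ** 2


if __name__ == "__main__":
    for side in (64, 128, 256, 416):
        print(f"{side}x{side}: about {relative_cost(side):.1f}x the 64x64 cost")
```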

@kennethjiang
Contributor

We trained our current model on a GPU VM in the cloud.

For the training data set, I'll share a link to the original timelapses behind the ones on https://app.thespaghettidetective.com/publictimelapses/, since those are the ones users have explicitly authorized us to share publicly. We do have some additional training data that users have shared with us but have not yet authorized us to share publicly, so we can't share that for now.

A scraper to pull publicly available images is a good idea. We are also trying to come up with a plan to get TSD's users to authorize us to use their anonymized data as training data. Hopefully we can come up with terms of use that most users feel comfortable with.

@kennethjiang
Contributor

The original timelapses that we are authorized to share publicly can be found at: https://drive.google.com/drive/folders/1IpGAPbiHYDJFsTtlBEktLGq6niTvB_BJ?usp=sharing

@nickmitchko
Author

There's a lot there. I'll try to make some time this upcoming week to work on this :)

@kennethjiang
Contributor

@nickmitchko Do you still want to do anything about this issue? At the moment we don't see any viable options to make the model simpler. If you haven't had much luck on your end, we can close this ticket.

@nickmitchko
Author

nickmitchko commented Aug 22, 2019

You can close this issue -- I was actually able to train a much smaller model with the nolearn and then TensorFlow libraries (easier to get working on the RPI) from those sample images you sent.

However, I didn't publish this code / trained model due to the nature of the business model you are providing. In the end the accuracy took a hit (about 92% training/validation and ~85% in practice using bad gcode). If you are interested, I can send you the architecture I used for the model.

(Edited for spelling)

@kennethjiang
Contributor

Sounds awesome! We would love to learn from what you did as well as share our learnings/experiences. Please send an email to k@thespaghettidetective.com so that we can set up something to connect (maybe a video chat?) @nickmitchko

@hongkongkiwi

@nickmitchko I think it's a bit unfair to open an issue, take up the developers' time and benefit from their work without contributing back.

Can you share your model with us? I would also love to run it on a Pi. That does not mean I won't support the developers; I think this project is excellent. But since you got a working solution with help from this project, I think it would be great to share back.

@nickmitchko
Author

Hi @hongkongkiwi,

I'm glad you're interested. I will talk to @kennethjiang about getting this working on the Pi and about the model I trained. I travel for a living, and between that and other things in my life I haven't had the time to follow up on this.

I'm currently traveling; once I get back home (next week), I can share what I've got.

@voidbrain

@nickmitchko any news?

@kennethjiang
Contributor

Closing this due to inactivity.
