Internal Compiler Error with InceptionResnetV2 #241

Closed
salonit opened this issue Oct 23, 2020 · 9 comments

salonit commented Oct 23, 2020

Hi, I have created a classifier with Keras's InceptionResnetV2 and am getting the following error:

Edge TPU Compiler version 14.1.317412892

Internal compiler error. Aborting!

When I train the model on a smaller subset of the same dataset, it compiles successfully, but when the model is trained on the whole dataset, compilation fails with the error above. Any help would be appreciated.

Namburger commented Oct 23, 2020

@salonit Hello, so the only difference between the two models is the training dataset? No changes to the architecture?
Could you attach both models here?

salonit commented Oct 23, 2020

Yes @Namburger, the only difference between the two models is the training dataset; there are no changes to the architecture.

These are the links to the models:
model1
model2

Namburger commented

@Naveen-Dodda could you take a look?

ludgerh commented Oct 27, 2020

I have done extensive testing of compilation with many trained models from keras.applications, each used headless (without its top layers) and combined with my own two connection layers on top. I found that:

  • Small models are very likely to go through nicely. Of course, the quality of their results is low.
  • Very large models (like NASNet large, for example) never make it.
  • Moderate-size models (like InceptionResnetV2) only sometimes compile into working Edge TPU models. This means that even compiling the same model repeatedly can lead to different results. :-(
  • The probability of failure seems to rise as training progresses.
  • Large input data shapes (above 300 x 300) make failure more likely.
  • The best compromise so far between model quality and probability of compiler success is InceptionResnetV2.
  • For now, my solution is to automatically repeat the compilation process until the resulting file is present.

I would be happy to hear about a bug-fixed version of the compiler as soon as possible. If you need testers for possible betas: I am here and would be happy to help :-)
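
The retry workaround from the last bullet above can be sketched as a bounded loop. This is a minimal sketch, not the script actually used in this thread: `compile_with_retries`, `compile_once`, `edgetpu_compile`, and `max_attempts` are hypothetical names, and the compile step is passed in as a callable so the loop can be exercised without the Edge TPU compiler installed. It only checks that the output file is present, which is the criterion described in the bullet.

```python
import os
import subprocess
from typing import Callable


def compile_with_retries(compile_once: Callable[[], None],
                         output_path: str,
                         max_attempts: int = 10) -> int:
    """Call compile_once until output_path exists; return the attempts used.

    Raises RuntimeError if the file is still missing after max_attempts.
    """
    # Remove a stale result so an old file is not mistaken for success.
    if os.path.exists(output_path):
        os.remove(output_path)
    for attempt in range(1, max_attempts + 1):
        compile_once()
        if os.path.exists(output_path):
            return attempt
    raise RuntimeError(
        f"no output after {max_attempts} attempts: {output_path}")


def edgetpu_compile(model_path: str, out_dir: str) -> Callable[[], None]:
    """Build a compile step that shells out to edgetpu_compiler.

    Assumption: edgetpu_compiler is installed and on PATH.
    """
    def run() -> None:
        subprocess.call(["edgetpu_compiler", model_path, "-o", out_dir])
    return run
```

With hypothetical paths, a call might look like `compile_with_retries(edgetpu_compile('model_quant.tflite', 'out/'), 'out/model_quant_edgetpu.tflite')`. Capping the number of attempts avoids an endless loop for models that never compile, such as the very large ones mentioned above.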

ludgerh commented Oct 27, 2020

...I have to correct a part of what I wrote before: My slightly modified InceptionResnetV2 trains nicely, converts nicely, quantizes nicely and compiles nicely. But:
When trying to instantiate an Interpreter, I get:

Traceback (most recent call last):
  File "test.py", line 6, in <module>
    experimental_delegates=[tflite.load_delegate('libedgetpu.so.1')])
  File "/home/ludger/safe/sources/django/env/lib/python3.7/site-packages/tflite_runtime/interpreter.py", line 204, in __init__
    model_path, self._custom_op_registerers))
ValueError: Found too many dimensions in the input array of operation 'reshape'.

The code to reproduce:

import tflite_runtime.interpreter as tflite

modelpath = '/home/ludger/sftp/c_model_2/model/cam-ai_model_quant.tflite'
mymodel = tflite.Interpreter(
	model_path=modelpath)

If I am right, I have now hit issue #74. Correct?

ludgerh commented Oct 27, 2020

Here is my model before compiling (the upload did not work because of its size):
https://cam-ai.de/InceptionResNetV2.zip

ludgerh commented Oct 27, 2020

It gets more interesting every minute. With repeated attempts to compile this quantized model, I get one of the following:

  • Nothing. Just the error message @salonit mentioned before.
  • A converted file, 5 times the size of the original, that fails when instantiating a tflite interpreter.
  • A proper, working model for the Edge TPU.

Maybe it is better to put the edge TPU back into the box and wait for the compiler update?
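
The three outcomes listed above can be told apart programmatically. The following is a rough sketch under stated assumptions: `classify_compile_result`, `try_load`, and the `max_size_ratio` threshold are hypothetical, and the size check merely mirrors the "5 times the size" observation from this comment rather than any documented compiler behavior.

```python
import os
from typing import Callable


def classify_compile_result(source_path: str,
                            output_path: str,
                            try_load: Callable[[str], None],
                            max_size_ratio: float = 3.0) -> str:
    """Return 'missing', 'oversized', 'unloadable', or 'ok' for one attempt.

    try_load should raise an exception if the compiled file cannot be loaded.
    max_size_ratio is an assumed heuristic threshold, not a compiler rule.
    """
    # Outcome 1: the compiler aborted and wrote no file.
    if not os.path.exists(output_path):
        return "missing"
    # Outcome 2 (heuristic): the output is far larger than the input model.
    ratio = os.path.getsize(output_path) / max(os.path.getsize(source_path), 1)
    if ratio > max_size_ratio:
        return "oversized"
    # A file of plausible size may still fail to load.
    try:
        try_load(output_path)
    except Exception:
        return "unloadable"
    # Outcome 3: the file exists, has a plausible size, and loads.
    return "ok"
```

For a real check, `try_load` could be something like `lambda p: tflite.Interpreter(model_path=p)`, with `tflite` imported as in the snippets in this thread. Keeping the loader injectable means the classification logic can be tried out on a machine without tflite_runtime.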

ludgerh commented Nov 2, 2020

This fix works for me and my Keras model based on inception_resnet_v2. 100% success so far...

import subprocess
from os import path, remove

import tflite_runtime.interpreter as tflite

print('++++++ Compiling... ++++++')

# myschool.dir is the model base directory; it is not defined in this snippet.
testmodelpath = myschool.dir+'model/cam-ai_model_quant_edgetpu.tflite'
# Remove a stale output so a leftover file is not mistaken for success.
if path.exists(testmodelpath):
    remove(testmodelpath)
cmd = ['edgetpu_compiler', myschool.dir+'model/cam-ai_model_quant.tflite', '-o', myschool.dir+'model/']
while True:
    subprocess.call(cmd)
    try:
        # Loading fails if the compiler wrote no file or a broken one.
        tflite.Interpreter(model_path=testmodelpath)
    except Exception:
        print('Model not OK, retrying...')
        continue
    print('Model OK.')
    break

Naveen-Dodda commented

Please reopen if you have more issues.
