Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

various imagecreatefromjpeg errors - causing crash - rather than skipping image #149

Closed
james-cook opened this issue Jun 12, 2019 · 24 comments

Comments

@james-cook
Copy link

james-cook commented Jun 12, 2019

These errors:

Error	PHP	imagecreatefromjpeg(): gd-jpeg: JPEG library reports unrecoverable error: Unsupported marker type 0xf7 at /var/www/nextcloud/lib/private/legacy/image.php#560		2019-06-12T11:52:21+0100
Error	PHP	imagecreatefromjpeg(): gd-jpeg: JPEG library reports unrecoverable error: Invalid JPEG file structure: two SOI markers at /var/www/nextcloud/lib/private/legacy/image.php#560		2019-06-12T11:52:19+0100
Error	PHP	imagecreatefromjpeg(): gd-jpeg: JPEG library reports unrecoverable error: Unsupported marker type 0x45 at /var/www/nextcloud/lib/private/legacy/image.php#560		2019-06-12T11:49:05+0100
Error	PHP	imagecreatefromjpeg(): gd-jpeg: JPEG library reports unrecoverable error: Unsupported marker type 0xf4 at /var/www/nextcloud/lib/private/legacy/image.php#560

cause face:background_job to crash.

I have the command in a while loop and this at least gets me further (not sure why - images sorted differently each time?) so I'm up to 11794 of 12046 images :)

Should be that such images don't cause a crash - just get handled in another way.
I checked one of the images it chokes on - a panorama picture taken on Samsung S7.
This file also does not display in the gallery so the nextcloud platform generally has a problem with the file (the file displays fine on W10 etc.)

Checking the file on the server:

identify 20190403_180749.jpg
20190403_180749.jpg JPEG 9472x3872 9472x3872+0+0 8-bit sRGB 23.09MB 0.000u 0:00.059
identify: Invalid SOS parameters for sequential JPEG `20190403_180749.jpg' @ warning/jpeg.c/JPEGWarningHandler/352.

@stalker314314
Copy link
Collaborator

Huh, definitively a bug, thanks for reporting. If you can send that image here or privately at @.org (b...o@k...ic.org), it would help to repro it. At worst, we should flag this image as problamatic and skip it. If you need quick mitigation, you can mark it directly in database with isProcessed=true and error='any non-null string' for that row in oc_face_recognition_images table.

And yes, we are shuffling images on each run for different reasons (evenly spreading ETA calculation, spreading faces clustering across different images, as face clustering can happen before all images are scanned, and we can pick up only one vacation folder...). But don't rely on that, it can take forever to restart it all the time to skip this image:)

@james-cook
Copy link
Author

email sent

@james-cook
Copy link
Author

james-cook commented Jun 12, 2019

FWIW this seems to be a known problem with Samsung panorama shots, see e.g.
https://developer.samsung.com/forum/thread/s7-saving-invalid-jpeg-format-for-panoramas/201/303807?boardName=SDK&startId=zzzzz~&searchSubId=0000000045

And there's a suggestion on how to clean the broken images too (haven't tried it yet)
at http://www.imagemagick.org/discourse-server/viewtopic.php?t=30320

I had a similar problem (taken with a Samsung S5) but it was gone after I cleaned up the images with IrfanView JPEG lossless conversion plugin, option "None (can be used for optimizing and cleaning)". So it indeed seems to be a data structure issue with the JPEG file. Curiously, it does not happen to all pictures I take with this phone.

I'll try cleaning before using the database option you mention

@matiasdelellis
Copy link
Owner

@james-cook
Copy link
Author

If it's fixed in the lib that would be the best solution (I'll try it after lunch)

@james-cook
Copy link
Author

In general though I think any such errors should always be gracefully handled rather than cause a crash.

@matiasdelellis
Copy link
Owner

In general though I think any such errors should always be gracefully handled rather than cause a crash.

Of course @james-cook
In general, errors should be ignored elegantly, and we will try to do it, but many times it exceeds us.

Now thanks to your report we are going to investigate it, and we will try to improve. 😄

@james-cook
Copy link
Author

Sorry if I adopted a harsh tone there.
I have to say I'm really impressed with the new facerecognition feature and I would like to congratulate you all on a really good job! :)

@james-cook
Copy link
Author

I upgraded libgd from 2.1.1 to 2.2.5
sudo apt-get --only-upgrade install libgd3
and now everything works fine!

Thanks!

@matiasdelellis
Copy link
Owner

Sorry if I adopted a harsh tone there.

Do not worry, nothing sounds bad... 😃

and now everything works fine!

Wow.. Many thanks to you.
However still we will take it into account to improve, and minimally add it to the FAQ, for future reports. 😬

@james-cook
Copy link
Author

Yep, adding to the FAQ is a great idea - it's the first place I went to originally.

@james-cook
Copy link
Author

Just to note:
I suppose the error-level coming from gd is now correct - so there is no crash.
I don't think the image is processed, just ignored.
Not tested yet.

@stalker314314
Copy link
Collaborator

  1. I tentatively added entry to FAQ
  2. For me, solution wasn't to upgrade to 2.2. I got image from James (thanks!), I do have 2.2.5 (as confirmed with var_dump(gd_info());). PHP just crashes without any error/warning. It crashes in image.php:560:imagecreatefromjpeg. This is not something we can "catch" as exception, I think. I don't see warning even with gd.jpeg_ignore_warning set to 1 nor with error_reporting(E_ALL & ~E_NOTICE);

@james-cook It would be good if you can confirm that either image is skipped or processed (not appearing again when doing face:background_job again and again), so we know at least it works for someone:)

@james-cook
Copy link
Author

james-cook commented Jun 17, 2019

The images are skipped I think.

I haven't checked to see if the individual images appear in clusters (I have 12000 files and little time due to other bugs(!)) but:

  1. the job (backgroud_job) does now "complete" - before it always choked on the first "corrupt" image - and then stopped.
  2. in the preview generator there are now "exceptions" in the log, but the images do not appear.

So IMHO all the new gd leads to is "correct" error levels avoiding a hard crash. I'm not sure I'd recommend an upgrade right now - as I'm having another problem related (it seems) to imagemagick / gd conflicts elsewhere in Nextcloud...

@james-cook
Copy link
Author

james-cook commented Jun 17, 2019

Looking at the DB I see (for the file I sent to Branko):

| fileid | storage | path                                                   | path_hash                        | parent | name                | mimetype | mimepart | size    | mtime      | storage_mtime | encrypted | unencrypted_size | etag                             | permissions | checksum | id   | user    | file   | model | is_processed | error                                         | last_processed_time | processing_duration |
+--------+---------+--------------------------------------------------------+----------------------------------+--------+---------------------+----------+----------+---------+------------+---------------+-----------+------------------+----------------------------------+-------------+----------+------+---------+--------+-------+--------------+-----------------------------------------------+---------------------+---------------------+
| 280672 |       2 | files/InstantUpload/Camera/2019/04/20190408_173619.jpg | 8ac4bb652e688ae34fcbf4f64e616e20 | 279310 | 20190408_173619.jpg |        4 |        3 | 7205118 | 1554918530 |    1554918530 |         0 |                0 | a933dafa7164ef6ddbf0bb2ce1a93668 |          27 |          | 2694 | xxxxxxx | 280672 |     1 |            1 | Image is not valid, probably cannot be loaded | 2019-06-12 12:48:21 | 

@james-cook
Copy link
Author

Today I'll try the script at: https://gist.github.com/bcyrill/e59fda6c7ffe23c7c4b08a990804b269 (as mentioned in one of the issues above). I'll see how it goes with these panorama files...

@stalker314314
Copy link
Collaborator

Well, judging by DB content and your description of background_job, I would say problem is "solved" on our side ("solved" in broadest sense of that word:D). Since I referenced this issue in FAQ, feel free to dump whatever information you have on that Python script or whatever you else think can solve problem in here (for next poor desperate soul having same issue:D)

@james-cook
Copy link
Author

james-cook commented Jun 18, 2019

Ok, these are the findings:

+--------+-----------------------------------------------------------------------+-------+--------------+-----------------------------------------------+
| fileid | path                                                                  | model | is_processed | error                                         |
+--------+-----------------------------------------------------------------------+-------+--------------+-----------------------------------------------+
| 292059 | files/Test/T1/in_pano.jpg                                             |     1 |            1 | NULL                                          |
| 292053 | files/Test/T1/in.jpg                                                  |     1 |            1 | Image is not valid, probably cannot be loaded |


in.jpg is the original S7 panorama picture, in_pano.jpg file is the output from the python script.

As we can see the script does a successful repair!

Some side notes:

  • The latest ImageMagick 7.0.8-49 no longer reports an "SOS" error with identify - (before I had 6.8.8-9) on these files.

Questions(!):

  1. how to RE-process the existing panorama files? Am I right in thinking I should just set is_processed back to 0 for these files (after having corrected them on the filesystem) and then re-run occ face:background_job?
  2. Related to 1. above, originally I tried deleting the entry from oc_face_recognition_images AND oc_filecache and then running file:scan --all but this did not lead to new entries in the oc_face_recognition_images table as I expected. Any thoughts on this?

More general question: I'm a bit confused as to when Nextcloud uses GD and/or Imagemagick. Is there a way to force one or the other or in general control which tool is used when (background is some conflicts I seem to be seeing elsewhere (preview generation)):

php ImagickException:
JPEG parameter struct mismatch: library thinks size is 520, caller expects 512
http://www.pacsone.net/forum/viewtopic.php?t=1545 - recommends removing GD!

@stalker314314
Copy link
Collaborator

So, once you enable app, it is setting hooks (callbacks), so any new image added or removed is "noted". When you added in_pano.jpg, it should be written to oc_face_recognition_images (with is_processed as false). Next run of face:background_job will pick it up and process it. Based on your table, I think this is what happened already and you should not do anything! Not sure what you will get if you re-process it again, and why you want that, and it is error-prone to set this, but it boils down to this:

  • Set is_processed to 0
  • Delete all oc_face_recognition_faces that have image column pointing to this image id
  • For each face you deleted, note their person column and set oc_face_recognition_persons.is_valid to 0 (or just set all entries in oc_face_recognition_persons as invalid with UPDATE oc_face_recognition_persons SET is_valid=0

Again, image processing is made so it is idempotent, so double-think before you do this, as I don't see need for it:) I mean, if you process some image again, you will get same results:)

Regarding 2, I am not sure what you did to get into this state. Just having app enabled (and enabled for user!) should add entry to oc_face_recognition_images whenever you add new image to filesystem (if it is not external or shared with you)! file:scan will not help here and there is no way to "manually" add image if file is there and not in oc_face_recognition_images (maybe we should have command for that), only way is to do that via resetting, or maybe this can help too:

UPDATE oc_preferences SET config_value=0 WHERE user=your_username AND appid='facerecognition' AND configkey='full_image_scan_done'`

This should tell our app that initial scanning is not done and will force rescan of all images (which are in oc_filecache, but not in oc_face_recognition_images) and add them. Hope this clarifies a bit.

@matiasdelellis
Copy link
Owner

Hi @james-cook

More general question: I'm a bit confused as to when Nextcloud uses GD and/or Imagemagick. Is there a way to force one or the other or in general control which tool is used when (background is some conflicts I seem to be seeing elsewhere (preview generation)):

GD and imagemagick are two completely different libraries. There is no way to force one or the other, unless it is specifically implemented in the code.

i.e: https://www.sitepoint.com/imagick-vs-gd/

php ImagickException:
JPEG parameter struct mismatch: library thinks size is 520, caller expects 512
http://www.pacsone.net/forum/viewtopic.php?t=1545 - recommends removing GD!

In that forum, they discuss the installation problems of PacsOne server <By the way, I do not know him..>, and I understand that they do this.
In our case, we use the Nextcloud libraries, and we will limit ourselves to them...

This is the code that fail to this issue:

Also note the '@', with which they deliberately avoid any error. 😞

Well, any change in that regard, should be done in @nextcloud/server 😕

@matiasdelellis
Copy link
Owner

Hi,
I understand that this report was resolved, and I close it. Please, if necessary, open it again. 😄

@james-cook
Copy link
Author

The jpeg size mismatch error I mentioned above #149 (comment)

php ImagickException:
JPEG parameter struct mismatch: library thinks size is 520, caller expects 512

also appears in #175

I never managed to solve this issue - the whole system was too fragile - I have many nextcloud instances and changing one thing to get an app "XX" to work for one instance caused problems in other nextcloud instances that didn't even use app "XX". Some installs need Camera RAW, others elasticsearch, others facerecognition etc. etc. Some worked only with Nextcloud version X, others only with version Y. The host also serves other non-Nextcloud servers(!) - I was chasing up and solving and reintroducing the same errors time and again - mostly to do with image handling and this was using a great deal of time.
See e.g. also: nextcloud/server#12876

This prompted me to isolate particular nextcloud install "types" and place them in docker - with just the dependencies they each need. I see since my initial facerecognition experiences last summer the discussions here in "issues" are also mentioninig docker etc.

Pointers: https://github.com/nextcloud/docker/blob/master/.examples/dockerfiles/full/fpm-alpine/Dockerfile
and I particularly liked:
https://github.com/ReinerNippes/nextcloud_on_docker
for its great ansible playbook

@matiasdelellis
Copy link
Owner

Hi @james-cook
I understand that you finally solved it, but to make sure I ask you.

@james-cook
Copy link
Author

Well I kind of hijacked my own thread here even if problems are "related".
I'm trying to nail down image dependencies and weird behaviour in newer issues based on progressive docker builds, see e.g. #216 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants