Recognition fails on images outside the dataset #21

Closed
wzlxjtu opened this issue Mar 18, 2019 · 8 comments
@wzlxjtu

wzlxjtu commented Mar 18, 2019

Hi, I have a question about recognizing images that are not from the dataset. I tried to crop images from scientific papers, which have a similar size to the images in the dataset. However, the output fails completely even for simple formulas. For debugging, I also took a screenshot of an image from the dataset, but the output for the screenshot fails even though the original image succeeds. The screenshot and the original image are almost the same, which really confuses me. Am I missing some preprocessing steps, or do I need to re-train the model with different image sizes? I hope this question doesn't sound too naive, but I really need some help. Thank you very much!

@da03
Collaborator

da03 commented Mar 18, 2019

Hi @wzlxjtu, since this model is only trained on im2latex-100k, without any font/environment variations, it is normal for neural networks to fail to generalize to other scientific domains, even though the images look very similar to humans. To get a more robust model, you might need to construct a training dataset with the same level of noise (e.g., if you want to handle scientific papers, you might need to render LaTeX in various font sizes and font families).

However, it's strange that the screenshots fail. I think you might need to rescale the screenshots to match the font size of the training images (e.g., if '\lambda' is 8-by-10 pixels in the training set, you might need to rescale the screenshot so that the glyph stays the same size).
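
For example, a minimal sketch of that rescaling with PIL (the scale factor is hypothetical; you would estimate it by measuring a known glyph in a training image and in your screenshot, and the file names are placeholders):

```python
from PIL import Image

# Hypothetical factor: e.g. '\lambda' is 10 px tall in the training
# images but 14 px tall in the screenshot.
scale = 10 / 14

img = Image.open("screenshot.png")  # placeholder file name
size = (round(img.width * scale), round(img.height * scale))
img.resize(size, Image.LANCZOS).save("screenshot_rescaled.png")
```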

@da03
Collaborator

da03 commented Mar 18, 2019

btw, for the screenshots, you might also need to make sure that they are in grayscale, and downsampled by 2 if you took a screenshot of an unpreprocessed image.
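
A minimal sketch of those two fixes with PIL (file names are placeholders):

```python
from PIL import Image

img = Image.open("screenshot.png").convert("L")  # ensure grayscale
img = img.resize((img.width // 2, img.height // 2),
                 Image.LANCZOS)                  # downsample by 2
img.save("screenshot_preprocessed.png")
```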

@wzlxjtu
Author

wzlxjtu commented Mar 19, 2019

Hi @da03, I found out that it's the padding on the left and top that plays a critical role. The padding should be 4 pixels (as stated in your paper: 8 pixels, then downsampled by 2). After I got the padding correct, I got output that makes sense. However, it seems the way you downsample the image is also critical for precision. I tried to linearly downsample the original images in the IM2LATEX-100K dataset but could not reproduce your preprocessed images. Take bc13232098.png for example.

Yours: [image]
Mine: [image]
Original: [image]

Did you downsample the images with a Gaussian filter or something like that? Am I missing some other important preprocessing step? I tried to find this information, but it seems this step was not documented. I really appreciate your help!

@da03
Collaborator

da03 commented Mar 20, 2019

Hmm interesting. I used LANCZOS resampling:
https://github.com/harvardnlp/im2markup/blob/master/scripts/utils/image_utils.py#L56
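
Putting the pieces from this thread together, a rough approximation of that preprocessing with PIL (a sketch, not a drop-in replacement for the linked script; file names are placeholders):

```python
from PIL import Image, ImageOps

img = Image.open("formula.png").convert("L")    # grayscale
img = ImageOps.expand(img, border=8, fill=255)  # 8 px white padding (4 px after downsampling)
img = img.resize((img.width // 2, img.height // 2),
                 Image.LANCZOS)                 # LANCZOS downsample by 2
img.save("formula_preprocessed.png")
```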

@wzlxjtu
Author

wzlxjtu commented Mar 20, 2019

Oh! Really appreciate it!

@wzlxjtu wzlxjtu closed this as completed Mar 20, 2019
@HongChow

HongChow commented Sep 2, 2019

> btw, for the screenshots, you might also need to make sure that they are in grayscale, and downsampled by 2 if you took a screenshot of an unpreprocessed image.

@da03, could you please tell me why it needs to be downsampled by 2?
Thanks a lot!

@da03
Collaborator

da03 commented Sep 3, 2019

It's because during preprocessing we downsampled by 2. Since deep neural networks do not work well on out-of-domain data, at test time we need to apply the same preprocessing. To get a model that's robust to different resolutions or color maps, we would need to add those transformations/noise during training as well.
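
A hedged sketch of what one such training-time transformation could look like (the scale range is illustrative only):

```python
import random
from PIL import Image

def jitter_resolution(img: Image.Image) -> Image.Image:
    """Randomly rescale a training image to simulate resolution noise."""
    scale = random.uniform(0.8, 1.2)  # illustrative range
    size = (round(img.width * scale), round(img.height * scale))
    return img.resize(size, Image.LANCZOS)
```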

@vyaslkv

vyaslkv commented Jan 4, 2020

I am also facing the same issue: I'm not getting results for images outside the test dataset. I did the preprocessing step using the command below, but still don't get sensible results:

```
onmt_preprocess -data_type img \
    -src_dir data/im2text/images/ \
    -train_src data/im2text/src-train.txt \
    -train_tgt data/im2text/tgt-train.txt \
    -valid_src data/im2text/src-val.txt \
    -valid_tgt data/im2text/tgt-val.txt \
    -save_data data/im2text/demo \
    -tgt_seq_length 150 \
    -tgt_words_min_frequency 2 \
    -shard_size 500 \
    -image_channel_size 1
```
