Compatibility with tesseract 4 #273

deepio · 2019-04-06T23:37:55Z

Audiveris: 5.1.0:6780b1f91
OCR Engine: Tesseract OCR, version 3.04.01

When will we see support for Tesseract 4.0?

maximumspatium · 2019-04-07T00:37:22Z

When will we see support for Tesseract 4.0?

I'm currently working on supporting Tesseract 4.0. Unfortunately, an upgrade attempt has revealed unforeseen problems:

An old "won't fix" bug has slipped into the 4.x branch, causing the Java Virtual Machine to crash when accessing libtesseract. The problem is that the Tesseract maintainers stubbornly refuse to fix this issue, repeatedly saying "that is not our bug", and thus break third-party software. Fortunately, Samuel Audet from the Javacpp project has recently fixed it; see "tesseract 4.0.0-1.4.4 crashes on Mac OS", bytedeco/javacpp-presets#694.
Tesseract's public API seems to have undergone some changes, so the way Audiveris communicates with the engine doesn't work anymore. This needs to be investigated and worked around.

Moreover, Tesseract's full-page mode has proven to perform rather poorly on text recognition in the presence of musical symbols. Lyrics and chords in particular are often affected because they use an uncommon layout and grammar. This is something we cannot work around easily. For the time being, Audiveris lets Tesseract perform one-shot text detection and recognition, relying on algorithms we have no control over. This needs to be reworked to allow multistage recognition/rejection using different parameters; see #44.
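To make the idea of multistage recognition/rejection concrete, here is a minimal sketch of one possible shape for it. It is not taken from #44 or from the Audiveris code base: `OcrResult`, `Recognizer`, `recognize_multistage` and the `min_confidence` threshold are all hypothetical names, and the recognizer is injected so that the sketch does not depend on any particular Tesseract binding.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class OcrResult:
    text: str
    confidence: float  # assumed to be normalized to 0.0 .. 1.0


# Hypothetical recognizer signature: (region, parameters) -> OcrResult.
Recognizer = Callable[[object, dict], OcrResult]


def recognize_multistage(
    region: object,
    recognize: Recognizer,
    stages: Sequence[dict],
    min_confidence: float = 0.8,
) -> Optional[OcrResult]:
    """Try each parameter set in order and return the first accepted result.

    A result is rejected when its text is blank or its confidence is below
    the threshold. Returns None when every stage is rejected.
    """
    for params in stages:
        result = recognize(region, params)
        if result.text.strip() and result.confidence >= min_confidence:
            return result
    return None
```

A caller would pass a list of parameter dictionaries (for example, different page segmentation modes) and treat a `None` return as "no reliable text in this region".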

maximumspatium · 2019-05-11T12:40:06Z

Update:

(JVM crash)

fixed

Tesseract's public API seems to have undergone some changes, so the way Audiveris communicates with the engine doesn't work anymore.

Audiveris relies on the information returned by LTRResultIterator::WordFontAttributes, which includes font properties (bold, italic, etc.) as well as font size. All of this is required by the Audiveris UI for displaying recognized text over the original picture.

Tesseract 4 has been redesigned in such a way that font information, except for character size, is no longer available; see tesseract-ocr/tesseract#1074

Support for font attributes is feasible but isn't available yet. According to the principal Tesseract developer, Ray Smith, this is one more reason for delaying deprecation of the v3 engine.
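As an illustration of what losing the font attributes means for a display layer, the sketch below models the attributes as a small record and falls back to a plain font of the reported size when the engine supplies no style information. The field names mirror the output parameters of LTRResultIterator::WordFontAttributes; the `FontAttributes` class, the `display_font` helper and the `"Serif"` default are assumptions made for this example and are not Audiveris code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FontAttributes:
    name: Optional[str]  # None when the engine reports no font
    bold: bool = False
    italic: bool = False
    underlined: bool = False
    monospace: bool = False
    serif: bool = False
    smallcaps: bool = False
    pointsize: int = 0


def display_font(
    attrs: Optional[FontAttributes],
    pointsize: int,
    default_name: str = "Serif",  # hypothetical default, chosen for this sketch
) -> FontAttributes:
    """Return attributes usable for display.

    When no style information is available, only the character size is
    kept and every style flag stays at its default (False).
    """
    if attrs is None or attrs.name is None:
        return FontAttributes(name=default_name, pointsize=pointsize)
    return attrs
```

With this shape, text recognized without font information would still be drawn at the right size, just without bold or italic styling.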

Many people recommend sticking with the old engine instead of switching to the new one. The reality is a bit different:

Most package managers have already switched to Tesseract 4; installing the older version is somewhat difficult and requires compiling from source.
Tesseract 4 shows improved accuracy and faster recognition for musical scores compared with its predecessor.

I'm currently redesigning Tesseract-related classes to support the new engine. Results will be reported shortly...

maximumspatium · 2019-05-25T12:08:07Z

After spending several days analyzing Tesseract 4's output via TessAPI, I found several serious problems preventing further adoption of the LSTM engine for our OMR task.
The biggest roadblock is that Tesseract 4 sometimes reports random characters with the bounding box set to the whole page. This is a known issue, reported by several people and still unfixed; see tesseract-ocr/tesseract#1192

I therefore decided to wait for the Tesseract team to fix all bounding box related issues first. Audiveris will stick to Tesseract 3.x for now.
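For readers who want to see what such a bogus result looks like in code, here is a minimal sketch of a filter that discards words whose bounding box spans (almost) the whole page. This is not what the thread decided to do, since Audiveris stayed on Tesseract 3.x instead. `Box`, `Word`, `covers_page`, `drop_page_sized_words` and the 0.9 ratio are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Box:
    left: int
    top: int
    width: int
    height: int


@dataclass(frozen=True)
class Word:
    text: str
    box: Box


def covers_page(box: Box, page_width: int, page_height: int, ratio: float = 0.9) -> bool:
    """True when the box is at least `ratio` of the page in both dimensions."""
    return box.width >= ratio * page_width and box.height >= ratio * page_height


def drop_page_sized_words(
    words: List[Word], page_width: int, page_height: int, ratio: float = 0.9
) -> List[Word]:
    """Keep only words whose bounding box does not span the whole page."""
    return [w for w in words if not covers_page(w.box, page_width, page_height, ratio)]
```

Such a filter only hides the symptom: the characters attached to a page-sized box are still lost, which is why waiting for an upstream fix was the safer choice.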

maximumspatium · 2019-06-28T14:58:11Z

I just tested Tesseract 4 in the legacy engine mode (OEM_TESSERACT_ONLY). It seems to work as expected. The updated code was pushed to the tess4 feature branch.

Please test it and give me feedback.
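For comparison outside Audiveris, the legacy engine can also be selected from the tesseract command line, where `--oem 0` corresponds to OEM_TESSERACT_ONLY. The sketch below only builds the argument list and does not run anything. The helper name `tesseract_command` and the file names in the usage note are made up for illustration.

```python
from typing import List, Sequence


def tesseract_command(
    image: str,
    output_base: str,
    lang: str = "eng",
    oem: int = 0,   # 0 = legacy engine only
    psm: int = 3,   # 3 = fully automatic page segmentation (the default)
    configs: Sequence[str] = (),
) -> List[str]:
    """Build the argument list for a tesseract CLI run in a given engine mode."""
    cmd = [
        "tesseract", image, output_base,
        "-l", lang,
        "--oem", str(oem),
        "--psm", str(psm),
    ]
    cmd.extend(configs)  # optional config names, appended last
    return cmd
```

Passing the result of `tesseract_command("page.tif", "page")` to `subprocess.run` would execute it. Note that legacy mode needs traineddata files that still contain the legacy model; LSTM-only traineddata will not work with `--oem 0`.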

deepio · 2019-07-03T15:42:50Z

I can confirm that it does not crash and that it produces MusicXML files, but these files are almost completely empty for the handful of TIFF files I tried. This is the full output for one of them:

<?xml version="1.0" ?>
<sheet last-persistent-id="0" number="1">
  <glyph-index></glyph-index>
</sheet>

deepio · 2020-04-18T02:05:14Z

Will close this issue and create a new one for the new issue.

stweil · 2022-01-09T15:52:16Z

I just tested Tesseract 4 in the legacy engine mode (OEM_TESSERACT_ONLY). It seems to work as expected.

That's what I expected, too. Tesseract 4 and even the latest Tesseract 5.0.1 are still compatible with Tesseract 3 in legacy mode. Why was the update abandoned? I noticed that there exist pre-built jar files which can be used for 4.0.0, but I could not find jar files for newer releases.

stweil · 2022-01-09T16:31:08Z

Now I could at least build with Tesseract 4.1.1 (based on your tess4 branch). See https://github.com/stweil/audiveris/tree/tess4.

hbitteur · 2022-01-09T18:21:24Z

@stweil
While investigating a new classifier in mid-2020 (see the head-classifier branch), I used Tesseract 4.1.0 to be able to build the whole software set.

But I did not really use OCR at that time; I was focusing on a new attempt at head recognition via a patch classifier. This work is on pause right now. It should get resurrected some day, but that's another story.
The purpose of my remark is to call your attention to the fact that the software can be built, but that says nothing about the quality of OCR (vs. the Tesseract 3.x engine) when applied to sparse textual elements as found in a music score.

If you could spend some time to evaluate the actual OCR results (of 4.x, and perhaps 5.x as you mentioned), we would all benefit from such experience.

stweil · 2022-01-09T18:29:46Z

In theory, Tesseract 4 and 5 in legacy mode should produce the same results as Tesseract 3 because they all use the same OCR engine (and the same kind of models), so the quality would be identical. Tesseract 5 would still be faster, include a lot of bug fixes and support more platforms (ARM, Apple M1, ...).

I have much experience with Tesseract, so I can help on that side. And I have no experience with Audiveris.

maximumspatium · 2022-01-09T18:44:51Z

@hbitteur Stefan asked why the tess4 branch has not been merged into master/development.
As of now, Audiveris still uses the ancient Tesseract 3.04; see

audiveris/build.gradle

Line 14 in 8671b09

ext.tessVersion = '3.04.01'

To my understanding, nothing prevents us from switching to the newer Tesseract 4.1 or even 5.x as long as they run in the legacy engine mode. This will require changes available in the tess4 branch because the underlying API for accessing the new OCR engine was updated several years ago.

maximumspatium · 2022-01-09T18:48:01Z

@stweil

I noticed that there exist pre-built jar files which can be used for 4.0.0, but I could not find jar files for newer releases.

Audiveris doesn't use pre-built binaries. It uses the javacpp-presets wrapper for accessing Tesseract. The recent javacpp-presets release supports Tesseract 5.0 by default.

hbitteur · 2022-01-09T19:02:41Z

@maximumspatium
Yes, let's try to move to 5.x in legacy mode

maximumspatium · 2022-01-09T19:04:18Z

I'll go ahead and switch to javacpp-presets 1.5.6 then.

stweil · 2022-01-09T19:11:07Z

I just tried Audiveris with 4.1.1, and that seems to work fine. The modifications from your tess4 branch were sufficient (I only rebased those changes to the latest code in https://github.com/stweil/audiveris/tree/tess4).

maximumspatium · 2022-04-20T22:23:52Z

@stweil I finally switched Audiveris to Tesseract 4.1.1, see ce97610

I also tried Audiveris with Tesseract 5.0.1. Unfortunately, libtesseract crashes the JVM on my macOS 10.13 machine, probably because the binaries were compiled for macOS 10.15. I need to rebuild Javacpp-presets for my system to be able to test the recent OCR engine.

maximumspatium · 2022-04-20T22:37:45Z

I have much experience with Tesseract, so I can help on that side

@stweil We're experiencing issues with Tesseract sometimes reporting unreliable symbol positions when running in the full page mode.
Original image:

Recognition result:

Selecting the area and letting Tesseract recognize it again usually produces better results:

It looks like a bug in the Tesseract API I never managed to catch.
Any idea how to fix that?

stweil · 2022-04-22T09:18:40Z

Do you get those wrong positions also when the same page is processed by the tesseract executable?

maximumspatium · 2022-04-23T19:01:22Z

Do you get those wrong positions also when the same page is processed by the tesseract executable?

The tesseract executable reports correct symbol positions in the XML output.

I tried two different page segmentation modes and got similar results:

PSM=3, Tesseract's default, also used by Audiveris:

<TextLine ID="line_3" HPOS="139" VPOS="392" WIDTH="536" HEIGHT="39">
   <String ID="string_10" HPOS="139" VPOS="392" WIDTH="226" HEIGHT="39" WC="0.84" CONTENT="Arrangement"/><SP WIDTH="14" VPOS="392" HPOS="365"/>
   <String ID="string_11" HPOS="379" VPOS="402" WIDTH="4" HEIGHT="21" WC="0.89" CONTENT=":"/><SP WIDTH="16" VPOS="402" HPOS="383"/>
   <String ID="string_12" HPOS="399" VPOS="392" WIDTH="94" HEIGHT="31" WC="0.83" CONTENT="Alain"/><SP WIDTH="12" VPOS="392" HPOS="493"/>
   <String ID="string_13" HPOS="505" VPOS="393" WIDTH="170" HEIGHT="30" WC="0.89" CONTENT="BRUNET"/>
</TextLine>

PSM=11, i.e. "find as much text as possible":

<TextLine ID="line_2" HPOS="139" VPOS="392" WIDTH="536" HEIGHT="39">
   <String ID="string_8" HPOS="139" VPOS="392" WIDTH="226" HEIGHT="39" WC="0.82" CONTENT="Arrangement"/><SP WIDTH="14" VPOS="392" HPOS="365"/>
   <String ID="string_9" HPOS="379" VPOS="402" WIDTH="4" HEIGHT="21" WC="0.89" CONTENT=":"/><SP WIDTH="16" VPOS="402" HPOS="383"/>
   <String ID="string_10" HPOS="399" VPOS="392" WIDTH="94" HEIGHT="31" WC="0.83" CONTENT="Alain"/><SP WIDTH="12" VPOS="392" HPOS="493"/>
   <String ID="string_11" HPOS="505" VPOS="393" WIDTH="170" HEIGHT="30" WC="0.89" CONTENT="BRUNET"/>
</TextLine>

I assume a bug somewhere in the public API.
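To compare the positions reported by the executable with those coming through the API, the word boxes can be pulled out of the ALTO output with a few lines of standard-library Python. This is a minimal sketch run against the PSM=3 excerpt quoted above; `word_boxes` and `inside` are hypothetical helper names. The excerpt carries no XML namespace, whereas a complete ALTO file normally declares one, so the tag names would need to be namespace-qualified there.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# The PSM=3 excerpt from this comment, copied verbatim.
ALTO_LINE = """<TextLine ID="line_3" HPOS="139" VPOS="392" WIDTH="536" HEIGHT="39">
   <String ID="string_10" HPOS="139" VPOS="392" WIDTH="226" HEIGHT="39" WC="0.84" CONTENT="Arrangement"/><SP WIDTH="14" VPOS="392" HPOS="365"/>
   <String ID="string_11" HPOS="379" VPOS="402" WIDTH="4" HEIGHT="21" WC="0.89" CONTENT=":"/><SP WIDTH="16" VPOS="402" HPOS="383"/>
   <String ID="string_12" HPOS="399" VPOS="392" WIDTH="94" HEIGHT="31" WC="0.83" CONTENT="Alain"/><SP WIDTH="12" VPOS="392" HPOS="493"/>
   <String ID="string_13" HPOS="505" VPOS="393" WIDTH="170" HEIGHT="30" WC="0.89" CONTENT="BRUNET"/>
</TextLine>"""

# (content, left, top, width, height)
WordBox = Tuple[str, int, int, int, int]


def word_boxes(textline_xml: str) -> List[WordBox]:
    """Return one (content, left, top, width, height) tuple per String element."""
    line = ET.fromstring(textline_xml)
    return [
        (
            s.get("CONTENT"),
            int(s.get("HPOS")),
            int(s.get("VPOS")),
            int(s.get("WIDTH")),
            int(s.get("HEIGHT")),
        )
        for s in line.iter("String")  # SP (space) elements are skipped
    ]


def inside(word: WordBox, line: Tuple[int, int, int, int]) -> bool:
    """True when the word box lies within the (left, top, width, height) line box."""
    _, x, y, w, h = word
    lx, ly, lw, lh = line
    return lx <= x and ly <= y and x + w <= lx + lw and y + h <= ly + lh
```

For this excerpt every word box falls inside the enclosing TextLine box (139, 392, 536, 39), which is the kind of sanity check that could be applied to the positions returned by the API as well.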

stweil · 2022-04-24T07:49:50Z

I finally switched Audiveris to Tesseract 4.1.1, see ce97610

I just wanted to try the new code, but it looks like the Javacpp-presets are unavailable for M1 MacOS.

maximumspatium · 2022-04-24T13:14:38Z

I just wanted to try the new code, but it looks like the Javacpp-presets are unavailable for M1 MacOS.

@stweil That's true. Apparently, it's very easy to adapt Javacpp-presets to an unsupported architecture. Each preset includes a build script that compiles both the native library and its JNI bridge. If you can compile Tesseract on your M1, you will be able to compile its Java bindings. Unfortunately, I can't do it because I don't own an M1 Mac :)
Anyway, I compiled Javacpp-presets from source several times in the past. It was pretty easy.

FYI: bytedeco/javacpp-presets#1069

maximumspatium · 2022-04-24T20:10:37Z

Let's move our discussion regarding OCR issues to #575.

saudet · 2022-05-09T13:02:41Z

FYI, macosx-arm64 builds are now available, see issue bytedeco/javacpp-presets#814.
Please give it a try with the snapshots: http://bytedeco.org/builds/

maximumspatium changed the title ~~Compatibility with tesseract~~ Compatibility with tesseract 4 Apr 22, 2019

maximumspatium mentioned this issue Jun 11, 2019

Call audiveris from python script #289

Closed

maximumspatium mentioned this issue Jun 20, 2019

Installing Audiveris on Nvidia Jetson TX2 #251

Closed

maximumspatium mentioned this issue Jul 12, 2019

Could not initialize Tesseract with lang deu+eng+fra #297

Closed

deepio closed this as completed Apr 18, 2020

hallvors mentioned this issue Mar 25, 2021

NullPointerException #467

Closed

stweil mentioned this issue Jan 9, 2022

tesseract version #489

Closed

maximumspatium reopened this Jan 9, 2022

maximumspatium closed this as completed Apr 24, 2022
