This repository has been archived by the owner on Nov 16, 2020. It is now read-only.
Lines extracted with tools from OCR-D show wrong vertical offsets (see OCR-D/format-converters#16). The PRImA page viewer shows that this PAGE file has lots of word boxes which don't cover the word. They are even outside of the corresponding line. There are also lines outside of their text region. Such basic errors should be made impossible by the Transkribus user interface.
Those effects may occur when transcriptions are edited on a line basis: in that case the word coordinates are not automatically synced.
Also, it is true that there are no checks on whether lines or words overlap with their parent shapes. Compared to the PRImA utils, Transkribus takes a more liberal approach here, mostly because its primary focus is on creating GT for HTR, which only takes the baselines and the corresponding text into account.
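For anyone who wants to detect such files before feeding them to other tools, here is a minimal sketch of the kind of parent/child check discussed above. It is not part of Transkribus or the PRImA utils. It assumes a PAGE XML version in which each `TextRegion`, `TextLine` and `Word` has a `Coords` child with a `points` attribute of the form `x1,y1 x2,y2 ...` with integer coordinates, and it compares axis-aligned bounding boxes only, which is a simplification of true polygon containment. The function names (`bbox`, `find_escaping_shapes`, etc.) are made up for this example.

```python
import xml.etree.ElementTree as ET

# Which child element type is expected inside which parent type.
CHILD_OF = {"TextRegion": "TextLine", "TextLine": "Word"}


def local(tag):
    """Strip the XML namespace so the check works for any PAGE schema version."""
    return tag.rsplit("}", 1)[-1]


def bbox(elem):
    """Return (xmin, ymin, xmax, ymax) from Coords/@points, or None if absent."""
    for child in elem:
        if local(child.tag) == "Coords":
            points = child.get("points")
            if not points:
                return None
            xy = [tuple(int(v) for v in p.split(",")) for p in points.split()]
            xs = [x for x, _ in xy]
            ys = [y for _, y in xy]
            return min(xs), min(ys), max(xs), max(ys)
    return None


def inside(inner, outer):
    """True if bounding box `inner` lies completely within `outer`."""
    return (inner[0] >= outer[0] and inner[1] >= outer[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


def find_escaping_shapes(root):
    """List (child_type, child_id, parent_id) for children outside their parent."""
    problems = []
    for parent in root.iter():
        child_name = CHILD_OF.get(local(parent.tag))
        if child_name is None:
            continue
        parent_box = bbox(parent)
        if parent_box is None:
            continue
        for child in parent:
            if local(child.tag) != child_name:
                continue
            child_box = bbox(child)
            if child_box is not None and not inside(child_box, parent_box):
                problems.append((child_name, child.get("id"), parent.get("id")))
    return problems
```

Calling `find_escaping_shapes(ET.parse("page.xml").getroot())` (with a hypothetical file name) returns one tuple per word that leaves its line and per line that leaves its region. An empty list only means the bounding boxes nest; it says nothing about whether the boxes actually cover the glyphs.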
The NZZ PAGE XML file was created by Transkribus.