UB-Mannheim/tesseract: public repository, forked from tesseract-ocr/tesseract. Apache-2.0 license. Forks: 462. Stars: 3.4k.
Tesseract at UB Mannheim
The Mannheim University Library (UB Mannheim) uses Tesseract to perform text recognition (OCR = optical character recognition) for historical German newspapers (Allgemeine Preußische Staatszeitung, Deutscher Reichsanzeiger). The latest results with text from more than 700000 pag...
for Windows see https://github.com/UB-Mannheim/tesseract/wiki
for Linux, Mac see https://tesseract-ocr.github.io/tessdoc/Installation.html
pdftoppm from poppler library is downloaded and installed
some hints for the installation: https://github.com/UB-Mannheim/zotero-ocr/wiki/Install-pdftoppm ...
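The two programs this passage points to, Tesseract (via the linked installation pages) and pdftoppm, can be checked for with a few lines of Python. This is a minimal sketch, assuming the executables are named `tesseract` and `pdftoppm` and are reachable through `PATH`. The helper names `find_tools` and `missing_tools` are hypothetical, and a tool installed outside `PATH` (common on Windows) will be reported as missing even though it exists.

```python
import shutil

# Executable names assumed from the text above; adjust if your install differs.
REQUIRED_TOOLS = ("tesseract", "pdftoppm")


def find_tools(names=REQUIRED_TOOLS):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(names=REQUIRED_TOOLS):
    """Return the names of tools that could not be found on PATH."""
    return [name for name, path in find_tools(names).items() if path is None]


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If `missing_tools()` returns a non-empty list, the installation pages linked above are the place to start.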
2 changes: 1 addition & 1 deletion
Submodule test updated, 5 files:
+2 −2 langtests/README.md
+6 −3 unlvtests/README.md
+1 −2 unlvtests/runalltests.sh
+2 −3 unlvtests/runalltests_spa.sh
+2 −3 unlvtests/runtest...
Tesseract Open Source OCR Engine (main repository) - add missing commas · UB-Mannheim/tesseract@8b4284d
Tesseract Open Source OCR Engine (main repository) - Correct indefinite articles before vowels · UB-Mannheim/tesseract@7c17827
Tesseract Open Source OCR Engine (main repository) - Move bail_out function before libtoolize check · UB-Mannheim/tesseract@670672d
Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader) - UB-Mannheim/ocr-fileformat
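To make one of the listed formats concrete: hOCR is HTML in which each recognized word is usually a `span` with class `ocrx_word`, and its `title` attribute carries a `bbox x0 y0 x1 y1` entry. The sketch below reads words and bounding boxes using only the Python standard library. It is an illustration of the format, not the ocr-fileformat tool's own interface. The class `HocrWords`, the function `hocr_words`, and the sample markup are hypothetical, and the sketch assumes word spans contain no nested `span` elements.

```python
import re
from html.parser import HTMLParser

BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


class HocrWords(HTMLParser):
    """Collect (text, bbox) pairs from spans with class 'ocrx_word'."""

    def __init__(self):
        super().__init__()
        self.words = []     # list of (text, (x0, y0, x1, y1))
        self._bbox = None   # bbox of the word span currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "span" and "ocrx_word" in (attr.get("class") or "").split():
            match = BBOX_RE.search(attr.get("title") or "")
            # An empty tuple marks a word span without a usable bbox.
            self._bbox = tuple(int(g) for g in match.groups()) if match else ()
            self._text = []

    def handle_data(self, data):
        if self._bbox is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._bbox is not None:
            self.words.append(("".join(self._text).strip(), self._bbox))
            self._bbox = None


def hocr_words(markup):
    """Parse hOCR markup and return the list of (text, bbox) pairs."""
    parser = HocrWords()
    parser.feed(markup)
    parser.close()
    return parser.words


# Hypothetical sample, written by hand for this illustration.
SAMPLE = """<div class='ocr_page' title='bbox 0 0 800 600'>
<span class='ocr_line' title='bbox 10 10 300 40'>
<span class='ocrx_word' title='bbox 10 10 120 40; x_wconf 95'>Deutscher</span>
<span class='ocrx_word' title='bbox 130 10 300 40; x_wconf 91'>Reichsanzeiger</span>
</span></div>"""

if __name__ == "__main__":
    for text, bbox in hocr_words(SAMPLE):
        print(text, bbox)
```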
✓ Talk to tesseract
✓ Talk to ocropus
✓ Talk to abbyy
✓ Configuration files (for the whole process and every single ocr-engine)
✓ Implement cut method
✓ Create uniform output structure
✓ Create hocr-output
✓ Create logs with settings information...
</hbox> </groupbox> <groupbox> <checkbox preference="extensions.zotero.zoteroocr.outputNote" label="Save output as a note"/> <checkbox id="checkbox-zoteroocr-output-pdf" preference="extensions.zotero.zoteroocr.outputPDF" label="Save output as a PDF with text layer" oncommand="Zotero.OC...