To OCR only part of a PDF file, use the command-line option --pages to specify which pages to process. For example, for a file named "example.PDF"...
# Display a list of all Tesseract language packs
apt-cache search tesseract-ocr
# Debian/Ubuntu users
apt-get install tesseract-ocr-chi-sim  # Example: Install Chinese Simplified language pack
# Arch Linux users
pacman -S tesseract-data-eng tesseract-data-deu  # Example: Install the English and German language ...
You can then pass the -l LANG argument to OCRmyPDF to give a hint as to what languages it should search for....
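Before passing -l, it can help to check which language packs Tesseract actually sees. The following is a minimal sketch in Python, not part of OCRmyPDF. It assumes that `tesseract --list-langs` prints a header line ending in a colon followed by one language code per line. The helper names `parse_tesseract_langs` and `installed_tesseract_langs` are hypothetical.

```python
import subprocess
from typing import List


def parse_tesseract_langs(output: str) -> List[str]:
    """Extract language codes from the text printed by `tesseract --list-langs`.

    Assumption: the header line ends with a colon and every other
    non-blank line is a single language code.
    """
    langs = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and the header line
        if not line or line.endswith(":"):
            continue
        langs.append(line)
    return langs


def installed_tesseract_langs() -> List[str]:
    """Run Tesseract and return the language codes it reports (hypothetical helper)."""
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True,
        text=True,
        check=True,
    )
    # Assumption: some older Tesseract releases print the list on stderr,
    # so both streams are parsed.
    return parse_tesseract_langs(result.stdout + "\n" + result.stderr)
```

If `chi_sim` is missing from the returned list, installing the matching language pack first is the likely fix.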
Release 9.0.5.post4+gdf4a8fa2
CHAPTER 1: Introduction
OCRmyPDF is a Python 3 package that adds OCR layers to PDFs.
1.1 About OCR
Optical character recognition is technology that converts images of typed or handwritten text, such as in a scanned document, to computer text that can be searched and copied. OCRmyPDF uses Tesseract, the best available open source OCR...
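In addition to the required Python version (3.6+), OCRmyPDF requires external program installations of Ghostscript, Tesseract OCR, QPDF, and Leptonica. OCRmyPDF is pure Python, but uses CFFI to portably generate library bindings. OCRmyPDF works on pretty much everything: Linux, macOS, Windows and FreeBSD. Press and media: Going paperless with OCRmyPDF; Converting a scanned document into a compressed, searchable PDF with redactions; c't 1-...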
hansmi/baamhackl (3 stars): Execute command when files are moved to a directory. Topics: cli, golang, ocr, scanner, watchman, inotify, ocrmypdf. Updated May 2, 2025. Language: Go.
ocrmyPDF_Windows is inspired by jbarlow83's OCRmyPDF. https://github.com/jbarlow83/OCRmyPDF. ...
In addition to the required Python version, OCRmyPDF requires external program installations of Ghostscript and Tesseract OCR. OCRmyPDF is pure Python, and runs on pretty much everything: Linux, macOS, Windows and FreeBSD. OCRmyPDF would not be the software that it is today without companies...
OCRmyPDF originated as a command line program and continues to have this legacy, but parts of it can be imported and used in other Python applications. Some applications may want to consider running ocrmypdf from a subprocess call anyway, as this provides isolation of its activities. Example ...
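The subprocess approach described above can be sketched as follows. This is a minimal illustration in Python, not the example from the OCRmyPDF documentation. It assumes the `ocrmypdf` executable is on the PATH and uses only the -l and --pages options mentioned in these excerpts. The function names `build_ocrmypdf_command` and `run_ocrmypdf` are hypothetical.

```python
import subprocess
from typing import List, Optional


def build_ocrmypdf_command(
    input_pdf: str,
    output_pdf: str,
    language: Optional[str] = None,
    pages: Optional[str] = None,
) -> List[str]:
    """Assemble the argument list for an ocrmypdf command-line call."""
    cmd = ["ocrmypdf"]
    if language:
        # Language hint passed to Tesseract, e.g. "eng" or "chi_sim"
        cmd += ["-l", language]
    if pages:
        # Restrict OCR to the given pages, e.g. "1-3"
        cmd += ["--pages", pages]
    cmd += [input_pdf, output_pdf]
    return cmd


def run_ocrmypdf(
    input_pdf: str,
    output_pdf: str,
    language: Optional[str] = None,
    pages: Optional[str] = None,
) -> int:
    """Run ocrmypdf in a child process and return its exit code."""
    cmd = build_ocrmypdf_command(input_pdf, output_pdf, language, pages)
    # An argument list (no shell) avoids quoting problems with file names
    completed = subprocess.run(cmd, check=False)
    return completed.returncode
```

Running the tool in a child process keeps its memory use and any crash separate from the calling application, which is the isolation benefit noted above. The caller only has to inspect the returned exit code.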
In addition to the required Python version (3.6+), OCRmyPDF requires external program installations of Ghostscript, Tesseract OCR, QPDF, and Leptonica. OCRmyPDF is pure Python, but uses CFFI to portably generate library bindings. OCRmyPDF works on pretty much everything: Linux, macOS, ...