How to install Leptonica + Tesseract on Windows without Visual Studio to use in Anaconda?


Question

I want to perform text recognition on images, and I want to use Python. I have installed Anaconda. Now I want to install Tesseract, but I also need to install Leptonica. I did not find any clear instructions for doing this on Windows, and I do not want to install Visual Studio just to get Leptonica. Could anybody provide clear instructions for installing Leptonica and Tesseract on Windows, without Visual Studio, for use in Anaconda? Thanks.

Recommended answer

Here is a simple set of steps that, as of 04/22/2016, gets the tesseract 3.05 dev version working on both Windows 7 and Windows 8 machines:

1- Install tesseract using the executable installer from the official tesseract-ocr page (version 3.02 for Windows will suffice).

2- Download the following two files for the tesseract 3.05 dev version from http://domasofan.spdns.eu/tesseract/

There are two exe files:

  • tesseract-core-yyyymmdd.exe: the Tesseract core application, without language data
  • tesseract-langs-yyyymmdd.exe: all the language data available for Tesseract

(yyyymmdd is the build date: a 4-digit year, 2-digit month and 2-digit day.)

The app is portable, so you can install it on a USB stick or in any other location.

Sub-steps to install these:

  1. Download the tesseract-core and tesseract-langs packages.
  2. Double-click the tesseract-core package and extract it to a directory of your choice (for example, a new temporary folder called "Tess_temp").
  3. Double-click the tesseract-langs package and extract it to the \tessdata subfolder of that same "Tess_temp" directory. For example, if tesseract-core was extracted to c:\Tess_temp, tesseract-langs needs to go to c:\Tess_temp\tessdata.

Now copy everything in "Tess_temp" into the folder where tesseract 3.02 was installed in step 1 above (usually C:\Program Files (x86)\Tesseract-OCR), replacing the 3.02 files with the 3.05 ones.
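If you would rather script the copy from Anaconda than drag the files over by hand, it could look roughly like the sketch below. This is only an illustration: the function name overlay_tesseract is made up for this example, the two paths are the ones used in the steps above, and writing into Program Files normally requires a prompt started as administrator.

```python
import shutil
from pathlib import Path


def overlay_tesseract(src_dir, install_dir):
    """Copy everything under src_dir into install_dir.

    Files that already exist in install_dir are overwritten; files that
    exist only in install_dir are left alone. Returns the relative paths
    of all files found in install_dir afterwards.
    """
    src = Path(src_dir)
    dst = Path(install_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"source folder not found: {src}")
    # dirs_exist_ok=True (Python 3.8+) merges into an existing folder
    # instead of raising because the destination is already there.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return sorted(
        p.relative_to(dst).as_posix() for p in dst.rglob("*") if p.is_file()
    )


# Example call with the paths assumed in this answer (adjust to your machine):
# overlay_tesseract(r"C:\Tess_temp", r"C:\Program Files (x86)\Tesseract-OCR")
```

The returned list makes it easy to confirm that tesseract.exe and the tessdata files ended up in the install folder.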

Tesseract 3.05 should now work on Windows. Copy a sample image test.png (containing some text) into this Tesseract-OCR folder, open a cmd window, and type the following commands:

Go to the tesseract folder: cd "C:\Program Files (x86)\Tesseract-OCR"

Run tesseract on test.png: tesseract -l eng test.png test_text -psm 6

It will print:

Tesseract Open Source OCR Engine v3.05.00dev with Leptonica

Congratulations! (Check test_text.txt for the extracted text.)
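Since the goal is to use this from Python in Anaconda, one simple option is to call the same command line through the standard subprocess module. The sketch below is a minimal illustration, not an official binding: build_command and run_ocr are names invented for this example, it assumes tesseract can be found on PATH (otherwise pass the full path to tesseract.exe as exe), and it uses the -psm spelling from the command above, which newer Tesseract releases write as --psm.

```python
import subprocess
from pathlib import Path


def build_command(image, output_base, lang="eng", psm=6, exe="tesseract"):
    """Build the same argument list as the command typed in cmd above."""
    return [str(exe), "-l", lang, str(image), str(output_base), "-psm", str(psm)]


def run_ocr(image, output_base, **kwargs):
    """Run tesseract and return the recognized text.

    Tesseract writes its result to <output_base>.txt, so that file is
    read back after the process finishes.
    """
    cmd = build_command(image, output_base, **kwargs)
    # check=True raises CalledProcessError if tesseract exits with an error.
    subprocess.run(cmd, check=True)
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")


# Example call, assuming test.png is in the current folder:
# text = run_ocr("test.png", "test_text")
# print(text)
```

Passing the arguments as a list avoids quoting problems with the space in "Program Files (x86)" when a full path is given for exe.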
