Android OCR detecting digits only using popular Tesseract fork tess-two


Question

I'm using tess-two (https://github.com/rmtheis/tess-two), the popular Tesseract OCR fork for Android. I integrated all the stuff and it works.

But I need to detect only digits. My code for now is:

TessBaseAPI baseApi = new TessBaseAPI();
baseApi.init(pathToLngFile, langName);
baseApi.setImage(bitmap);
String recognizedText = baseApi.getUTF8Text();
baseApi.end();
doSomething(recognizedText); 

From here: https://code.google.com/p/tesseract-ocr/wiki/FAQ#How_do_I_recognize_only_digits?

I'm using version 3, and for that version there is no code solution, only a command-line one, which is not relevant for an Android project (I think...). So I tried to implement the solution for versions before 3 and added this line:

baseApi.setVariable("tessedit_char_whitelist", "0123456789");

My question is what to do with init(). I don't need any language, but I still need to initialize, and there is no init() method that works without one...

More specifically

My end goal is a plain document (not a pure Excel sheet) that looks like the attached picture (a header and 3 columns separated by white space).

My requirement is to make sense of the digits: to be able to separate them and determine which digits belong to which row and column.

Thanks

Recommended answer

I wanted to do the same, and after a bit of research I decided to capture everything, text and numbers, and then keep just the numbers. This is working for me:

// Replaces every run of characters other than the digits 0 to 9 with a single space
recognizedText = recognizedText.replaceAll("[^0-9]+", " ");

And now you can do whatever you want with the numbers.

For example, I use this code to get all the numbers separated into a String array and show them in a TextView:

String[] justnumbers = recognizedText.trim().split(" "); // Trims leading/trailing spaces and splits the numbers
YourTextView.setText(Arrays.toString(justnumbers).replaceAll("\\[|\\]", "")); // Shows the numbers in the TextView without the "[]" of the array (needs import java.util.Arrays)

You can see it here.

Hope this helps.
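
The question also asks which digits belong to which row and column. Applying the replacement to the whole text collapses line breaks too, so that information is lost. One possible variation is to apply the same regex line by line. The following is only a minimal sketch in plain Java, under these assumptions: the recognized text keeps one text line per document row, columns are separated by non-digit characters, and the class name DigitGrid, the method parse and the sample string are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class DigitGrid {

    // Returns one list per text line that contains digits; each inner list
    // holds the digit runs found on that line, in left-to-right order.
    static List<List<String>> parse(String ocrText) {
        List<List<String>> rows = new ArrayList<>();
        for (String line : ocrText.split("\\r?\\n")) {
            // Same replacement as in the answer, applied to a single line.
            String digitsOnly = line.replaceAll("[^0-9]+", " ").trim();
            if (digitsOnly.isEmpty()) {
                continue; // header or blank line: nothing to keep
            }
            List<String> columns = new ArrayList<>();
            for (String token : digitsOnly.split(" ")) {
                columns.add(token);
            }
            rows.add(columns);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical OCR output: a header line followed by two rows of 3 columns.
        String sample = "ID Qty Price\n12 3 450\n7 10 99\n";
        System.out.println(parse(sample)); // prints [[12, 3, 450], [7, 10, 99]]
    }
}
```

Note that this treats every non-digit character as a column separator, so a value such as 4.50 would be split into two entries; a real document may need a stricter separator.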
