Tesseract OCR returns an empty string


Question

I am building an OCR app for Android using the Tesseract OCR engine. Every time I run the engine on a photo, it returns empty text. This is my code:

public String detectText(Bitmap bitmap) {
    TessBaseAPI tessBaseAPI = new TessBaseAPI();
    String mDataDir = setTessData();
    tessBaseAPI.setDebug(true);
    tessBaseAPI.init(mDataDir + File.separator, "eng");
    tessBaseAPI.setImage(bitmap);
    tessBaseAPI.setPageSegMode(TessBaseAPI.OEM_TESSERACT_ONLY);
    String text = tessBaseAPI.getUTF8Text();

    tessBaseAPI.end();

    return text;
}

private String setTessData(){
    String mDataDir = this.getExternalFilesDir("data").getAbsolutePath();
    String mTrainedDataPath = mDataDir + File.separator + "tessdata";
    String mLang = "eng";
    // Checking if language file already exist inside data folder
    File dir = new File(mTrainedDataPath);
    if (!dir.exists()) {
        if (!dir.mkdirs()) {
            //showDialogFragment(SD_ERR_DIALOG, "sd_err_dialog");
        } else {
        }
    }

    if (!(new File(mTrainedDataPath + File.separator + mLang + ".traineddata")).exists()) {

        // If English or Hebrew, we just copy the file from assets
        if (mLang.equals("eng") || mLang.equals("heb")){
            try {
                AssetManager assetManager = context.getAssets();
                InputStream in = assetManager.open(mLang + ".traineddata");
                OutputStream out = new FileOutputStream(mTrainedDataPath + File.separator + mLang + ".traineddata");
                copyFile(in, out);
                //Toast.makeText(context, getString(R.string.selected_language) + " " + mLangArray[mLangID], Toast.LENGTH_SHORT).show();
                //Log.v(TAG, "Copied " + mLang + " traineddata");
            } catch (IOException e) {
                //showDialogFragment(SD_ERR_DIALOG, "sd_err_dialog");
            }
        }

        else{

            // Checking if Network is available
            if (!isNetworkAvailable(this)){
                //showDialogFragment(NETWORK_ERR_DIALOG, "network_err_dialog");
            }
            else {
                // Shows a dialog with File dimension. When user click on OK download starts. If he press Cancel revert to english language (like NETWORK ERROR)
                //showDialogFragment(CONTINUE_DIALOG, "continue_dialog");
            }
        }
    }
    else {
        //Toast.makeText(mThis, getString(R.string.selected_language) + " " + mLangArray[mLangID], Toast.LENGTH_SHORT).show();
    }
    return mDataDir;
}

I have debugged it many times, and the bitmap is passed correctly to the detectText method. The language data files (tessdata) exist on the phone, and the path to them is also correct.

Does anybody know what the problem is here?

Recommended answer

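You are passing an OCR engine mode constant to setPageSegMode() in your detectText() method. OEM_TESSERACT_ONLY has the value 0, the same as PageSegMode.PSM_OSD_ONLY, so the engine is asked to perform orientation and script detection only, which would explain the empty text.

public String detectText(Bitmap bitmap) {
    ...
    tessBaseAPI.setPageSegMode(TessBaseAPI.OEM_TESSERACT_ONLY);
    ...
}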
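
Because both families of constants are plain int values, the compiler cannot catch this kind of mix-up. The following plain-Java sketch illustrates the collision without any Android dependency. The class PsmCollisionDemo and the helper segModeName are hypothetical names used only for illustration, and the value of OEM_TESSERACT_ONLY is assumed to be 0, as in tess-two.

```java
/** Plain-Java sketch: why an engine-mode constant is accepted as a page segmentation mode. */
final class PsmCollisionDemo {
    // Assumption: mirrors TessBaseAPI.OEM_TESSERACT_ONLY, taken to be 0 as in tess-two.
    static final int OEM_TESSERACT_ONLY = 0;

    // Values mirrored from the PageSegMode class in TessBaseAPI.java.
    static final int PSM_OSD_ONLY = 0;
    static final int PSM_AUTO = 3;

    /** Returns the PageSegMode name that a raw int argument is interpreted as. */
    static String segModeName(int mode) {
        switch (mode) {
            case PSM_OSD_ONLY:
                return "PSM_OSD_ONLY";
            case PSM_AUTO:
                return "PSM_AUTO";
            default:
                return "other (" + mode + ")";
        }
    }

    public static void main(String[] args) {
        // The engine-mode constant is silently read as "OSD only".
        System.out.println(segModeName(OEM_TESSERACT_ONLY)); // prints "PSM_OSD_ONLY"
        System.out.println(segModeName(PSM_AUTO));           // prints "PSM_AUTO"
    }
}
```

Referring to the constants through their owning class (TessBaseAPI.PageSegMode.PSM_AUTO rather than a bare TessBaseAPI constant) makes this kind of mistake easier to spot when reading the code.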

Setting a page segmentation mode that matches the type of image you are processing will help the engine detect the characters.

For example:

tessBaseAPI.setPageSegMode(TessBaseAPI.PageSegMode.PSM_AUTO);

The other page segmentation values are defined in TessBaseAPI.java:

/** Page segmentation mode. */
public static final class PageSegMode {
    /** Orientation and script detection only. */
    public static final int PSM_OSD_ONLY = 0;

    /** Automatic page segmentation with orientation and script detection. (OSD) */
    public static final int PSM_AUTO_OSD = 1;

    /** Fully automatic page segmentation, but no OSD, or OCR. */
    public static final int PSM_AUTO_ONLY = 2;

    /** Fully automatic page segmentation, but no OSD. */
    public static final int PSM_AUTO = 3;

    /** Assume a single column of text of variable sizes. */
    public static final int PSM_SINGLE_COLUMN = 4;

    /** Assume a single uniform block of vertically aligned text. */
    public static final int PSM_SINGLE_BLOCK_VERT_TEXT = 5;

    /** Assume a single uniform block of text. (Default.) */
    public static final int PSM_SINGLE_BLOCK = 6;

    /** Treat the image as a single text line. */
    public static final int PSM_SINGLE_LINE = 7;

    /** Treat the image as a single word. */
    public static final int PSM_SINGLE_WORD = 8;

    /** Treat the image as a single word in a circle. */
    public static final int PSM_CIRCLE_WORD = 9;

    /** Treat the image as a single character. */
    public static final int PSM_SINGLE_CHAR = 10;

    /** Find as much text as possible in no particular order. */
    public static final int PSM_SPARSE_TEXT = 11;

    /** Sparse text with orientation and script detection. */
    public static final int PSM_SPARSE_TEXT_OSD = 12;

    /** Number of enum entries. */
    public static final int PSM_COUNT = 13;
}

You can experiment with different page segmentation values and see which gives the best result.
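
One way to run that experiment is to try a list of modes in order and keep the first one that yields non-blank text. The sketch below is a minimal, Android-free illustration: PsmExperiment and firstModeWithText are hypothetical names, and the recognizer is passed in as a function so the helper can be exercised without the OCR engine. In an app, the recognizer would presumably call setPageSegMode(mode) followed by getUTF8Text() on an initialized TessBaseAPI with the image already set.

```java
import java.util.function.IntFunction;

/** Sketch: pick the first page segmentation mode that produces any text. */
final class PsmExperiment {

    /**
     * Tries each mode in order and returns the first one whose result is
     * non-null and not blank, or -1 if every mode yields empty text.
     *
     * @param modes     candidate page segmentation mode values, in priority order
     * @param recognize hypothetical callback that runs OCR with the given mode
     */
    static int firstModeWithText(int[] modes, IntFunction<String> recognize) {
        for (int mode : modes) {
            String text = recognize.apply(mode);
            if (text != null && !text.trim().isEmpty()) {
                return mode;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Stand-in recognizer: pretends only mode 3 (PSM_AUTO) finds text.
        IntFunction<String> fake = mode -> mode == 3 ? "hello" : "";
        System.out.println(firstModeWithText(new int[] {0, 6, 3}, fake)); // prints 3
    }
}
```

Returning the first mode with any output is a deliberately crude criterion; comparing the recognized text against a known sample would give a better measure of which mode suits a given kind of image.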
