How to use Google Cloud Vision with Unity to recognise text using a mobile camera?

Problem description

I am testing a project on reading text from objects and pictures using Google Cloud Vision. Using a mobile camera (preferably an iPhone or iPad, or an Android phone), I would like to get the required text. The Samsung Bixby application is an example. After some reading I found out about OpenCV for Unity and Google Cloud Vision. OpenCV for Unity costs around $95, so I cannot use it for testing, and I took the other option.

I downloaded this project (GitHub project). I created a Google Cloud Vision API key and added it in the inspector. I set the feature type option to text detection. When I made an iOS build, the camera was on but the image looked inverted, and nothing else happened. I also see a missing script in the inspector. How can I detect text using the device camera?

Solution

The Unity Cloud Vision git repo contains code for face detection. It is not suitable for OCR or text detection.

So I wrote a script that performs text detection on images using the Vision OCR API in Unity3D.

You can try the following script to detect text from an image in Unity3D.

using UnityEngine;
using System.Collections;
using System.Collections.Generic;
using UnityEngine.UI;
using SimpleJSON;

public class WebCamTextureToCloudVision : MonoBehaviour {

    public string url = "https://vision.googleapis.com/v1/images:annotate?key=";
    public string apiKey = ""; //Put your google cloud vision api key here
    public float captureIntervalSeconds = 5.0f;
    public int requestedWidth = 640;
    public int requestedHeight = 480;
    public FeatureType featureType = FeatureType.TEXT_DETECTION;
    public int maxResults = 10;
    public GameObject resPanel;
    public Text responseText, responseArray; 

    WebCamTexture webcamTexture;
    Texture2D texture2D;
    Dictionary<string, string> headers;

    [System.Serializable]
    public class AnnotateImageRequests {
        public List<AnnotateImageRequest> requests;
    }

    [System.Serializable]
    public class AnnotateImageRequest {
        public Image image;
        public List<Feature> features;
    }

    [System.Serializable]
    public class Image {
        public string content;
    }

    [System.Serializable]
    public class Feature {
        public string type;
        public int maxResults;
    }

    public enum FeatureType {
        TYPE_UNSPECIFIED,
        FACE_DETECTION,
        LANDMARK_DETECTION,
        LOGO_DETECTION,
        LABEL_DETECTION,
        TEXT_DETECTION,
        SAFE_SEARCH_DETECTION,
        IMAGE_PROPERTIES
    }

    // Use this for initialization
    void Start () {
        headers = new Dictionary<string, string>();
        headers.Add("Content-Type", "application/json; charset=UTF-8");

        if (apiKey == null || apiKey == "")
            Debug.LogError("No API key. Please set your API key into the \"Web Cam Texture To Cloud Vision(Script)\" component.");
        
        WebCamDevice[] devices = WebCamTexture.devices;
        for (var i = 0; i < devices.Length; i++) {
            Debug.Log (devices [i].name);
        }
        if (devices.Length > 0) {
            webcamTexture = new WebCamTexture(devices[0].name, requestedWidth, requestedHeight);
            Renderer r = GetComponent<Renderer> ();
            if (r != null) {
                Material m = r.material;
                if (m != null) {
                    m.mainTexture = webcamTexture;
                }
            }
            webcamTexture.Play();
            StartCoroutine(Capture());
        }   
    }
    
    // Update is called once per frame
    void Update () {

    }

    private IEnumerator Capture() {
        while (true) {
            yield return new WaitForSeconds(captureIntervalSeconds);

            // Skip this cycle if no API key is set or the camera has not delivered a frame yet
            if (string.IsNullOrEmpty(this.apiKey))
                continue;

            Color[] pixels = webcamTexture.GetPixels();
            if (pixels.Length == 0)
                continue;
            if (texture2D == null || webcamTexture.width != texture2D.width || webcamTexture.height != texture2D.height) {
                texture2D = new Texture2D(webcamTexture.width, webcamTexture.height, TextureFormat.RGBA32, false);
            }

            texture2D.SetPixels(pixels);
            // texture2D.Apply(false); // Not required, because the texture does not need to be uploaded to the GPU
            byte[] jpg = texture2D.EncodeToJPG();
            string base64 = System.Convert.ToBase64String(jpg);
// #if UNITY_WEBGL  
//          Application.ExternalCall("post", this.gameObject.name, "OnSuccessFromBrowser", "OnErrorFromBrowser", this.url + this.apiKey, base64, this.featureType.ToString(), this.maxResults);
// #else
            
            AnnotateImageRequests requests = new AnnotateImageRequests();
            requests.requests = new List<AnnotateImageRequest>();

            AnnotateImageRequest request = new AnnotateImageRequest();
            request.image = new Image();
            request.image.content = base64;
            request.features = new List<Feature>();
            Feature feature = new Feature();
            feature.type = this.featureType.ToString();
            feature.maxResults = this.maxResults;
            request.features.Add(feature); 
            requests.requests.Add(request);

            string jsonData = JsonUtility.ToJson(requests, false);
            if (jsonData != string.Empty) {
                string url = this.url + this.apiKey;
                byte[] postData = System.Text.Encoding.UTF8.GetBytes(jsonData); // matches the charset declared in the Content-Type header
                using(WWW www = new WWW(url, postData, headers)) {
                    yield return www;
                    if (string.IsNullOrEmpty(www.error)) {
                        string responses = www.text; // keep whitespace: stripping spaces would also remove them from the recognised text
                        // Debug.Log(responses);
                        JSONNode res = JSON.Parse(responses);
                        string fullText = res["responses"][0]["textAnnotations"][0]["description"].ToString().Trim('"');
                        if (fullText != ""){
                            Debug.Log("OCR Response: " + fullText);
                            resPanel.SetActive(true);
                            responseText.text = fullText.Replace("\\n", " ");
                            fullText = fullText.Replace("\\n", ";");
                            string[] texts = fullText.Split(';');
                            responseArray.text = "";
                            for(int i=0;i<texts.Length;i++){
                                responseArray.text += texts[i];
                                if(i != texts.Length - 1)
                                    responseArray.text += ", ";
                            }
                        }
                    } else {
                        Debug.Log("Error: " + www.error);
                    }
                }
            }
// #endif
        }
    }

#if UNITY_WEBGL
    void OnSuccessFromBrowser(string jsonString) {
        Debug.Log(jsonString);  
    }

    void OnErrorFromBrowser(string jsonString) {
        Debug.Log(jsonString);  
    }
#endif

}
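
For TEXT_DETECTION, the first entry of textAnnotations holds the whole detected text in its description, with lines separated by newline characters. The script above splits the escaped form of that string by hand. As a minimal sketch, the same step can be written as a small helper. The class and method names below are hypothetical, and the helper assumes the description has already been unescaped, so that it contains real newline characters instead of the two-character sequence \n that the script matches.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original script.
public static class OcrTextHelper
{
    // Splits an unescaped OCR description into trimmed, non-empty lines.
    public static List<string> SplitLines(string description)
    {
        var lines = new List<string>();
        if (string.IsNullOrEmpty(description))
            return lines;

        foreach (string raw in description.Split('\n'))
        {
            string line = raw.Trim();
            // The description usually ends with a newline, which would
            // otherwise produce an empty trailing entry.
            if (line.Length > 0)
                lines.Add(line);
        }
        return lines;
    }
}
```

With this in place, string.Join(", ", OcrTextHelper.SplitLines(text)) could replace the manual loop that fills responseArray.text, and it also drops the empty entry caused by a trailing newline.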

The demo project is available on GitHub: codemaker2015/google-cloud-vision-api-ocr-unity3d-demo
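
The question also mentions that the camera image looks inverted on iOS, which the script above does not handle. A possible approach, shown here only as a sketch, is to read the videoRotationAngle and videoVerticallyMirrored properties of WebCamTexture and adjust the preview accordingly. The component name and the preview field are assumptions for illustration, and the sketch displays the feed on a UI RawImage instead of the Renderer used by the script.

```csharp
using UnityEngine;
using UnityEngine.UI;

// Hypothetical standalone component: shows the camera feed on a RawImage
// and corrects its orientation every frame.
public class WebCamPreviewOrientation : MonoBehaviour
{
    public RawImage preview; // assumed UI element that displays the feed

    WebCamTexture webcamTexture;

    void Start()
    {
        if (WebCamTexture.devices.Length == 0)
            return;

        webcamTexture = new WebCamTexture();
        preview.texture = webcamTexture;
        webcamTexture.Play();
    }

    void Update()
    {
        if (webcamTexture == null || !webcamTexture.isPlaying)
            return;

        // videoRotationAngle is the clockwise angle needed to show the image
        // upright; a negative Z rotation is clockwise for a UI element.
        preview.rectTransform.localEulerAngles =
            new Vector3(0f, 0f, -webcamTexture.videoRotationAngle);

        // Flip the UVs vertically when the platform reports a mirrored image.
        preview.uvRect = webcamTexture.videoVerticallyMirrored
            ? new Rect(0f, 1f, 1f, -1f)
            : new Rect(0f, 0f, 1f, 1f);
    }
}
```

This only corrects what is displayed. The pixels sent to the Vision API are unchanged, so if recognition suffers on a rotated frame the image itself would need rotating before encoding. When combining this with the script above, the same two properties would be read from its existing webcamTexture instead of opening a second camera texture.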
