Classifying image documents using machine learning

Problem description

I want to classify image documents (such as passports and driving licences) using machine learning. Does anybody have links or documentation that would give me an idea of how to approach this task?

My current plan is to first convert each document to text and then extract the information from the text file. However, I can only do this with one file at a time. I want to know how to do this across millions of documents.

Recommended answer

You don't need to convert the documents to text; you can classify the images directly.

For image classification, you can build a basic CNN with the Keras library.

https://towardsdatascience.com/building-a-convolutional-neural-network-cnn-in-keras-329fbbadc5f5
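As a rough illustration of what such a basic CNN might look like, here is a minimal sketch using tf.keras. The function name build_basic_cnn, the layer sizes, and the default input shape are assumptions chosen for illustration; they do not come from the linked tutorial. The sketch also assumes pixel values have already been scaled to the range [0, 1] and that labels are integer class indices.

```python
from tensorflow.keras import layers, models


def build_basic_cnn(num_classes, input_shape=(224, 224, 3)):
    """Build a small CNN image classifier (hypothetical helper, illustrative sizes)."""
    model = models.Sequential([
        # Assumes images are resized to input_shape and scaled to [0, 1] beforehand.
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        # Average each feature map to one value to keep the parameter count small.
        layers.GlobalAveragePooling2D(),
        layers.Dense(64, activation="relu"),
        # One output per document class (passport, driving licence, ...).
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        # Assumes integer labels such as 0 = passport, 1 = driving licence.
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Calling build_basic_cnn(num_classes=2) would give a model for two document types, which could then be trained with model.fit on batches of labelled images.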

This basic CNN is enough to train an image classifier. If you want state-of-the-art accuracy, however, I recommend taking a pretrained ResNet50 and training it on your own data to build the classifier. Besides accuracy, a pretrained network has another major advantage: you need less data to train a robust image classifier.

https://engmrk.com/kerasapplication-pre-trained-model/?utm_campaign=News&utm_medium=Community&utm_source=DataCamp.com

The only thing you need to change is the number of output classes, from 1000 to the number of classes you want.
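One common way to make that change in tf.keras is to load ResNet50 without its original 1000-class ImageNet head and attach a new output layer sized for your own classes. The sketch below is one possible version, not necessarily what the linked article does: the function name build_resnet50_classifier, the choice to freeze the pretrained layers, and the default input shape are assumptions made for illustration. It also assumes inputs are RGB images already passed through tf.keras.applications.resnet50.preprocess_input and that labels are integer class indices.

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import ResNet50


def build_resnet50_classifier(num_classes, weights="imagenet",
                              input_shape=(224, 224, 3)):
    """Pretrained ResNet50 with a new output layer (hypothetical helper)."""
    # include_top=False drops the original 1000-class ImageNet output layer;
    # pooling="avg" makes the base return one feature vector per image.
    base = ResNet50(weights=weights, include_top=False,
                    input_shape=input_shape, pooling="avg")
    # Assumption: freeze the pretrained layers and train only the new head.
    # Setting base.trainable = True later would allow fine-tuning.
    base.trainable = False

    inputs = layers.Input(shape=input_shape)
    features = base(inputs, training=False)
    # New output layer: num_classes units instead of 1000.
    outputs = layers.Dense(num_classes, activation="softmax")(features)

    model = models.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # assumes integer labels
        metrics=["accuracy"],
    )
    return model
```

With weights="imagenet" the pretrained weights are downloaded on first use. Passing weights=None builds the same architecture with random initialisation, which is useful only for checking that the shapes are right.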
