Looking for a little Python machine learning advice

Views: 89 Posted: 2020/5/4 9:43:25 Tags: python machine-learning image-recognition text-recognition


Question

I'm interested in dabbling in Python and machine learning/automatic data entry. However, as my research has progressed, I've realised there are so many different techniques, each with their own strengths.

I've decided I might get further if I learn in the opposite direction, i.e. pick a problem/task and learn by solving/completing it.

I occasionally have to process data from invoices that are faxed, and I'm hoping to make a program that can enter these for me once I've scanned them in.

The faxes basically consist of 2 identical tables. Each row denotes a separate worker. The 1st column is the worker's name (a choice of 6), the 2nd is an address, and the rest of the columns are tick boxes which denote different jobs. There is also an invoice ID in a box at the top of the page.
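To make the layout concrete, here is a minimal sketch of how one parsed fax could be represented in Python. Everything named here is hypothetical: the post does not give the six worker names, so `KNOWN_WORKERS` holds placeholders, and `WorkerRow`, `Invoice` and `snap_to_known_name` are illustrative names only. Because the name column is a closed set of six, a noisy OCR reading could simply be snapped to the nearest known name with the standard library's `difflib`, rather than recognised from scratch.

```python
from dataclasses import dataclass, field
from difflib import get_close_matches
from typing import Dict, List, Optional

# Placeholder names: the post only says there are six possible workers.
KNOWN_WORKERS = ["Alice", "Bob", "Carol", "Dave", "Erin", "Frank"]


@dataclass
class WorkerRow:
    """One table row: a worker, an address, and one flag per tick-box column."""
    name: str
    address: str
    jobs: Dict[str, bool] = field(default_factory=dict)


@dataclass
class Invoice:
    """One fax: the ID from the box at the top plus the rows of both tables."""
    invoice_id: str
    rows: List[WorkerRow] = field(default_factory=list)


def snap_to_known_name(raw: str, cutoff: float = 0.6) -> Optional[str]:
    """Map a noisy OCR string to the closest known worker name, or None.

    The cutoff of 0.6 is difflib's default similarity threshold, used here
    as an assumption rather than a tuned value.
    """
    by_lower = {w.lower(): w for w in KNOWN_WORKERS}
    matches = get_close_matches(raw.strip().lower(), list(by_lower), n=1, cutoff=cutoff)
    return by_lower[matches[0]] if matches else None


if __name__ == "__main__":
    row = WorkerRow(name=snap_to_known_name("A1ice") or "?", address="1 Example Road")
    row.jobs["cleaned"] = True
    print(Invoice(invoice_id="12345", rows=[row]))
```

The point of the sketch is only that the output of the whole pipeline is small and fixed in shape, so each cell of the table can be handled as its own narrow problem.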

I'm hoping someone can briefly explain how they would go about this. Would they use an SVM for text recognition, or another technique? And how could you make a program understand that a tick in the 5th box along means 'cleaned=yes' and that the number in the top-left box is the ID? I've done a bit of research but can't get my head around how to start. How is it possible to isolate parts of a fax, e.g. the top table and its cells, from the rest of the page when you can't guarantee absolute placement/size because of the faxing/scanning? Or do I have to collect hundreds of faxes plus the typed-up data for them, compare them, and let the program slowly learn for itself that the difference between fax A and fax B is a tick here, and that the ID number is usually there...?
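As a rough illustration of the tick-box part of the question, the sketch below assumes the hard step (finding the table and cropping each cell, with the printed border excluded) has already been done, and only shows how a fixed column order could turn "tick in the 5th box" into `cleaned=yes`. It decides whether a box is ticked by the fraction of dark pixels, which is a simple heuristic and not a learned model. The job names other than `cleaned`, the function names, and both thresholds are assumptions made for the example.

```python
from typing import Dict, Sequence

# Column order of the tick boxes. Only "cleaned" (5th box) comes from the
# post; the other names are placeholders.
JOB_COLUMNS = ["job_1", "job_2", "job_3", "job_4", "cleaned"]

# A cropped cell as rows of greyscale pixels: 0 = black, 255 = white.
Cell = Sequence[Sequence[int]]


def dark_fraction(cell: Cell, dark_below: int = 128) -> float:
    """Return the share of pixels darker than `dark_below` (assumed threshold)."""
    total = sum(len(row) for row in cell)
    if total == 0:
        return 0.0
    dark = sum(1 for row in cell for px in row if px < dark_below)
    return dark / total


def is_ticked(cell: Cell, min_dark: float = 0.08) -> bool:
    """Treat a box as ticked when enough of it is ink (assumed 8% cut-off)."""
    return dark_fraction(cell) >= min_dark


def read_jobs(cells: Sequence[Cell]) -> Dict[str, bool]:
    """Pair each tick-box cell with its job name by position in the row."""
    return {job: is_ticked(cell) for job, cell in zip(JOB_COLUMNS, cells)}


if __name__ == "__main__":
    blank = [[255] * 10 for _ in range(10)]
    ticked = [[0 if x == y else 255 for x in range(10)] for y in range(10)]
    print(read_jobs([blank, blank, blank, blank, ticked]))
```

With a fixed form like this, the meaning of each box comes from its position in the row, so nothing has to be learned about what a tick "means"; a handful of scanned samples would mainly be needed to pick sensible thresholds.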

Any advice is welcome!
