Convert text to hardcopy code


Question




Hi,
I would like to convert text files into hardcopy and then convert them back by scanning (not using regular OCR).
The best way I could think of is converting the data into some kind of code (like QR Code or Data Matrix) and spreading it all over the A4 sheets. I don't want to use an ocx/dll for these codes, or very big source files.

Is there an EASY way to create such a code that will do the work, assuming no rotation or scaling corrections are needed and no compression?

I need a simple algorithm.

Thanks.

Recommended answer

Thank you for the clarifications (see the comments to the question). Okay, I must say, the assignment is far from simple. I don't really believe in the adequacy of the teacher: if you can give such challenging assignments to high school students, don't teach stupidities such as VB, teach some serious programming. (Before answering this question, I checked the Israel education system information to get a confirmation for myself: "high school" really means students of grades 10-12, typically 15 to 18 years old. It means people who have not started advanced mathematics and have only learned school algebra and geometry, maybe the basics of calculus, without the level of strictness accepted in this field.)

The problem is quite solvable, but… The main goal of the solution is to make the recognition of the "hardcopy" image more or less reliable. Of course, printing the text and, hence, OCR, would be out of the question. I would apply something similar to a kind of bar code. A library for one of the standard bar codes could be used, but I have no idea who would write such a library for "VB" or "Matlab". However, for many platforms and technologies one would be easy to find. For .NET, in particular, one could find a lot of solutions.

So, I would try to create a very simplified graphical code. Something like this: 1) draw a thick black rectangular frame, to define the scale and aspect ratio of the coded area; 2) subdivide the area inside the frame into several rows, and each row into N areas, each representing a single byte (so 8 * N bit areas per row); 3) subdivide the area of each byte into 8 areas representing bits; 4) serialize the text into a series of bytes and then bits, and for each set bit, paint the bit area black.
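The four steps above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the answer: the function names, UTF-8 as the text encoding, zero padding of the last row, most-significant-bit-first ordering, and the cell and frame sizes are all made up for the example.

```python
def text_to_bit_rows(text, bytes_per_row=4):
    """Serialize text into rows of bits, N bytes (8 * N bits) per row."""
    data = text.encode("utf-8")               # assumed encoding
    data += bytes((-len(data)) % bytes_per_row)  # pad last row with zero bytes
    rows = []
    for start in range(0, len(data), bytes_per_row):
        row = []
        for byte in data[start:start + bytes_per_row]:
            # Most significant bit first (an arbitrary but fixed convention).
            row.extend((byte >> shift) & 1 for shift in range(7, -1, -1))
        rows.append(row)
    return rows


def render_pixels(rows, cell=8, frame=8):
    """Return a 2D list of pixels: 1 is black, 0 is white.

    A solid black frame of `frame` pixels surrounds the grid, and every
    set bit becomes a black square of `cell` x `cell` pixels.
    """
    width = len(rows[0]) * cell + 2 * frame
    height = len(rows) * cell + 2 * frame
    pixels = []
    for y in range(height):
        line = []
        for x in range(width):
            inside = frame <= x < width - frame and frame <= y < height - frame
            if inside:
                line.append(rows[(y - frame) // cell][(x - frame) // cell])
            else:
                line.append(1)  # the frame itself
        pixels.append(line)
    return pixels


if __name__ == "__main__":
    for line in render_pixels(text_to_bit_rows("Hi", bytes_per_row=2), cell=1, frame=1):
        print("".join("#" if p else "." for p in line))
```

The pixel matrix can then be written out with whatever image or printing facility is available; that part is left out on purpose.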

The recognition should first determine the rectangle of the code area, using the frame. This is the most challenging part. In fact, this part could be skipped. One could assume that the user of this software can take the scanner, align the page very accurately, pre-scan it, manually select the area of the code precisely, and finally scan only the rectangular area where the code is. But a quality algorithm should determine the code area automatically. I think that would be a very advanced requirement and I would not insist on it (after all, the assignment did not demand it strictly). An even more challenging requirement would be handling the cases where the paper is not accurately aligned. Commercial software, to be successful, would certainly need this feature.

Now, as to recognition of the bits: my idea is based on the fact that the location of each rectangular bit area is known in advance from the size of the frame (or the total size of all the code). The aspect ratio of the bit areas and the number of bytes per row (N, see above) should be used as preexisting knowledge, part of the code conventions. The number of rows (N-byte fragments) should be determined during recognition. One useful option would be to reserve the first row to declare the length of the coded text, as an unsigned byte (or other unsigned integer).
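As a rough sketch of that geometry in Python: the cell size follows from the frame's inner width and N, the row count from the inner height, and the first row carries the length. The function names, the big-endian N-byte length field, and the definition of aspect ratio as height divided by width are assumptions made for the example, not part of the answer.

```python
def grid_geometry(inner_width, inner_height, bytes_per_row, cell_aspect=1.0):
    """Derive the grid from the frame's inner size and the agreed conventions.

    cell_aspect is the bit cell's height divided by its width (assumed
    definition). Returns (cols, rows, cell_w, cell_h).
    """
    cols = 8 * bytes_per_row
    cell_w = inner_width / cols
    cell_h = cell_w * cell_aspect
    rows = round(inner_height / cell_h)  # row count is found at scan time
    return cols, rows, cell_w, cell_h


def cell_box(row, col, cell_w, cell_h):
    """Bounding box (x0, y0, x1, y1) of one bit cell, relative to the frame's inner corner."""
    return (col * cell_w, row * cell_h, (col + 1) * cell_w, (row + 1) * cell_h)


def add_length_header(data, bytes_per_row):
    """Reserve the first row for the payload length (unsigned, big-endian)."""
    # Raises OverflowError if the length does not fit in bytes_per_row bytes.
    return len(data).to_bytes(bytes_per_row, "big") + data


def strip_length_header(payload, bytes_per_row):
    """Read the length from the first row and drop any trailing padding."""
    length = int.from_bytes(payload[:bytes_per_row], "big")
    return payload[bytes_per_row:bytes_per_row + length]
```

With a one-byte first row the text is limited to 255 bytes, so a wider row (or a wider length field) is the more practical choice for a full A4 page.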

This way, the recognition of each byte would not involve searching for its location. It would be enough just to determine the color. The decision should never be based on a single pixel. The solution should take a set of pixels inside the expected bit area, average all those pixels and tell white from black. For example, in a single-byte-per-pixel bitmap encoding, one should compare the average pixel value with 128. If it is more than that, the area is white, otherwise it is black. Which of the two means a set bit is a matter of convention; with the encoding suggested above, black is a set bit (1) and white is a cleared bit (0).
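A minimal Python sketch of this averaging step, assuming the scan has already been cropped to the inside of the frame and is held as a list of rows of 8-bit gray values (0 is black, 255 is white). The function names and the 20% margin that keeps the sample away from cell borders are assumptions for illustration.

```python
def read_bit(pixels, x0, y0, x1, y1, threshold=128, margin=0.2):
    """Average the pixels inside one bit cell and return 1 for black, 0 for white."""
    dx = (x1 - x0) * margin
    dy = (y1 - y0) * margin
    # Sample only the central part of the cell; always keep at least one pixel.
    xs = range(int(x0 + dx), max(int(x0 + dx) + 1, int(x1 - dx)))
    ys = range(int(y0 + dy), max(int(y0 + dy) + 1, int(y1 - dy)))
    total = sum(pixels[y][x] for y in ys for x in xs)
    mean = total / (len(xs) * len(ys))
    return 1 if mean < threshold else 0  # black means a set bit


def decode(pixels, rows, bytes_per_row, cell_w, cell_h):
    """Read rows * bytes_per_row bytes from a cropped grayscale scan."""
    out = bytearray()
    for r in range(rows):
        for b in range(bytes_per_row):
            value = 0
            for i in range(8):
                c = b * 8 + i
                bit = read_bit(pixels, c * cell_w, r * cell_h,
                               (c + 1) * cell_w, (r + 1) * cell_h)
                value = (value << 1) | bit  # most significant bit first
            out.append(value)
    return bytes(out)
```

Because each decision is based on dozens of pixels, isolated specks of dust or toner do not flip a bit as long as the cell is reasonably large in the scan.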

This assignment will provide yet another test, a test of honesty. Will the student present his work with a reference to this CodeProject page or not? This is what a decent student should do.

—SA


I have read your clarification and Sergey's answer. Here is my addition.
First of all, let's take a look at QR Code, Data Matrix, or any other 1D or 2D barcode. They have been built to fulfill business needs, to fit into particular situations in business processes. All of them have some sort of markers that help with finding, identifying, scaling and thus decoding them.
You need these two things: something you can use for alignment (finder) and an encoding method.
In your case, since you scan a whole page, it is straightforward to use some 2D code, and you can rely on the reading, so you don't need special considerations. Let's assume you are encoding a byte stream, so the encoding itself is not a question.
The alignment markers can be used to rectify the page if it was not scanned absolutely precisely, and to get an idea about the scale of the reading.
I would use something like the Braille pattern, but with 3x3 "dots" for a byte. Nine cells hold the 8 data bits plus one more, which lets you have a 1-bit parity code for quality purposes. "White and black" is only one option for the bits; you can also use cell density (gray scale) ranges, like TTL levels. That is more fail-safe. For example: 0-5%: not data (empty area of the page), 5-20%: zero, 20-60%: error zone, 60-75%: one, 75-85%: error zone, 85-100%: reserved for markers.
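A small Python sketch of both ideas, the 3x3 cell with a parity bit and the density zones. The answer does not say whether the parity is even or odd, where in the 3x3 block the parity cell sits, or which side each zone boundary belongs to; even parity, the ninth cell, and half-open ranges are assumptions here, as are the function names.

```python
def classify_density(density):
    """Map a cell's dark-ink fraction (0.0 to 1.0) to a zone name."""
    if density < 0.05:
        return "empty"   # blank page, not data
    if density < 0.20:
        return "zero"
    if density < 0.60:
        return "error"
    if density < 0.75:
        return "one"
    if density < 0.85:
        return "error"
    return "marker"      # reserved for alignment markers


def byte_to_cells(value):
    """Return nine bits for a 3x3 block: 8 data bits, then an even-parity bit."""
    bits = [(value >> shift) & 1 for shift in range(7, -1, -1)]
    return bits + [sum(bits) % 2]


def cells_to_byte(cells):
    """Check the parity of nine bits and rebuild the byte; raise on a mismatch."""
    if len(cells) != 9 or sum(cells) % 2 != 0:
        raise ValueError("parity check failed")
    value = 0
    for bit in cells[:8]:
        value = (value << 1) | bit
    return value
```

A single parity bit detects any one misread cell in a block but cannot say which cell it was, so it serves as a quality check rather than as error correction.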

