A4 page hierarchy segmentation


Problem description


I want to segment an image (basically a text/document image) into header, footer, and body, and then extract the paragraphs within the body. The image is in binary form, i.e. an m*n matrix whose elements are only 0 and 1:
0 for a black pixel (text) and 1 for a white pixel (no text).
My idea is this: for the header, scan horizontally from the top until a text line (black pixels) is detected; for the footer, start from the bottom and move up until a text line is detected. The remaining part is the body. However, I am not able to write the code.

What I have tried:

I tried this in MATLAB, but I want to write it in C++.

Recommended answer

You need to start by getting the dimensions of the page and calculating the height and width of each section that you will print into. Then, within each page section, calculate the size of each paragraph, column, margin, etc. Try drawing the boxes on a real piece of A4 and then see what you need to do to calculate their sizes.
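
The row-scanning idea from the question could be sketched in C++ roughly as follows. This is only a minimal sketch under stated assumptions, not a definitive implementation. The image is assumed to be a `std::vector<std::vector<int>>` with 0 for black and 1 for white, as in the question. The names `RowRange`, `findBlocks`, `PageLayout`, and `segmentPage`, and the `minGap` parameter, are hypothetical and chosen for illustration. The sketch also assumes that the first block of text rows is the header, the last block is the footer, and every block in between is one body paragraph, with blocks separated by at least `minGap` blank rows.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Binary image as in the question: 0 = black (text), 1 = white (no text).
using Image = std::vector<std::vector<int>>;

// Half-open range of rows [begin, end).
struct RowRange {
    std::size_t begin;
    std::size_t end;
};

// True if the row contains at least one black (0) pixel.
bool rowHasText(const std::vector<int>& row) {
    for (int px : row) {
        if (px == 0) return true;
    }
    return false;
}

// Scan rows from top to bottom and group text rows into blocks. Two text
// rows stay in the same block when fewer than minGap blank rows lie between
// them, so normal line spacing does not split a paragraph.
std::vector<RowRange> findBlocks(const Image& img, std::size_t minGap) {
    assert(minGap >= 1);  // a gap of 0 would split every text row apart
    std::vector<RowRange> blocks;
    bool inBlock = false;
    std::size_t start = 0;
    std::size_t lastText = 0;
    for (std::size_t r = 0; r < img.size(); ++r) {
        if (!rowHasText(img[r])) continue;
        if (inBlock && r - lastText - 1 >= minGap) {
            blocks.push_back({start, lastText + 1});  // close previous block
            inBlock = false;
        }
        if (!inBlock) {
            start = r;
            inBlock = true;
        }
        lastText = r;
    }
    if (inBlock) blocks.push_back({start, lastText + 1});
    return blocks;
}

struct PageLayout {
    bool hasHeader = false;
    bool hasFooter = false;
    RowRange header{0, 0};
    RowRange footer{0, 0};
    std::vector<RowRange> paragraphs;  // body blocks, top to bottom
};

// ASSUMPTION (not from the original thread): the first block is the header,
// the last block is the footer, and the blocks in between are paragraphs.
// With fewer than three blocks, everything is treated as body.
PageLayout segmentPage(const Image& img, std::size_t minGap) {
    PageLayout layout;
    std::vector<RowRange> blocks = findBlocks(img, minGap);
    if (blocks.size() < 3) {
        layout.paragraphs = blocks;
        return layout;
    }
    layout.hasHeader = true;
    layout.hasFooter = true;
    layout.header = blocks.front();
    layout.footer = blocks.back();
    layout.paragraphs.assign(blocks.begin() + 1, blocks.end() - 1);
    return layout;
}
```

Calling `segmentPage(img, minGap)` returns row ranges that can be used to crop the header, footer, and each paragraph. A suitable `minGap` depends on the scan resolution and line spacing, so it would need tuning on real pages. Scanned images with noise may also need a threshold on the number of black pixels per row instead of the single-pixel test used here.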

