Tabula: extract tables by area coordinates


Question

Tabula can extract tables from a PDF document when you specify the coordinates of the table area. On Windows, getting those coordinates means uploading the PDF file to the Tabula web page, exporting the script that contains the coordinates, and then copying the coordinates into your code. On a Mac, you only need the Preview app and its crop inspector. Are there any third-party programs or plug-ins that offer the same thing to Windows users? I think this would be handy in the following situations:

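  1. When you have no internet access.
  2. When accuracy matters. I think the Preview app would be more accurate, as I have seen the Tabula web page generate inaccurate coordinates.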

I would be grateful if anyone could point me to where I can find such a thing. Many thanks.

Recommended answer

Tabula needs areas to be specified in PDF units, which are defined as 1/72 of an inch. If you use Acrobat Reader DC, you can use the Measure tool and multiply its readings (in inches) by 72.
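The conversion is a single multiplication. As a minimal sketch, assuming the measuring tool reports inches (the function name and the 1.5 inch reading are made up for illustration):

```python
POINTS_PER_INCH = 72  # one PDF unit is 1/72 of an inch


def inches_to_points(inches: float) -> float:
    """Convert a length measured in inches to PDF units."""
    return inches * POINTS_PER_INCH


if __name__ == "__main__":
    # Hypothetical reading from a measuring tool: 1.5 inches.
    print(inches_to_points(1.5))  # prints 108.0
```

If the tool is set to another unit such as centimetres, convert to inches first (divide by 2.54) before multiplying by 72.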

Tabula needs the area to be specified as the top, left, bottom and right distances. To obtain them, measure the distance from the top of the page to the beginning of the table, and so on for the other three edges.
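Putting the measurements together might look like the sketch below. It assumes, as I understand Tabula's convention, that top and bottom are both measured from the top edge of the page and that left and right are both measured from the left edge. The helper names and the sample measurements are hypothetical.

```python
from typing import List

POINTS_PER_INCH = 72  # one PDF unit is 1/72 of an inch


def area_from_inches(top: float, left: float,
                     bottom: float, right: float) -> List[float]:
    """Build a [top, left, bottom, right] area in PDF units.

    Assumption: top/bottom are distances from the page's top edge and
    left/right are distances from its left edge, all in inches.
    """
    if bottom <= top or right <= left:
        raise ValueError("bottom must exceed top and right must exceed left")
    return [edge * POINTS_PER_INCH for edge in (top, left, bottom, right)]


def area_to_string(area: List[float]) -> str:
    """Join the four values with commas, e.g. for a command-line argument."""
    return ",".join(f"{value:g}" for value in area)


if __name__ == "__main__":
    # Hypothetical measurements of a table on a US Letter page, in inches.
    area = area_from_inches(top=1.0, left=0.5, bottom=4.25, right=8.0)
    print(area)                  # [72.0, 36.0, 306.0, 576.0]
    print(area_to_string(area))  # 72,36,306,576
```

A list in this order is what you would then pass on to Tabula, for example as the `area` argument of `read_pdf` in the tabula-py wrapper; that call is left out here so the sketch runs without Tabula installed.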
