Extracting keys and their related values from a purchase order (PDF file) using Python

Views: 144  Published: 2019/6/7 19:40:33  Python PDF

Problem description

Python version: 3

Input: a PDF file containing a purchase order. Example input: http://gem.compaq.com/gemstore/sites/downloads/SLED_PO_Template.pdf

Note: This is an empty sample purchase order form. The actual format may vary, and a real PDF may not be empty.

Desired output: each key name and its value from the PDF.

Sample output:

PO number: its value in the PDF (and likewise for the other keys)

Question: How can I extract the key names and their associated values from a given PDF file?

What I have tried:

I tried tabula-py, pdfminer2, pdftotext, OCR, and pdf2json.
The main challenge I am facing is relating each key to its true value.
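
To make the key-to-value problem concrete, here is a minimal sketch of one possible approach, not a definitive solution. It assumes the PDF text has already been extracted with its layout preserved (for example by one of the tools listed above), that each value sits on the same line as its label, and that columns are separated by two or more spaces. The label list `KNOWN_KEYS`, the function `extract_pairs`, and the sample text are all hypothetical and made up for illustration; a real purchase order would need its own label list.

```python
import re

# Hypothetical label list, assumed for illustration only.
KNOWN_KEYS = ["PO Number", "Order Date", "Vendor"]


def extract_pairs(text, keys=KNOWN_KEYS):
    """Map each known label to the text that follows it on the same line.

    A value is assumed to end at a run of two or more spaces (a column gap)
    or at the end of the line. Only the first occurrence of a label is kept.
    """
    pairs = {}
    for line in text.splitlines():
        for key in keys:
            if key in pairs:
                continue
            # Label, optional colon, then a lazy capture up to a column gap or line end.
            pattern = re.escape(key) + r"\s*:?\s*(.*?)(?:\s{2,}|$)"
            match = re.search(pattern, line, flags=re.IGNORECASE)
            if match:
                pairs[key] = match.group(1).strip()
    return pairs


# Made-up sample text standing in for layout-preserved extraction output.
sample = "PO Number: 12345    Order Date: 2019-06-07\nVendor:\n"
print(extract_pairs(sample))
```

A label with nothing after it, as in the empty template, comes back as an empty string. This sketch will not handle values printed on the line below their label or inside table cells; those layouts would need position-aware extraction instead.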
