Batch fill PDF forms from Python or bash


Problem description

I have a PDF form that needs to be filled out many times (it's a timesheet, to be exact). Since I don't want to do this by hand, I'm looking for a way to fill the forms out using a Python script or tools that could be used in a bash script.

Does anyone have experience with this?

Solution

For Python, you'll need the fdfgen library and pdftk.

@Hugh Bothwell's comment is 100% correct so I'll extend that answer with a working implementation.

If you're on Windows, you'll also need to make sure both Python and pdftk are on the system path (unless you want to type out their full folder paths).
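
One way to check the path requirement before starting a batch is a small pre-flight test. This is only a minimal sketch using the standard library's shutil.which; the helper name find_missing_tools is made up for illustration and is not part of the original answer.

```python
import shutil


def find_missing_tools(tools=("pdftk",)):
    """Return the names from `tools` that cannot be found on the system PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("pdftk found on PATH.")
```

If pdftk is reported as missing, either add its install folder to the system path or call it by its full path in the script.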

Here's the code to auto-batch-fill a collection of PDF forms from a CSV data file:

import csv
import os
import subprocess

from fdfgen import forge_fdf

# Configuration: adjust these to match your own files.
filename_prefix = "NVC"
csv_file = "NVC.csv"
pdf_file = "NVC.pdf"
tmp_file = "tmp.fdf"
output_folder = "./output/"


def process_csv(path):
    """Return one list of (field_name, value) tuples per CSV data row."""
    with open(path, newline="") as f:
        rows = list(csv.reader(f))
    if not rows:
        return []
    headers = rows[0]  # the first row holds the PDF field names
    return [list(zip(headers, row)) for row in rows[1:]]


def form_fill(fields):
    """Fill the PDF template with one row of data and write it to output_folder."""
    fdf = forge_fdf("", fields, [], [], [])
    # forge_fdf returns bytes, so the file must be opened in binary mode.
    with open(tmp_file, "wb") as fdf_file:
        fdf_file.write(fdf)
    # The value of the second CSV column becomes part of the output file name.
    output_file = "{0}{1} {2}.pdf".format(output_folder, filename_prefix, fields[1][1])
    try:
        # Passing a list avoids shell quoting problems with spaces in file names.
        subprocess.run(
            ["pdftk", pdf_file, "fill_form", tmp_file, "output", output_file, "dont_ask"],
            check=True,
        )
    finally:
        os.remove(tmp_file)


data = process_csv(csv_file)
print("Generating Forms:")
print("-----------------------")
for fields in data:
    # Rows whose first column is 'Yes' are skipped.
    if fields[0][1] == "Yes":
        continue
    print("{0} {1} created...".format(filename_prefix, fields[1][1]))
    form_fill(fields)

Note: It shouldn't be rocket-surgery to figure out how to customize this. The initial variable declarations contain the custom configuration.

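In the CSV, each column of the first row contains the name of the corresponding field in the PDF file. Any columns that don't have corresponding fields in the template will be ignored. As written, the script also skips any row whose first column is 'Yes' and uses the value of the second column in the output file name.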
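
To make the CSV layout concrete, here is a minimal sketch of how a header row and data rows turn into the (field name, value) pairs that get written into the form. The column names Skip, Name and Hours and the helper rows_to_fields are hypothetical stand-ins, not names from the original answer.

```python
import csv
import io

# Hypothetical CSV content; the header names stand in for real PDF field names.
sample_csv = "Skip,Name,Hours\nNo,Alice,40\nYes,Bob,35\n"


def rows_to_fields(text):
    """Turn CSV text into one list of (field_name, value) tuples per data row."""
    rows = list(csv.reader(io.StringIO(text)))
    headers, body = rows[0], rows[1:]
    return [list(zip(headers, row)) for row in body]


for fields in rows_to_fields(sample_csv):
    print(fields)
```

Each printed list corresponds to one filled-in PDF; with the script's conventions, the second sample row would be skipped because its first column is 'Yes'.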

In the PDF template, just create editable fields where you want your data to fill and make sure the names match up with the CSV data.
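
If you are unsure what the fields in a template are called, pdftk can list them with its dump_data_fields operation. The sketch below wraps that call and pulls out the FieldName lines; the helper names and the sample text are illustrative assumptions, and the sample was written by hand rather than taken from a real PDF.

```python
import subprocess


def parse_field_names(dump_text):
    """Extract field names from the text printed by `pdftk dump_data_fields`."""
    prefix = "FieldName: "
    return [
        line[len(prefix):].strip()
        for line in dump_text.splitlines()
        if line.startswith(prefix)
    ]


def list_pdf_fields(pdf_path):
    """Run pdftk on pdf_path and return the form field names it reports."""
    result = subprocess.run(
        ["pdftk", pdf_path, "dump_data_fields"],
        capture_output=True, text=True, check=True,
    )
    return parse_field_names(result.stdout)


# Hand-written sample in the shape pdftk prints; not taken from a real file.
sample = (
    "---\nFieldType: Text\nFieldName: Name\nFieldFlags: 0\n"
    "---\nFieldType: Text\nFieldName: Hours\nFieldFlags: 0\n"
)
print(parse_field_names(sample))
```

Comparing the result of list_pdf_fields("NVC.pdf") with the CSV header row is a quick way to spot names that don't match.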

For this specific configuration, just put this file in the same folder as your NVC.csv, NVC.pdf, and a folder named 'output'. Run it and it automagically does the rest.
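
The folder layout described above can also be checked from Python before running the batch. This is a rough sketch under the same file names; check_layout is a hypothetical helper that creates the output folder if it is missing and reports which input files are absent.

```python
import os


def check_layout(folder, required=("NVC.csv", "NVC.pdf"), output_dir="output"):
    """Create output_dir inside folder if needed; return the required files that are missing."""
    os.makedirs(os.path.join(folder, output_dir), exist_ok=True)
    return [
        name for name in required
        if not os.path.isfile(os.path.join(folder, name))
    ]
```

Calling check_layout(os.getcwd()) from the script's folder returns an empty list when both NVC.csv and NVC.pdf are in place.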
