如何使用python更改pdf内的超链接? [英] how do i change hyperlinks inside pdf using python?

查看:695
本文介绍了如何使用python更改pdf内的超链接?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

如何使用python更改pdf中的超链接?我目前正在使用pyPDF2打开并循环浏览页面.我如何实际扫描超链接,然后继续更改超链接?

How do I change the hyperlinks in pdf using python? I am currently using a pyPDF2 to open up and loop through the pages. How do I actually scan for hyperlinks and then proceed to change the hyperlinks?

推荐答案

所以我无法使用但是我确实可以使用其他库: pdfrw .使用Python 3.6中的pip,这对我来说安装得很好:

I did however get something working with another library: pdfrw. This installed fine for me using pip in Python 3.6:

pip install pdfrw

注意:以下内容,我一直在使用此示例pdf 我发现在线包含多个链接.您的里程可能与此不同.

Note: for the following I have been using this example pdf I found online which contains multiple links. Your mileage may vary with this.

import pdfrw

pdf = pdfrw.PdfReader("pdf.pdf") #Load the pdf
new_pdf = pdfrw.PdfWriter() #Create an empty pdf

for page in pdf.pages: #Go through the pages
    for annot in page.Annots or []: #Links are in Annots, but some pages
                                    #don't have links so Annots returns None
        old_url = annot.A.URI

        #>Here you put logic for replacing the URLs<

        #Use the PdfString object to do the encoding for us.
        # Note the brackets around the URL here.
        new_url = pdfrw.objects.pdfstring.PdfString("(http://www.google.com)")

        #Override the URL with ours.
        annot.A.URI = new_url

    new_pdf.addpage(page)    

new_pdf.write("new.pdf")

这篇关于如何使用python更改pdf内的超链接?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆