pyPDF2 TypeError when trying to extract text

Views: 140 Posted: 2020/7/4 21:27:41 Tags: pdf python-3.x pypdf


Problem description

I have successfully installed pyPDF, but its extractText method does not work well, so I decided to try PyPDF2. The problem is that extracting text raises an exception:

Traceback (most recent call last):
  File "C:\Users\Asus\Desktop\pfdtest.py", line 44, in <module>
    test2()
  File "C:\Users\Asus\Desktop\pfdtest.py", line 41, in test2
    print(mypdf.getPage(0).extractText())
  File "C:\Python32\lib\site-packages\PyPDF2\pdf.py", line 1701, in extractText
    content = ContentStream(content, self.pdf)
  File "C:\Python32\lib\site-packages\PyPDF2\pdf.py", line 1783, in __init__
    stream = StringIO(stream.getData())
TypeError: initial_value must be str or None, not bytes

Here is my sample code:

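from PyPDF2 import PdfFileReader

filename = "myfile.pdf"
f = open(filename, 'rb')  # PDF files must be opened in binary mode
mypdf = PdfFileReader(f)
print(f, mypdf, mypdf.getNumPages())  # the page count is reported correctly
print(mypdf.getPage(0).extractText())  # this call raises the TypeError above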

It correctly determines the number of pages in the PDF, but it has a problem reading the stream.
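
The last frame of the traceback passes the result of stream.getData() to StringIO, and the message says the value is bytes rather than str. A minimal sketch of that mismatch, using only the standard library, might look like the following. The sample bytes and the helper names wrap_text and wrap_binary are hypothetical and are not part of PyPDF2; the sketch only assumes that the stream data is bytes, as the error message states.

```python
import io

# Hypothetical stand-in for the raw bytes of a PDF content stream.
data = b"BT /F1 12 Tf (Hello) Tj ET"


def wrap_text(raw):
    """Mimic the failing call: a text buffer initialised with bytes."""
    return io.StringIO(raw)


def wrap_binary(raw):
    """Use a binary buffer, which accepts bytes on Python 3."""
    return io.BytesIO(raw)


try:
    wrap_text(data)
    error_message = None
except TypeError as exc:
    # On Python 3, io.StringIO rejects bytes as its initial value.
    error_message = str(exc)

print(error_message)

stream = wrap_binary(data)
print(stream.read(2))
```

If this is indeed the cause, the installed PyPDF2 code is treating the stream as a Python 2 style string, so the problem would lie in the library version rather than in the sample code above.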
