Importing a PDF in R through the "tm" package


Question

I know a working example for getting a PDF into the R workspace through the "tm" package, but I do not understand how the code works, so I cannot adapt it to import the PDF I actually need. The PDF imported in the following code is the "tm" vignette.

The code is:

if(file.exists(Sys.which("pdftotext"))) {
    pdf <- readPDF(PdftotextOptions = "-layout")(elem = list(uri = vignette("tm")$pdf),
                                                 language = "en",
                                                 id = "id1")
    pdf[1:13]
}

The PDF imported above is the "tm" vignette, but the PDF I am trying to import is a different one. How should I change the code above to bring my own PDF into the workspace? minn is the PDF document I am trying to import.

I tried something like this:

if(file.exists(Sys.which("pdftotext"))) {
    pdf <- readPDF(PdftotextOptions = "-layout")(elem = list(uri = vignette("minn")$pdf),
                                                 language = "en",
                                                 id = "id1")
    pdf[1:13]
}

Recommended answer

So it seems the problem was with the PDF I was trying to read. The code that works is shown below; thanks to Thomas for the lead. The PDF is available at "http://www.wine-economics.org/workingpapers/AAWE_WP16.pdf".

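library(tm)
# readPDF() returns a reader function; "-layout" is passed on to pdftotext
tt <- readPDF(PdftotextOptions = "-layout")
# uri is a path to a local copy of the PDF, relative to the working directory
rr <- tt(elem = list(uri = "AAWE_WP16.pdf"), language = "en", id = "id1")
rr[1:15]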
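
One way to read the difference between the two attempts: vignette() looks up documentation shipped with an installed package, so vignette("minn") will not find an arbitrary file, whereas the answer passes a file path directly as uri. Below is a minimal sketch of that idea, assuming the PDF has already been saved into the working directory. make_pdf_elem is a hypothetical helper written for illustration and is not part of tm; tt in the comment refers to the reader created in the answer.

```r
# Hypothetical helper (not part of tm): build the `elem` argument for a
# reader from a plain file path, failing early if the file is missing.
make_pdf_elem <- function(path) {
  if (!file.exists(path)) {
    stop("PDF not found: ", path)
  }
  list(uri = normalizePath(path))
}

# Assumption: the PDF was downloaded into the working directory beforehand.
pdf_path <- "AAWE_WP16.pdf"
if (file.exists(pdf_path)) {
  elem <- make_pdf_elem(pdf_path)
  # elem can then be handed to the reader from the answer, e.g.
  # rr <- tt(elem = elem, language = "en", id = "id1")
  str(elem)
} else {
  message("Download the PDF first; '", pdf_path, "' is not in ", getwd())
}
```

Checking the path before calling the reader makes a missing or misnamed file show up as a clear error rather than as a failure inside the PDF conversion step.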
