R Error: 'names' attribute [1] must be the same length as the vector [0]

Problem description

I have a number of scanned PDF files in a folder ("C:/Users/Documents/files_i_want"). The PDF files look like this: https://jeroen.github.io/images/ocrscan.pdf

All the PDF files have different names. I am trying to import them all into R at once using pdftools::pdf_convert:

library(pdftools)
library(tesseract)

#Get the path of filenames

filenames <- list.files("C:/Users/Documents/files_i_want", full.names = TRUE)

#Read them in a list

list_data <- lapply(filenames,  pdftools::pdf_convert)

#Name them as per your choice (df_1, df_2 etc)

names(list_data) <- paste('df', seq_along(filenames), sep = '_')

#Create objects in global environment.

list2env(list_data, .GlobalEnv)

This returns the following error:

Error in names(list_data) <- paste("df", seq_along(filenames), sep = "_") : 
  'names' attribute [1] must be the same length as the vector [0]

Does anyone know why this error is being produced?

Thanks

Update

I figured out how to load all the PDFs from the folder:

library(pdftools)
library(tesseract)

directory <- "C:/Users/OneDrive/Documents/files_i_want"

file.list <- paste(directory, "/",list.files(directory, pattern = "*.pdf"), sep = "")

b = lapply(file.list, FUN = function(files) {
    pdf_convert(files, format = "jpeg")
})

a = data.frame(file.list)

Now I have to figure out how to apply the following function to each entry of the object "a", where "i" stands for one entry. The goal is to create "text_1", "text_2", and so on, e.g. text_1 <- tesseract::ocr(a[1, "file.list"]).

convert_function <- function(i){
    text_i <- tesseract::ocr(i)
}

Recommended answer

You can try this code:

data <- lapply(sub('pdf$', 'jpeg', file.list),  tesseract::ocr)
names(data) <- paste0('text', seq_along(data))
list2env(data, .GlobalEnv)
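
As for why the original error appears: the message says the vector has length 0, so list_data was most likely empty, which happens when list.files() finds no files at the given path. paste() still returns one string ("df_") for a zero-length sequence, so one name is assigned to zero elements. The update uses a different folder path (with OneDrive in it), which is consistent with the first path not matching anything. Here is a minimal sketch that reproduces this in base R. The folder missing_dir and the helper name_results are hypothetical, and identity() stands in for pdf_convert() so the example runs without any PDFs.

```r
# Hypothetical path that does not exist; it stands in for a mistyped folder.
missing_dir <- file.path(tempdir(), "no_such_folder")

filenames <- list.files(missing_dir, full.names = TRUE)
list_data <- lapply(filenames, identity)   # identity() stands in for pdf_convert()

# paste() treats the zero-length sequence as "", so one name is still produced.
new_names <- paste("df", seq_along(filenames), sep = "_")

lengths_seen <- c(files = length(filenames),
                  list  = length(list_data),
                  names = length(new_names))
print(lengths_seen)

# Assigning one name to an empty list raises the error; catch it to inspect it.
err <- tryCatch({
  names(list_data) <- new_names
  NULL
}, error = function(e) conditionMessage(e))
print(err)

# Hypothetical guard: fail early with a clearer message when nothing was read.
# sprintf() returns a zero-length vector for zero-length input, unlike paste().
name_results <- function(x, prefix = "df") {
  if (length(x) == 0) stop("no files found; check the folder path and pattern")
  names(x) <- sprintf("%s_%d", prefix, seq_along(x))
  x
}
```

Checking length(filenames) right after list.files() is usually the quickest way to confirm whether the path is the problem.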
