Importing multiple .csv files with variable column types into R


Question

How can I properly build an lapply call that reads all the .csv files in one directory, loads every column as a string, and then binds the results into one data frame?

Per this, I have a way to get all the .csv files loaded and bound into a data frame. Unfortunately, the bind gets hung up on the variability in how the columns are type cast: the same column is read as an integer from one file and as a character from another. That gives me this error:


Error: Can not automatically convert from character to integer in column

I have tried supplementing the code with the column-type arguments so that everything is simply kept as character. Where I am stuck now is getting my lapply 'loop' to correctly reference the file it is handling on each cycle.

srvy1 <- structure(list(RESPONSE_ID = 584580L, QUESTION_ID = 328L,
                        SURVEY_ID = 2324L, AFF_ID_INV_RESP = 5L),
                   .Names = c("RESPONSE_ID", "QUESTION_ID",
                              "SURVEY_ID", "AFF_ID_INV_RESP"),
                   class = "data.frame", row.names = c(NA, -1L))

srvy2 <- structure(list(RESPONSE_ID = 584580L, QUESTION_ID = 328L,
                        SURVEY_ID = 2324L, AFF_ID_INV_RESP = "bovine"),
                   .Names = c("RESPONSE_ID", "QUESTION_ID",
                              "SURVEY_ID", "AFF_ID_INV_RESP"),
                   class = "data.frame", row.names = c(NA, -1L))

files = list.files(pattern="*.csv")
tbl = lapply(files, read_csv(files, col_types = cols(.default = col_character()))) %>% bind_rows

Is there an easy fix for this that keeps me in the tidyverse, or must I drop down a level and write the for loop out myself, per this?

Recommended answer

The lapply call should have the form lapply(x, FUN, ...), where ... holds the arguments that are passed on to FUN. You are filling in the arguments inside FUN, so lapply is handed the result of a read_csv call instead of the function itself. It should be lapply(files, read_csv, col_types = cols(.default = "c"))
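A minimal sketch of the corrected call may make this easier to try out. The temporary directory, the file names, and the two-column demo data below are assumptions made for illustration only and are not part of the original question. The list.files pattern is also written as a regular expression here, which is a change from the question's glob-style "*.csv".

```r
library(readr)
library(dplyr)

# Hypothetical demo data: two small CSV files in a temporary directory,
# where AFF_ID_INV_RESP is an integer in one file and text in the other.
demo_dir <- file.path(tempdir(), "csv_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(RESPONSE_ID = 584580L, AFF_ID_INV_RESP = 5L),
          file.path(demo_dir, "srvy1.csv"), row.names = FALSE)
write.csv(data.frame(RESPONSE_ID = 584580L, AFF_ID_INV_RESP = "bovine"),
          file.path(demo_dir, "srvy2.csv"), row.names = FALSE)

# "\\.csv$" is a regular expression; full.names = TRUE keeps the directory
# in each path so read_csv() can find the files.
files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Pass the function itself, then hand its extra arguments over through `...`.
tbl <- lapply(files, read_csv, col_types = cols(.default = "c")) %>%
  bind_rows()
```

Because every column is read as character, bind_rows no longer has to reconcile an integer column with a character one.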

If you prefer a tidyverse solution:

files %>%
  map_df(~read_csv(.x, col_types = cols(.default = "c")))

This binds everything into a single data frame at the end.
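If it helps to know which file each row came from, a variant along the following lines could be used. This is a sketch, not part of the original answer: the demo directory, the file names, and the source_file column name are all hypothetical, and it uses map followed by bind_rows with its .id argument in place of map_df.

```r
library(readr)
library(dplyr)
library(purrr)

# Hypothetical demo data, written to a temporary directory.
demo_dir <- file.path(tempdir(), "csv_demo_id")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(RESPONSE_ID = 584580L, AFF_ID_INV_RESP = 5L),
          file.path(demo_dir, "srvy1.csv"), row.names = FALSE)
write.csv(data.frame(RESPONSE_ID = 584580L, AFF_ID_INV_RESP = "bovine"),
          file.path(demo_dir, "srvy2.csv"), row.names = FALSE)

files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Name each path by its file name, read every file with all columns as
# character, then bind; `.id` turns the list names into a new column.
tbl <- files %>%
  set_names(basename(files)) %>%
  map(~ read_csv(.x, col_types = cols(.default = "c"))) %>%
  bind_rows(.id = "source_file")
```

Once the files are combined, readr's type_convert() can be applied to the result if you want the column types guessed again from the full data.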
