Importing fread vs read.table and errors

Views: 278 Published: 2018/8/1 12:06:37 Tags: r import read.table

Problem description

When I import a .csv file with read.table, using the call

df <- read.table("ModelSugar(new) real_thesis_experiment-table_1.csv", skip = 6, sep = ",", head = TRUE)

and check the summary of the data, I get (only the first 3 of 45 columns are shown):

 X.run.number. scenario        configuration   
 Min.   :   1 "pessimistic":999994   "central":999994  
 1st Qu.: 650                                            
 Median :1299                                            
 Mean   :1299                                            
 3rd Qu.:1949                                            
 Max.   :2600

With this data frame I can make nice graphics. However, I have 80 .csv files with a total size of 40 GB, so I want to import only specific columns.

I figured this would be easier with fread (from the data.table package). So I imported 5 columns from each file and combined them with rbind into one data frame, using the call

my.files <- list.files(pattern=".csv")
my.data <- lapply(my.files,fread, header = FALSE, select = c(1,2,3,25,29), sep=",") 
df <- do.call("rbind", my.data)

The summary of that data frame looks like this (4 of 5 columns shown):

[run number]         scenario         configuration         [step]         
 Length:999994      Length:999994      Length:999994      Length:999994     
 Class :character   Class :character   Class :character   Class :character  
 Mode  :character   Mode  :character   Mode  :character   Mode  :character

With this data frame I cannot make the graphics that I could make with the read.table version. I guess this has to do with the class of the columns' values: every column is now character.

How can I make sure that the data frame created with fread has the same characteristics as the one created with read.table, so that I can make the graphics I want?

Edit

I found out that when I first split the .csv into columns in Excel and then use the fread call with sep = ";" instead of sep = ",", it does work. Strange... And I don't want to convert the .csv files into columns in Excel manually.

Recommended answer

What you can do is read one file with read.table, keep 10 rows of it as a template, and then do the following:

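library(data.table)

## Read the first 10 rows of one file with read.table to get a template.
## stringsAsFactors = TRUE is spelled out because it is no longer the default since R 4.0.0.
dfshort <- read.table("ModelSugar(new) real_thesis_experiment-table_1.csv", skip = 6, sep = ",", nrows = 10, header = TRUE, stringsAsFactors = TRUE)
template <- dfshort[, c(1, 2, 3, 25, 29)]  ## select whatever columns you need

## Read your large files using fread.
## skip and header mirror the read.table call; check.names = TRUE makes fread
## produce the same column names as read.table (e.g. X.run.number.).
my.files <- list.files(pattern = "\\.csv$")
my.data <- lapply(my.files, fread, skip = 6, header = TRUE, select = c(1, 2, 3, 25, 29), sep = ",", check.names = TRUE)
df <- do.call("rbind", my.data)

## Change the column types to match the template.
## Template columns that are missing from df are filled with NA of the template's type.
result = data.frame(
  lapply(setNames(,names(template)), function(x) 
    if (x %in% names(df)) as(df[[x]], class(template[[x]])) 
    else template[[x]][NA_integer_]
  ), stringsAsFactors = FALSE)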

Then you can use it for plotting, because its columns will have the same classes as the ones you get with read.table.
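The class matching can be tried out on a tiny in-memory example before running it on 40 GB of files. The following is only a sketch under stated assumptions: the sample lines, the single preamble line, and the helper name `coerce_like` are made up for illustration and do not come from the original files. The helper looks up the matching `as.<class>` function (such as `as.factor` or `as.integer`) instead of calling `as()`, which is a swapped-in way of doing the same coercion.

```r
library(data.table)

# Hypothetical miniature of one export file: a preamble line, a header, two rows.
csv_lines <- c(
  "preamble line written by the export tool",
  "\"[run number]\",\"scenario\",\"configuration\"",
  "1,\"pessimistic\",\"central\"",
  "2,\"optimistic\",\"central\""
)

# Template read with read.table; factors are requested explicitly because
# stringsAsFactors = TRUE is no longer the default since R 4.0.0.
template <- read.table(text = csv_lines, skip = 1, sep = ",", header = TRUE,
                       stringsAsFactors = TRUE)

# Same data read with fread; check.names = TRUE yields the same column names.
dt <- fread(text = csv_lines, skip = 1, sep = ",", header = TRUE,
            check.names = TRUE)

# Hypothetical helper: coerce x to the class of `like` via the as.<class> function.
coerce_like <- function(x, like) {
  converter <- match.fun(paste0("as.", class(like)[1]))
  converter(x)
}

result <- data.frame(
  lapply(setNames(nm = names(template)),
         function(col) coerce_like(dt[[col]], template[[col]])),
  stringsAsFactors = FALSE
)

# Compare the column classes of the template and of the coerced result.
sapply(template, class)
sapply(result, class)
```

If the two `sapply` calls print the same classes for every column, plotting code written against the read.table data frame should see the same column types.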

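## Variant: use every column of the template instead of a subset.
dfshort <- read.table("ModelSugar(new) real_thesis_experiment-table_1.csv", skip = 6, sep = ",", nrows = 10, header = TRUE, stringsAsFactors = TRUE)
template <- copy(dfshort)
my.files <- list.files(pattern = "\\.csv$")
## select (not colClasses) takes column positions; colClasses expects class names.
my.data <- lapply(my.files, fread, skip = 6, header = TRUE, select = c(1, 2, 3, 25, 29), sep = ",", check.names = TRUE)
df <- do.call("rbind", my.data)

## Template columns that fread did not read are filled with NA of the template's type.
result = data.frame(
  lapply(setNames(,names(template)), function(x) 
    if (x %in% names(df)) as(df[[x]], class(template[[x]])) 
    else template[[x]][NA_integer_]
  ), stringsAsFactors = FALSE)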
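A possibly simpler route, which is not the one described in the answer above, is to ask fread for factors directly with its `stringsAsFactors` argument and to combine the pieces with `rbindlist`. The sketch below assumes that the only difference that matters for plotting is character versus factor columns. The temporary directory, the file names, and the sample lines are hypothetical stand-ins for the real export files.

```r
library(data.table)

# Hypothetical stand-in for the export files, written to a temporary directory.
tmp_dir <- tempfile("exports")
dir.create(tmp_dir)
csv_lines <- c(
  "preamble line written by the export tool",
  "\"[run number]\",\"scenario\",\"configuration\"",
  "1,\"pessimistic\",\"central\"",
  "2,\"optimistic\",\"central\""
)
writeLines(csv_lines, file.path(tmp_dir, "experiment-table_1.csv"))
writeLines(csv_lines, file.path(tmp_dir, "experiment-table_2.csv"))

# "\\.csv$" matches only names ending in .csv; a bare ".csv" is a regular
# expression in which the dot matches any character.
my_files <- list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE)

# Read only the wanted columns and convert character columns to factors on read.
my_data <- lapply(my_files, fread, skip = 1, header = TRUE, sep = ",",
                  select = c(1, 2, 3), stringsAsFactors = TRUE,
                  check.names = TRUE)

# rbindlist stacks the list of data.tables into one.
combined <- rbindlist(my_data)
sapply(combined, class)
```

With the real files, `skip` and `select` would take the values from the question (6 and c(1, 2, 3, 25, 29)). Whether this is enough depends on what the plotting code expects, so comparing `sapply(combined, class)` against the read.table result is a sensible first check.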
