How to write a file line by line in R


Question

I am trying to read a csv file line by line and select only the 2nd and 3rd cells from the left and the 3rd cell from the right. For example, if a line has 17 cells, I take the 15th cell. I then want to combine those 3 cells, separated by commas, and write the result as a line in a new csv file.

For now, I am just using a for loop to access each line and split it by comma. Then I select the cells I want, combine them into a string, and append that to one big string variable. Once the for loop finishes, I write out the file with writeLines(). However, this takes a long time and a lot of memory because there are 2.8 million rows. Is there any way to make it more efficient? Or can I write the output file line by line inside the for loop?

library(readr)  # provides read_lines()

FileLinebyLine <- read_lines("testfile.csv")

pt<-proc.time()
NewFile <- ""
RowList <- list()
for (i in 1:length(FileLinebyLine))
{
    a <- strsplit(FileLinebyLine[i],",")
    RowList[i] = paste(a[[1]][2],a[[1]][3],a[[1]][(length(a[[1]]) - 2)], sep = ",")

}
NewFile <- paste(unlist(RowList), sep = "\n")
proc.time()-pt
outputfile <- file("output.txt")
writeLines(NewFile,outputfile)
close(outputfile)

I have also tried to use write_lines() in the for loop, but it always gives me this error:

Error in isOpen(path) : invalid connection

Can anyone help me? I would appreciate it!

Recommended answer

Yes, you can read and write line by line, although I don't know how fast it will be. Here's an example that reads a file line by line, takes the 4th item in every line, and writes it to a new file one line at a time:

con = file("temp.csv", "r")
while(length(x <- readLines(con, n = 1)) > 0) {
    write(strsplit(x,",")[[1]][4], file="out.csv", append=TRUE)
}
close(con)

temp.csv

a,b,c,d,e,f,g,h
x,y,z,a,b,c,d,e
1,2,3,4,5,6,7,8
q,w,e,r,t,y,u,i

out.csv

d
a
4
r
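
The same pattern can be adapted to the selection described in the question (2nd cell, 3rd cell, and 3rd cell from the right). The following is only a sketch, not part of the original answer: `pick_cells` is a hypothetical helper, the temporary files stand in for the real input and output, and it assumes plain comma-separated lines with no quoted fields containing commas and no trailing empty fields (which `strsplit` drops). It also opens the output connection once instead of reopening the file for every line.

```r
# Hypothetical helper (not from the original answer): keep the 2nd and 3rd
# fields and the 3rd field from the right of one comma-separated line.
pick_cells <- function(line) {
  cells <- strsplit(line, ",", fixed = TRUE)[[1]]
  n <- length(cells)
  paste(cells[2], cells[3], cells[n - 2], sep = ",")
}

# Small demo input so the sketch runs on its own (assumed file contents).
in_path  <- tempfile(fileext = ".csv")
out_path <- tempfile(fileext = ".csv")
writeLines(c("a,b,c,d,e,f,g,h", "1,2,3,4,5,6,7,8"), in_path)

in_con  <- file(in_path, "r")
out_con <- file(out_path, "w")  # opened once, reused for every line
while (length(x <- readLines(in_con, n = 1)) > 0) {
  writeLines(pick_cells(x), out_con)
}
close(in_con)
close(out_con)

result <- readLines(out_path)
print(result)
```

With real data, `in_path` and `out_path` would be replaced by the actual file names, for example "testfile.csv" and "output.txt" from the question.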

Hope that helps.

You can also add library(compiler); enableJIT(3) to speed up your loops a little. Note that in R 3.4.0 and later the JIT compiler is already enabled at level 3 by default, so this mainly helps on older versions.
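
If one line at a time turns out to be too slow for millions of rows, one possible middle ground is to read and write in chunks, so that memory stays bounded while most of the work is done on whole vectors. This is a different technique from the line-by-line loop in the answer and is offered only as a sketch: `chunk_size` is a hypothetical setting, the temporary files stand in for the real ones, and the same assumptions about simple comma-separated lines apply.

```r
# Demo input so the sketch is self-contained (assumed file contents).
in_path  <- tempfile(fileext = ".csv")
out_path <- tempfile(fileext = ".csv")
writeLines(c("a,b,c,d,e,f,g,h", "x,y,z,a,b,c,d,e",
             "1,2,3,4,5,6,7,8", "q,w,e,r,t,y,u,i"), in_path)

chunk_size <- 2L  # hypothetical; a much larger value may suit a big file

in_con  <- file(in_path, "r")
out_con <- file(out_path, "w")
while (length(chunk <- readLines(in_con, n = chunk_size)) > 0) {
  # Split every line of the chunk at once; the result is a list of vectors.
  fields <- strsplit(chunk, ",", fixed = TRUE)
  # For each line keep the 2nd, 3rd, and 3rd-from-right cells.
  picked <- vapply(fields, function(cells) {
    n <- length(cells)
    paste(cells[2], cells[3], cells[n - 2], sep = ",")
  }, character(1))
  writeLines(picked, out_con)  # one write call per chunk
}
close(in_con)
close(out_con)

result <- readLines(out_path)
print(result)
```

Only one chunk of lines is held in memory at a time, and the number of read and write calls drops from one per line to one per chunk.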
