Importing only every Nth row from a .csv file in R

Question

Just a quick question: is there a way to use read.csv to import only every Nth row from a large file?

For example, a 50-60 million line file where you only need every 4th row, starting at row 2.

I thought about incorporating the 'seq' function, but I am not sure whether that is possible.

Any suggestions?

Recommended answer

For a large data file, the best option is to filter out the unnecessary rows before they are imported into R. The simplest way to do this is with OS commands such as sed, awk, or grep. For example, the following code reads every 4th line from the file:

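# Create a small test file: one header line plus 1000 data rows
write.csv(1:1000, file='test.csv')

# awk keeps every 4th line (NR is its line counter); $0 prints the whole
# line, whereas $1 would cut the line at the first whitespace
file.pipe <- pipe("awk 'NR % 4 == 0 {print $0}' < test.csv")
# read.csv treats the first kept line as the header (hence X3, X3.1 below);
# pass header=FALSE to keep that line as data
res <- read.csv(file.pipe)
res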

> res
     X3 X3.1
1     7    7
2    11   11
3    15   15
4    19   19
5    23   23
6    27   27
7    31   31
8    35   35
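
The question asks for every 4th row starting at row 2, which the example above does not quite do, and it also loses the real header. A minimal sketch of one way to handle both, assuming awk is available on the system and that "row 2" means the second line of the file (the first data row after the header). The helper name read_every_nth and its arguments are hypothetical, not part of base R:

```r
# Hypothetical helper (not part of base R): read the header line plus every
# n-th line of a CSV file, counting from file line `start`.
read_every_nth <- function(path, n = 4, start = 2) {
  # awk keeps line 1 (the header) and every n-th line from `start` onward.
  # "%%" is a literal "%" inside sprintf; shQuote protects the file name.
  cmd <- sprintf(
    "awk -v n=%d -v s=%d 'NR == 1 || (NR >= s && (NR - s) %% n == 0)' %s",
    n, start, shQuote(path)
  )
  # read.csv opens the pipe, reads the filtered lines, and closes it again
  read.csv(pipe(cmd))
}

# Same test file as above: a header line plus rows 1..1000
write.csv(1:1000, file = "test.csv")
res <- read_every_nth("test.csv", n = 4, start = 2)
head(res)
```

Because the filtering happens in awk, R only ever sees the lines that were kept, which is the point of the approach for a file with tens of millions of lines.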
