drop in fread: misses repetitions of col name (data.table R)
Problem description
I've got a file with a bunch of filler columns (named, of course, filler) that I'm trying to read with fread.
I'm using the drop argument, but it only drops the first instance it encounters (presumably scanning left to right, though that is beside the point); I want it to get rid of all of them.
Quick example:
Header of the .csv:
id,first_name,last_name,filler,birth_year,filler,position,filler,wage
names(dt) after using drop in fread:
id,first_name,last_name,birth_year,filler,position,filler,wage
Further, if I just try:
DT <- fread("file.csv", drop = rep("filler", 5L))
I get an error:
Error in fread(paste0(substr(tt, 3, 4), "staff.csv"), drop = rep("filler", : Duplicates detected in drop
Any pointers?
Recommended answer
You could read the first line of the file with scan(), and then use that data as the drop indices in fread().
## example text for fread()
x <- "id,first_name,last_name,filler,birth_year,filler,position,filler,wage
1,2,3,4,5,6,7,8,9"
## read the first line and find the filler
f <- scan(text = x, what = "", sep = ",", nlines = 1) == "filler"
## pass to fread()
fread(x, drop = which(f))
#    id first_name last_name birth_year position wage
# 1:  1          2         3          5        7    9
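The same idea should carry over to a file on disk. The following is only a sketch, assuming the data.table package is installed; the temporary file, its contents, and the names tmp, hdr and drop_idx are made up for illustration and do not come from the original post.

```r
library(data.table)

## hypothetical file standing in for the real .csv
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "id,first_name,last_name,filler,birth_year,filler,position,filler,wage",
  "1,2,3,4,5,6,7,8,9"
), tmp)

## read only the header line as a character vector
hdr <- scan(tmp, what = "", sep = ",", nlines = 1, quiet = TRUE)

## integer positions of every column named "filler"
drop_idx <- which(hdr == "filler")

## drop by position rather than by (duplicated) name
DT <- fread(tmp, drop = drop_idx)
names(DT)
```

Dropping by position sidesteps the duplicated names altogether, since each index is unique even when the names are not.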