I would like to reassign 128 column classes with a list/vector of column classes?
I can't seem to find what I need in other posts. Essentially:
- I need to reorder my data from the data.table read in (I can't pass the column classes to the fread statement because my columns are out of order).
- I need to change the column classes to the ones listed below.
A lot of the other posts seem to be changing all of one type of class to another type of class:
Change the class of many columns in a data frame
Convert column classes in data.table
I believe my problem is different because there is no "change all factors to characters" etc. Each column has a specific class that I must change to ahead of time.
I have my column names in a vector called selectColumns that I pass to fread.
selectColumns <- c(giantListofColumnsGoesHere)
DT <- fread("DT.csv", select=selectColumns, na.strings=NAsList)
setcolorder(DT, selectColumns)
colClasses <- list('character','character','character','factor','numeric','character','numeric','integer','integer','integer','integer','numeric','numeric','factor','factor','factor','logical','integer','numeric','factor','integer','integer','integer','factor','factor','factor','factor','factor','integer','integer','factor','integer','factor','factor','integer','factor','numeric','factor','numeric','character','factor','factor','factor','factor','factor','factor','factor','factor','factor','factor','integer','factor','numeric','factor','factor','character','factor','factor','factor','integer','numeric','integer','integer','integer','integer','integer','factor','character','factor','factor','factor','factor','integer','factor','factor','character','integer','integer','integer','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical')
#Now the part I can't figure out, I've tried:
lapply(DT, class) <- colClasses
#OR
attr(DT, class) <- colClasses
#Obviously attr(DT, class) just gives "data.table" "data.frame"
I need to get at the column-level attributes of DT somehow, but I'm not great with lists and I can't figure this out. I'm sorry if this is too easy a question and has essentially been answered already, but I'm lost, and it seems like there is usually an easy way to do this.
I'm sorry I can't give data because it contains private information.
Thanks for any help everyone.
If the OP forgot to use colClasses inside fread, or there is some technical difficulty in using it, and the classes of the data.table columns need to be changed afterwards, using set is an option:
for(j in seq_along(selectColumns)){
set(DT, i= NULL, j=selectColumns[j], value = get(colClasses[j])(DT[[selectColumns[j]]]))
}
str(DT)
#Classes ‘data.table’ and 'data.frame': 5 obs. of 6 variables:
#$ V1: num 1 2 3 4 5
#$ V2: chr "A" "B" "C" "D" ...
#$ V3: int 1 2 3 4 5
#$ V4: chr "F" "G" "H" "I" ...
#$ V5: chr "G" "H" "I" "J" ...
#$ V6: Factor w/ 5 levels "6","7","8","9",..: 1 2 3 4 5
Note that the initial classes of the "selectColumns" columns were:
str(DT)
#Classes ‘data.table’ and 'data.frame': 5 obs. of 6 variables:
#$ V1: int 1 2 3 4 5
#$ V2: chr "A" "B" "C" "D" ...
#$ V3: num 1 2 3 4 5
#$ V4: chr "F" "G" "H" "I" ...
#$ V5: chr "G" "H" "I" "J" ...
#$ V6: int 6 7 8 9 10
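The question stores bare class names ('character', 'factor', ...) in a list rather than function names. A minimal sketch of the same set loop for that case is below, assuming every entry has a matching base R `as.<class>` converter; the three-column table is a toy stand-in for the 128-column data, and `converter` is a name introduced here for illustration only.

```r
library(data.table)

# Toy stand-in for the question's data; the real table has 128 columns.
DT <- data.table(V1 = 1:5, V2 = LETTERS[1:5], V3 = as.numeric(1:5))

# Bare class names, one per column, in the same order as selectColumns.
selectColumns <- c("V1", "V2", "V3")
colClasses <- list("numeric", "factor", "integer")

for (j in seq_along(selectColumns)) {
  # Look up the matching converter, e.g. "numeric" -> as.numeric.
  converter <- match.fun(paste0("as.", colClasses[[j]]))
  # i = NULL replaces the whole column, so its type is allowed to change.
  set(DT, i = NULL, j = selectColumns[j],
      value = converter(DT[[selectColumns[j]]]))
}

sapply(DT, class)
```

Using `colClasses[[j]]` extracts the single string from the list, and `match.fun` fails with an error if no such converter exists, which makes a misspelled class name easy to spot.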
data
DT <- data.table(V1= 1:5, V2 = LETTERS[1:5], V3 = as.numeric(1:5),
V4 = LETTERS[6:10], V5 = LETTERS[7:11], V6 = 6:10)
colClasses <- paste0("as.",c("numeric", "integer", "factor"))
selectColumns <- c("V1", "V3", "V6")
NOTE: Added the "as." prefix to the "colClasses" vector to make the conversion. If we are converting 'factor' to 'numeric', then we have to do this in two steps, i.e. first convert to 'character' and then to 'numeric' (based on @Frank's suggestion in the comments).
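To illustrate why the factor case needs two steps, here is a small base R sketch; the vector `f` is made up for the example. Calling `as.numeric` directly on a factor returns its internal level codes, not the values shown in the labels.

```r
# A factor whose labels look like numbers.
f <- factor(c("10", "20", "10"))

# Direct conversion returns the internal level codes.
codes <- as.numeric(f)

# Going through character first recovers the labelled values.
values <- as.numeric(as.character(f))

print(codes)
print(values)
```

Here `codes` holds 1, 2, 1 while `values` holds 10, 20, 10, so a loop that maps a factor column to 'numeric' should apply `as.character` first.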