I would like to reassign 128 column classes with a list/vector of column classes?


Problem description

I can't seem to find what I need in other posts. Essentially:

  1. I need to reorder the columns of my data after reading it in with data.table's fread (I can't pass colClasses to fread because my columns are out of order).
  2. I need to change the column classes to the ones listed below.

A lot of the other posts seem to be changing all of one type of class to another type of class:

Change the class of many columns in a data frame

Convert column classes in data.table

I believe my problem is different because there is no blanket rule such as "change all factors to characters". Each column has a specific target class that is known ahead of time.

I have my column names in a vector called selectColumns that I pass to fread.

selectColumns <- c(giantListofColumnsGoesHere)
DT <- fread("DT.csv", select=selectColumns, na.strings=NAsList)

setcolorder(DT, selectColumns)
colClasses <- list('character','character','character','factor','numeric','character','numeric','integer','integer','integer','integer','numeric','numeric','factor','factor','factor','logical','integer','numeric','factor','integer','integer','integer','factor','factor','factor','factor','factor','integer','integer','factor','integer','factor','factor','integer','factor','numeric','factor','numeric','character','factor','factor','factor','factor','factor','factor','factor','factor','factor','factor','integer','factor','numeric','factor','factor','character','factor','factor','factor','integer','numeric','integer','integer','integer','integer','integer','factor','character','factor','factor','factor','factor','integer','factor','factor','character','integer','integer','integer','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical','logical')

#Now the part I can't figure out, I've tried:
lapply(DT, class) <- colClasses
#OR
attr(DT, class) <- colClasses
#Obviously attr(DT, class) just gives "data.table" "data.frame"

I think I need to reach into the DT's columns to set the class of each one individually, but I'm not great with lists and I can't figure out how. I'm sorry if this is too easy a question or has essentially been answered already, but I'm lost, and it seems like there is usually an easy way to do this.

I'm sorry I can't provide data, because it contains private information.

Thanks for any help everyone.

Solution

If the OP forgot to use colClasses inside fread, or there is some technical difficulty in using it, and the column classes of the data.table need to be changed after reading, looping over the columns with set is an option:

for (j in seq_along(selectColumns)) {
  # look up the converter named in colClasses (e.g. "as.numeric") and apply it to column j
  set(DT, i = NULL, j = selectColumns[j],
      value = get(colClasses[[j]])(DT[[selectColumns[j]]]))
}

str(DT)
#Classes ‘data.table’ and 'data.frame':  5 obs. of  6 variables:
#$ V1: num  1 2 3 4 5
#$ V2: chr  "A" "B" "C" "D" ...
#$ V3: int  1 2 3 4 5
#$ V4: chr  "F" "G" "H" "I" ...
#$ V5: chr  "G" "H" "I" "J" ...
#$ V6: Factor w/ 5 levels "6","7","8","9",..: 1 2 3 4 5

Note that the initial classes of the columns named in selectColumns were:

str(DT)
#Classes ‘data.table’ and 'data.frame':  5 obs. of  6 variables:
#$ V1: int  1 2 3 4 5
#$ V2: chr  "A" "B" "C" "D" ...
#$ V3: num  1 2 3 4 5
#$ V4: chr  "F" "G" "H" "I" ...
#$ V5: chr  "G" "H" "I" "J" ...
#$ V6: int  6 7 8 9 10

data

library(data.table)

DT <- data.table(V1 = 1:5, V2 = LETTERS[1:5], V3 = as.numeric(1:5),
                 V4 = LETTERS[6:10], V5 = LETTERS[7:11], V6 = 6:10)
# names of the conversion functions, one per entry of selectColumns
colClasses <- paste0("as.", c("numeric", "integer", "factor"))
selectColumns <- c("V1", "V3", "V6")
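The question stores bare class names ('numeric', 'factor', ...) in a list rather than as.-prefixed function names in a character vector. A minimal sketch of the same set loop adapted to that shape is given here; the toy data stands in for the private data set, and the use of match.fun in place of get is a substitution made for this illustration, not part of the original answer.

```r
library(data.table)

# Toy data standing in for the private data set (hypothetical).
DT <- data.table(V1 = 1:5, V2 = LETTERS[1:5], V3 = as.numeric(1:5),
                 V4 = LETTERS[6:10], V5 = LETTERS[7:11], V6 = 6:10)

# Bare class names in a list, one per selected column, as in the question.
selectColumns <- c("V1", "V3", "V6")
colClasses <- list("numeric", "integer", "factor")

# Guard against a mismatch between names and classes.
stopifnot(length(colClasses) == length(selectColumns))

for (j in seq_along(selectColumns)) {
  # Build the converter name ("numeric" -> "as.numeric") and look it up.
  convert <- match.fun(paste0("as.", colClasses[[j]]))
  set(DT, i = NULL, j = selectColumns[j],
      value = convert(DT[[selectColumns[j]]]))
}

# Classes of the converted columns.
sapply(DT, class)[selectColumns]
```

The length check matters with 128 columns, since a single missing or extra entry would otherwise shift every later class onto the wrong column.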

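NOTE: The as. prefix was added to each entry of colClasses so that the names resolve to conversion functions (as.numeric, as.integer, as.factor). When converting a factor to numeric, do it in two steps: first convert to character, then to numeric, because as.numeric applied directly to a factor returns the internal integer level codes rather than the original values (based on @Frank's suggestion in the comments).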
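The two-step factor conversion can be illustrated with a small base R example; the vector f below is made up for the illustration.

```r
# A factor whose labels look like numbers (hypothetical example values).
f <- factor(c("10", "20", "20", "5"))

# Direct conversion returns the internal level codes, not the labels.
codes <- as.numeric(f)
print(codes)   # 1 2 2 3

# Converting to character first recovers the original values.
values <- as.numeric(as.character(f))
print(values)  # 10 20 20 5
```

In the loop above, this would mean applying as.character and then as.numeric to any column that is currently a factor and should end up numeric.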
