Creating a data partition using caret and data.table


Question

I have a data.table in R that I want to use with the caret package:

set.seed(42)
trainingRows <- createDataPartition(DT$variable, p = 0.75, list = FALSE)
head(trainingRows)  # view a sample of the row numbers

However, I am not able to select the rows directly from the data.table. Instead, I had to convert it to a data.frame:

DT_df <- as.data.frame(DT)
DT_train <- DT_df[trainingRows, ]
dim(DT_train)

The data.table alternative

DT_train <- DT[.(trainingRows), ]

requires a key to be set, because .() in i performs a join rather than selecting rows by number.

Is there a better option than converting to a data.frame?

Solution

The reason is that createDataPartition with list = FALSE returns a one-column integer matrix rather than a plain vector, and its second dimension can be dropped without losing any information.
You can simply drop that dimension of trainingRows when subsetting:

DT[trainingRows[,1]]

The c() function from Bruce Pucci's answer drops the dimension as well.
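
The two approaches can be compared in a minimal sketch. The table DT below is hypothetical example data, with a numeric column named variable standing in for the outcome column from the question; the DT_test line is an added illustration and not part of the original answer.

```r
library(data.table)
library(caret)

set.seed(42)

# Hypothetical example data; `variable` stands in for the outcome column.
DT <- data.table(id = 1:100, variable = rnorm(100))

# With list = FALSE the result is a one-column integer matrix.
trainingRows <- createDataPartition(DT$variable, p = 0.75, list = FALSE)
dim(trainingRows)

# Drop the second dimension to get a plain integer vector of row numbers.
idx <- trainingRows[, 1]      # c(trainingRows) gives the same values

# Subset the data.table directly, with no conversion to data.frame.
DT_train <- DT[idx]

# Assumption for illustration: the remaining rows form the test set.
DT_test <- DT[-idx]
```

Storing the plain vector once in idx keeps the later subsetting calls free of matrix indexing.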

This minor difference from data.frame was spotted a long time ago, and I recently made PR #1275 to fill that gap.
