R: Warning when creating a (long) list of dummies


Problem description

A dummy column for a column c and a given value x equals 1 if c == x and 0 otherwise. Usually, when creating dummies for a column c, one excludes one value x of one's choice, because the last dummy column adds no information beyond the dummy columns that already exist.

Here's how I'm trying to create a long list of dummies for a column firm in a data.table:

library(data.table)

values <- unique(myDataTable$firm)
# the [-1]: I arbitrarily do not create a dummy for the first unique value
cols <- paste('d', as.character(values[-1]), sep='_') # gives us nice d_value names for columns
myDataTable[, (cols) := lapply(values[-1], function(x) firm == x)]

This code worked reliably for previous columns, which had fewer unique values. firm, however, has many more:

str(values)
 num [1:3082] 51560090 51570615 51603870 51604677 51606085 ...

Adding the columns produces the following warning:

Warning message:
  truelength (6198) is greater than 1000 items over-allocated (length = 36). See ?truelength. If you didn't set the datatable.alloccol option very large, please report this to datatable-help including the result of sessionInfo().

As far as I can tell, all the columns I need are still there. Can I just ignore this warning? Will it slow down future computations? I'm not sure what to make of it, or of the relevance of truelength.

Recommended answer

Taking Arun's comment as an answer.

You should use the alloc.col function to pre-allocate the required number of columns in your data.table, choosing a number larger than the expected ncol.

alloc.col(myDataTable, 3200)
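
To make the pre-allocation step concrete, here is a minimal sketch of the whole workflow on a toy table. The data and the `sales` column are hypothetical stand-ins for the asker's `myDataTable`, and the dummies are stored as integers rather than logicals to match the 1/0 definition in the question. How `n` is interpreted by alloc.col may depend on the installed data.table version, so it is worth checking ?truelength before relying on an exact figure.

```r
library(data.table)

# Hypothetical toy data standing in for the asker's myDataTable
myDataTable <- data.table(firm = c(101, 102, 103, 101, 104), sales = 1:5)

values <- unique(myDataTable$firm)
cols <- paste("d", as.character(values[-1]), sep = "_")

# Reserve column slots before adding the dummies.
# Assumption: asking for current columns + new columns is a large enough n.
alloc.col(myDataTable, ncol(myDataTable) + length(cols))

# Compare columns in use with column slots allocated
length(myDataTable)
truelength(myDataTable)

# Add one 1/0 dummy per unique firm value except the first
myDataTable[, (cols) := lapply(values[-1], function(x) as.integer(firm == x))]
```

If the allocation is already larger than the requested size, the call should simply leave the table as it is, so running it defensively before a large `:=` is cheap.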

Additionally, depending on how you consume the data, I would recommend considering reshaping your wide table into a long table (see EAV, the entity-attribute-value model). Then you need only one column per data type.
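
As a rough illustration of the wide-to-long idea, the sketch below reshapes a small wide table of dummies with data.table's `melt`. The table `wide`, its `id` column, and the output names `attribute` and `value` are all hypothetical; they only show the shape of an EAV-style result, not the answerer's exact layout.

```r
library(data.table)

# Hypothetical wide table: one row per entity, one column per dummy
wide <- data.table(
  id    = 1:3,
  d_102 = c(1L, 0L, 0L),
  d_103 = c(0L, 1L, 0L)
)

# One row per (entity, attribute) pair; all values share a single column
long <- melt(
  wide,
  id.vars       = "id",
  variable.name = "attribute",
  value.name    = "value"
)

long
```

In the long layout the number of columns stays fixed no matter how many distinct firm values there are, so column over-allocation stops being a concern.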
