rbindlist data.tables with different number of columns
Question
I am wondering how to rbindlist data.tables with different numbers of columns, filling the missing columns with NAs like rbind.fill does.
DT1 = data.table(A=1:3)
DT2 = data.table(A=4:5,B=letters[4:5])
l = list(DT1,DT2)
rbindlist(l)
Error in rbindlist(l) :
Item 2 has 2 columns, inconsistent with item 1 which has 1 columns
What I would like to get is:
A B
1: 1 NA
2: 2 NA
3: 3 NA
4: 4 d
5: 5 e
Recommended answer
This feature is now implemented in commit 1266 of v1.9.3. From NEWS:
o 'rbindlist' gains 'use.names' and 'fill' arguments and is now implemented
entirely in C. Closes #5249
-> use.names by default is FALSE for backwards compatibility (doesn't bind by
names by default)
-> rbind(...) now just calls rbindlist() internally, except that 'use.names'
is TRUE by default, for compatibility with base (and backwards compatibility).
-> fill by default is FALSE. If fill is TRUE, use.names has to be TRUE.
-> At least one item of the input list has to have non-null column names.
-> Duplicate columns are bound in the order of occurrence, like base.
-> Attributes that might exist in individual items would be lost in the bound result.
-> Columns are coerced to the highest SEXPTYPE, if they are different, if/when possible.
-> And incredibly fast ;).
-> Documentation updated in much detail. Closes DR #5158.
Check this post for benchmarks.
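Two points from the NEWS entry can be illustrated with a minimal sketch: rbind(...) binds by column name by default, and columns of different types are coerced to the higher type. The variable names by_name and mixed are only for illustration, and the sketch assumes a data.table version that includes this feature.

```r
library(data.table)

# Same columns, but in a different order
DT1 <- data.table(x = 1, y = 2)
DT2 <- data.table(y = 1, x = 2)

# rbind() matches columns by name by default
by_name <- rbind(DT1, DT2)
print(by_name)

# Integer and double columns are combined into a double column
mixed <- rbindlist(list(data.table(a = 1L), data.table(a = 2.5)))
print(class(mixed$a))
```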
1) Using the fill argument of rbindlist:
DT1 <- data.table(x=1, y=2)
DT2 <- data.table(y=2, z=-1)
rbindlist(list(DT1, DT2), fill=TRUE)
# x y z
# 1: 1 2 NA
# 2: NA 2 -1
Note that when fill = TRUE, use.names has to be TRUE.
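Applied to the two tables from the question, the same call should give the desired result. This is a minimal sketch; the name res is only for illustration.

```r
library(data.table)

# Tables from the question: DT2 has an extra column B
DT1 <- data.table(A = 1:3)
DT2 <- data.table(A = 4:5, B = letters[4:5])

# fill = TRUE pads the missing column B of DT1 with NA
res <- rbindlist(list(DT1, DT2), fill = TRUE)
print(res)
```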
2) Binding tables with duplicate names appropriately:
DT1 <- data.table(x=1, x=2, y=1, y=2)
DT2 <- data.table(y=3, y=-1, y=-2)
rbindlist(list(DT1, DT2), fill=TRUE)
# x x y y y
# 1: 1 2 1 2 NA
# 2: NA NA 3 -1 -2
3) It is not limited to data.tables; it also works on data.frames and lists:
DT1 <- data.table(x=1, y=2)
DT2 <- data.frame(y=2, z=-1)
DT3 <- list(z=10)
rbindlist(list(DT1,DT2,DT3), fill=TRUE)
# x y z
# 1: 1 2 NA
# 2: NA 2 -1
# 3: NA NA 10
4) To bind just by names, you can set use.names = TRUE alone, without fill:
DT1 <- data.table(x=1, y=2)
DT2 <- data.table(y=1, x=2)
rbindlist(list(DT1,DT2), use.names=TRUE, fill=FALSE)
# x y
# 1: 1 2
# 2: 2 1
DT1 <- data.table(x=1, y=2)
DT2 <- data.table(z=2, y=1)
# raises an error with fill=FALSE: these column sets differ, so they can only be bound with fill=TRUE
rbindlist(list(DT1, DT2), use.names=TRUE, fill=FALSE)
# Error in rbindlist(list(DT1, DT2), use.names = TRUE, fill = FALSE) :
# Answer requires 3 columns whereas one or more item(s) in the input
# list has only 2 columns. ...
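The failing case and its fix can be compared side by side. In this sketch the error is caught with tryCatch so the script keeps running; the exact error text may differ between data.table versions, and the names strict and filled are only for illustration.

```r
library(data.table)

DT1 <- data.table(x = 1, y = 2)
DT2 <- data.table(z = 2, y = 1)

# Without fill, binding by name fails because the column sets differ;
# the error is converted to NULL here
strict <- tryCatch(
  rbindlist(list(DT1, DT2), use.names = TRUE, fill = FALSE),
  error = function(e) NULL
)
print(is.null(strict))

# With fill = TRUE the union of the columns is used, with NA where absent
filled <- rbindlist(list(DT1, DT2), use.names = TRUE, fill = TRUE)
print(filled)
```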
Backwards compatibility (use.names = FALSE, fill = FALSE):
DT1 <- data.table(x=1, y=2)
DT2 <- data.table(y=1, x=2)
rbindlist(list(DT1, DT2))
# x y
# 1: 1 2
# 2: 1 2
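To make the difference between binding by position and binding by name explicit, the two modes can be run on the same input. Since defaults can differ across data.table versions, this sketch passes use.names explicitly; the names by_position and by_name are only for illustration.

```r
library(data.table)

DT1 <- data.table(x = 1, y = 2)
DT2 <- data.table(y = 1, x = 2)

# Positional binding: column names of later items are ignored
by_position <- rbindlist(list(DT1, DT2), use.names = FALSE)
print(by_position)

# Binding by name: columns of DT2 are reordered to match DT1
by_name <- rbindlist(list(DT1, DT2), use.names = TRUE)
print(by_name)
```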
HTH