rbind data.frames without names
Problem description
I am trying to figure out why the rbind function does not work as intended when joining data.frames without names.
Here is my test:
test <- data.frame(
id=rep(c("a","b"),each=3),
time=rep(1:3,2),
black=1:6,
white=1:6,
stringsAsFactors=FALSE
)
# take some subsets with different names
pt1 <- test[,c(1,2,3)]
pt2 <- test[,c(1,2,4)]
# method 1 - rename to same names - works
names(pt2) <- names(pt1)
rbind(pt1,pt2)
# method 2 - works - even with duplicate names
names(pt1) <- letters[c(1,1,1)]
names(pt2) <- letters[c(1,1,1)]
rbind(pt1,pt2)
# method 3 - works - with a vector of NA's as names
names(pt1) <- rep(NA,ncol(pt1))
names(pt2) <- rep(NA,ncol(pt2))
rbind(pt1,pt2)
# method 4 - but... does not work without names at all?
pt1 <- unname(pt1)
pt2 <- unname(pt2)
rbind(pt1,pt2)
This seems a bit odd to me. Am I missing a good reason why this shouldn't work out of the box?
Edit with additional information
Using @JoshO'Brien's suggestion to debug, I can identify the error as occurring in this if statement of the rbind.data.frame function:
if (is.null(pi) || is.na(jj <- pi[[j]]))
(online version of code here: http://svn.r-project.org/R/trunk/src/library/base/R/dataframe.R starting at: "### Here are the methods for rbind and cbind.")
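The debugging step mentioned above can be sketched roughly as follows. This is only an illustration of how the debug flag is set and cleared; the actual stepping happens interactively and is left as a comment.

```r
# Flag rbind.data.frame so that the next call to it opens the browser.
debug(rbind.data.frame)
was_flagged <- isdebugged(rbind.data.frame)

# In an interactive session, calling rbind(pt1, pt2) at this point would
# step through the function line by line (press n or Enter to advance).

# Clear the flag again when finished.
undebug(rbind.data.frame)
```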
Stepping through the program, the value of pi does not appear to have been set at this point, so the program tries to index the built-in constant pi, as in pi[[3]], and errors out.
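A minimal sketch of why that indexing fails, independent of rbind itself: the built-in constant pi has a single element, so asking for a third one with [[ ]] raises an error. The variable name msg is only illustrative.

```r
# pi is a built-in numeric constant of length 1.
length(pi)

# Indexing past its only element with [[ ]] raises an error;
# tryCatch captures the error message instead of stopping.
msg <- tryCatch(pi[[3]], error = function(e) conditionMessage(e))
msg
```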
From what I can figure, the internal pi object is not set because of this earlier line, where clabs has been initialized as NULL:
if (is.null(clabs)) clabs <- names(xi) else { #pi gets set here
I am in a tangle trying to figure this out, but will update as it comes together.
Recommended answer
unname() and explicitly assigning NA as column headers are not identical actions. unname() removes the names entirely, so names() returns NULL, whereas assigning NA leaves a vector of NA names in place. When the column names are all NA, an rbind() is possible. Since rbind() relies on the names/colnames of the data frames, it has nothing to match on when they are NULL, and hence rbind() fails.
Here is some code to help see what I mean:
> c1 <- c(1,2,3)
> c2 <- c('A','B','C')
> df1 <- data.frame(c1,c2)
> df1
c1 c2
1 1 A
2 2 B
3 3 C
> df2 <- data.frame(c1,c2) # df1 & df2 are identical
>
> #Let's perform unname on one data frame &
> #replacement with NA on the other
>
> unname(df1)
NA NA
1 1 A
2 2 B
3 3 C
> tem1 <- names(unname(df1))
> tem1
NULL
>
> #Please note above that the column headers though showing as NA are null
>
> names(df2) <- rep(NA,ncol(df2))
> df2
NA NA
1 1 A
2 2 B
3 3 C
> tem2 <- names(df2)
> tem2
[1] NA NA
>
> #Though unname(df1) & df2 look identical, they aren't
> #Also note difference in tem1 & tem2
>
> identical(unname(df1),df2)
[1] FALSE
>
I hope this helps. The names are displayed as NA in both cases, but the two operations are different.
Hence, two data frames whose column headers have been replaced with NA can be "rbound", but two data frames without any column headers (achieved using unname()) cannot.
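One possible workaround, sketched here rather than taken from the answer: give the unnamed data frames temporary placeholder names, bind them, and strip the names again. The helper name rbind_unnamed and the placeholder names V1, V2, ... are hypothetical, and the sketch assumes every input has the same number of columns in the same order.

```r
# Hypothetical helper: bind unnamed data frames by column position.
rbind_unnamed <- function(...) {
  dfs <- list(...)
  n <- length(dfs[[1]])  # number of columns in the first data frame
  # Assumption: every input has the same number of columns, in the same order.
  stopifnot(all(vapply(dfs, length, integer(1)) == n))
  placeholder <- paste0("V", seq_len(n))
  dfs <- lapply(dfs, function(d) {
    names(d) <- placeholder  # temporary names so rbind can match columns
    d
  })
  unname(do.call(rbind, dfs))  # drop the placeholder names again
}

a <- unname(data.frame(id = c("a", "b"), time = 1:2, black = 1:2,
                       stringsAsFactors = FALSE))
b <- unname(data.frame(id = c("a", "b"), time = 1:2, white = 3:4,
                       stringsAsFactors = FALSE))
res <- rbind_unnamed(a, b)
```

Matching by position is a deliberate choice here: without names there is nothing else to align on, so the caller is responsible for the column order being consistent.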