Merge multiple data frames - Error in match.names(clabs, names(xi)) : names do not match previous names


Problem description

I'm getting some really bizarre behavior while trying to merge multiple data frames. Help!

I need to merge a bunch of data frames by the columns 'RID' and 'VISCODE'. Here is an example of what the data looks like:

# sample(1:100, n) draws n distinct IDs from 1..100, one per row
d1 = data.frame(ID = sample(1:100, 5), RID = c(2, 5, 7, 9, 12),
            VISCODE = rep('bl', 5),
            value1 = rep(16, 5))

d2 = data.frame(ID = sample(1:100, 9), RID = c(2, 2, 2, 5, 5, 5, 7, 7, 7),
            VISCODE = rep(c('bl', 'm06', 'm12'), 3),
            value2 = rep(100, 9))

d3 = data.frame(ID = sample(1:100, 9), RID = c(2, 2, 2, 5, 5, 5, 9, 9, 9),
            VISCODE = rep(c('bl', 'm06', 'm12'), 3),
            value3 = rep("a", 9),
            values3.5 = rep("c", 9))

d4 = data.frame(ID = sample(1:100, 9), RID = c(2, 2, 5, 5, 5, 7, 7, 7, 9),
            VISCODE = c(c('bl', 'm12'), rep(c('bl', 'm06', 'm12'), 2), 'bl'),
            value4 = rep("b", 9))

dataList = list(d1, d2, d3, d4)

I looked at the answers to the question titled "Merge several data.frames into one data.frame with a loop." I used the Reduce method suggested there as well as a loop I wrote:

try1 = mymerge(dataList)

try2 <- Reduce(function(x, y) merge(x, y, all = TRUE,
                                    by = c("RID", "VISCODE")),
               dataList, accumulate = FALSE)

where dataList is a list of data frames and mymerge is:

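# Merge all data frames in dataList on RID and VISCODE, keeping unmatched rows.
mymerge = function(dataList){
  mdat = dataList[[1]]

  # seq_along(dataList)[-1] is empty for a one-element list, unlike 2:L
  for(i in seq_along(dataList)[-1]){
    mdat = merge(mdat, dataList[[i]], by = c("RID", "VISCODE"), all = TRUE)
  }

  mdat
}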

For my test data and for subsets of my real data, both of these work fine and produce exactly the same results. However, when I use larger subsets of my data, they both break down and give me the following error: Error in match.names(clabs, names(xi)) : names do not match previous names.
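One thing that may be worth checking at this point, offered as a guess rather than a confirmed diagnosis: whether the data frames share column names other than the merge keys. Repeated merges attach the same .x/.y suffixes to such columns, which can leave duplicated column names in an intermediate result. A minimal sketch in base R follows. The frames list and the columns age, score and total are made-up stand-ins, not the real tables.

```r
keys <- c("RID", "VISCODE")

# Hypothetical stand-ins for the real tables; only the column names matter here.
frames <- list(
  demog = data.frame(ID = 1:2, RID = c(2, 5), VISCODE = "bl", age = c(70, 65)),
  mmse  = data.frame(ID = 3:4, RID = c(2, 5), VISCODE = "bl",
                     EXAMDATE = "2006-01-01", score = c(28, 30)),
  faq   = data.frame(ID = 5:6, RID = c(2, 7), VISCODE = "bl",
                     EXAMDATE = "2006-01-01", total = c(0, 3))
)

# Collect the non-key column names of every data frame ...
non_key <- unlist(lapply(frames, function(d) setdiff(names(d), keys)),
                  use.names = FALSE)

# ... and keep the ones that occur in more than one data frame.
shared <- unique(non_key[duplicated(non_key)])
print(shared)
```

Any name printed here is a candidate for colliding suffixes once three or more data frames are merged in sequence.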

The really weird thing is that using this works:

  dataList = list(demog[1:50,],
            neurobat[1:50,],
            apoe[1:50,],
            mmse[1:50,],
            faq[1:47, ])

while using this fails:

  dataList = list(demog[1:50,],
            neurobat[1:50,],
            apoe[1:50,],
            mmse[1:50,],
            faq[1:48, ])

As far as I can tell, there is nothing special about row 48 of faq. Likewise, using this works:

dataList = list(demog[1:50,],
            neurobat[1:50,],
            apoe[1:50,],
            mmse[1:50,],
            pdx[1:47, ])

while using this fails:

dataList = list(demog[1:50,],
            neurobat[1:50,],
            apoe[1:50,],
            mmse[1:50,],
            pdx[1:48, ])

Row 48 in faq and row 48 in pdx have the same values for RID and VISCODE, the same value for EXAMDATE (something I'm not matching on) and different values for ID (another thing I'm not matching on). Besides the matching RID and VISCODE, I don't see anything special about them. They don't share any other variable names. This same scenario occurs elsewhere in the data without problems.

To add icing to the complication cake, this doesn't even work:

dataList = list(demog[1:50,],
            neurobat[1:50,],
            apoe[1:50,],
            mmse[1:50,],
            faq[1:48, 2:3])

where columns 2 and 3 are "RID" and "VISCODE".

48 isn't even the magic number, because this works:

 dataList = list(demog[1:500,],
            neurobat[1:500,],
            apoe[1:500,],
            mmse[1:457,])

while using mmse[1:458, ] fails.

I can't seem to come up with test data that causes the problem. Has anyone had this problem before? Any better ideas on how to merge?

Recommended answer

Unfortunately I'm not sure I can help, but I thought I would post because I found this question while searching for help with the same error. What I effectively had was:

a <- cbind(b,c)
d <- merge(a,e)

And I got that same error. Using a <- data.frame(b,c) fixed the problem, but I can't work out why.

object.size(a)  # 1248124200 bytes

object.size(c)  # 1248124032 bytes

So something is different. All classes are the same, and str() reveals nothing. I'm stumped.
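A possible explanation, given as an assumption since the original objects are not available: if b and c are data frames that share a column name, cbind() keeps the duplicated names (for data frames it calls data.frame(..., check.names = FALSE)), whereas data.frame(b, c) uses the default check.names = TRUE and makes the names unique. The sketch below only illustrates that difference in names. b_df and c_df are hypothetical stand-ins for b and c.

```r
# Hypothetical stand-ins for b and c: two data frames sharing the column "id".
b_df <- data.frame(id = 1:3, x = 4:6)
c_df <- data.frame(id = 1:3, y = 7:9)

a_cbind <- cbind(b_df, c_df)       # duplicated column names are kept
a_df    <- data.frame(b_df, c_df)  # check.names = TRUE makes the names unique

print(names(a_cbind))
print(names(a_df))

# anyDuplicated() returns 0 when every name is unique
print(anyDuplicated(names(a_cbind)) > 0)
print(anyDuplicated(names(a_df)) > 0)
```

If the real a has duplicated names after cbind(), that would be consistent with data.frame(b, c) making the error go away, although it does not prove it.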

Hope this helps others.
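On the question of a better way to merge, one option that sidesteps suffix collisions is to make every non-key column name unique before merging, so merge() never has to add .x/.y suffixes. This is a sketch under that assumption, not a verified fix for the original data. The helper tag_columns and the small data frames are hypothetical.

```r
keys <- c("RID", "VISCODE")

# Hypothetical data: every frame has an ID column, and later frames
# contain key combinations that earlier frames lack.
dataList <- list(
  data.frame(ID = 1:2,  RID = c(2, 5),  VISCODE = "bl",             value = c(16, 16)),
  data.frame(ID = 3:4,  RID = c(2, 5),  VISCODE = c("bl", "m06"),   value = c(100, 100)),
  data.frame(ID = 5:6,  RID = c(2, 7),  VISCODE = "bl",             value = c(1, 2)),
  data.frame(ID = 7:8,  RID = c(5, 9),  VISCODE = "bl",             value = c(3, 4)),
  data.frame(ID = 9:10, RID = c(2, 12), VISCODE = "bl",             value = c(5, 6))
)

# Hypothetical helper: append a tag to every column that is not a merge key.
tag_columns <- function(d, tag, keys) {
  non_key <- !(names(d) %in% keys)
  names(d)[non_key] <- paste(names(d)[non_key], tag, sep = "_")
  d
}

# Tag each data frame with its position in the list, then merge as before.
tagged <- lapply(seq_along(dataList),
                 function(i) tag_columns(dataList[[i]], i, keys))

merged <- Reduce(function(x, y) merge(x, y, by = keys, all = TRUE), tagged)

print(names(merged))
print(anyDuplicated(names(merged)) > 0)
```

Tagging by list position keeps the example short. Using the table names (for example demog or faq) as tags would give more readable columns.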
