Merge multiple data frames - Error in match.names(clabs, names(xi)) : names do not match previous names

Question

I'm getting some really bizarre stuff while trying to merge multiple data frames. Help!

I need to merge a bunch of data frames by the columns 'RID' and 'VISCODE'. Here is an example of what it looks like:

# sample(1:100, n) draws n distinct IDs, one per row of each data frame
d1 = data.frame(ID = sample(1:100, 5), RID = c(2, 5, 7, 9, 12),
            VISCODE = rep('bl', 5),
            value1 = rep(16, 5))

d2 = data.frame(ID = sample(1:100, 9), RID = c(2, 2, 2, 5, 5, 5, 7, 7, 7),
            VISCODE = rep(c('bl', 'm06', 'm12'), 3),
            value2 = rep(100, 9))

d3 = data.frame(ID = sample(1:100, 9), RID = c(2, 2, 2, 5, 5, 5, 9, 9, 9),
            VISCODE = rep(c('bl', 'm06', 'm12'), 3),
            value3 = rep("a", 9),
            values3.5 = rep("c", 9))

d4 = data.frame(ID = sample(1:100, 9), RID = c(2, 2, 5, 5, 5, 7, 7, 7, 9),
            VISCODE = c(c('bl', 'm12'), rep(c('bl', 'm06', 'm12'), 2), 'bl'),
            value4 = rep("b", 9))

dataList = list(d1, d2, d3, d4)

I looked at the answers to the question titled "Merge several data.frames into one data.frame with a loop." I used the reduce method suggested there as well as a loop I wrote:

try1 = mymerge(dataList)

try2 <- Reduce(function(x, y) merge(x, y, all = TRUE,
                                    by = c("RID", "VISCODE")),
               dataList, accumulate = FALSE)

where dataList is a list of data frames and mymerge is:

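mymerge = function(dataList){

  # Start from the first data frame and fold in the rest one at a time
  mdat = dataList[[1]]

  # seq_along(...)[-1] is empty for a one-element list, unlike 2:L
  for(i in seq_along(dataList)[-1]){

    mdat = merge(mdat, dataList[[i]], by = c("RID", "VISCODE"), all = TRUE)
  }

  mdat
}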

For my test data and subsets of my real data, both of these work fine and produce exactly the same results. However, when I use larger subsets of my data, they both break down and give me the following error: Error in match.names(clabs, names(xi)) : names do not match previous names.
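
One thing that may be worth checking here, although nothing in the question confirms it is the cause: merge() appends the suffixes ".x" and ".y" to non-key columns that share a name, and repeating the merge over a list can produce the same suffixed name more than once. The sketch below lists non-key column names that occur in more than one data frame. The helper shared_nonkey_names and the two small data frames are hypothetical, written only for illustration.

```r
# Hypothetical helper (not from the original post): return the non-key column
# names that appear in more than one data frame of the list.
shared_nonkey_names <- function(dfs, keys = c("RID", "VISCODE")) {
  # Collect every non-key column name across all data frames
  all_names <- unlist(lapply(dfs, function(d) setdiff(names(d), keys)))
  # Keep the names that show up at least twice
  unique(all_names[duplicated(all_names)])
}

# Toy data (assumed shape): both frames carry an unmatched "ID" column
d1 <- data.frame(ID = 1:2, RID = c(2, 5), VISCODE = "bl", value1 = 16)
d2 <- data.frame(ID = 3:4, RID = c(2, 7), VISCODE = "bl", value2 = 100)

shared <- shared_nonkey_names(list(d1, d2))
print(shared)
```

If this returns anything for the real data (the question mentions ID and EXAMDATE as columns that are not matched on), those columns are candidates for renaming or dropping before the merge.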

The really weird thing is that using this works:

  dataList = list(demog[1:50,],
            neurobat[1:50,],
            apoe[1:50,],
            mmse[1:50,],
            faq[1:47, ])

while using this fails:

  dataList = list(demog[1:50,],
            neurobat[1:50,],
            apoe[1:50,],
            mmse[1:50,],
            faq[1:48, ])

As far as I can tell, there is nothing special about row 48 of faq. Likewise, using this works:

dataList = list(demog[1:50,],
            neurobat[1:50,],
            apoe[1:50,],
            mmse[1:50,],
            pdx[1:47, ])

while using this fails:

dataList = list(demog[1:50,],
            neurobat[1:50,],
            apoe[1:50,],
            mmse[1:50,],
            pdx[1:48, ])

Row 48 in faq and row 48 in pdx have the same values for RID and VISCODE, the same value for EXAMDATE (something I'm not matching on) and different values for ID (another thing I'm not matching on). Besides the matching RID and VISCODE, I don't see anything special about them. They don't share any other variable names. This same scenario occurs elsewhere in the data without problems.

To add icing on the complication cake, this doesn't even work:

dataList = list(demog[1:50,],
            neurobat[1:50,],
            apoe[1:50,],
            mmse[1:50,],
            faq[1:48, 2:3])

where columns 2 and 3 are "RID" and "VISCODE".

48 isn't even the magic number because this works:

 dataList = list(demog[1:500,],
            neurobat[1:500,],
            apoe[1:500,],
            mmse[1:457,])

while using mmse[1:458, ] fails.

我似乎无法想出导致问题的测试数据.以前有人遇到过这个问题吗?关于如何合并有更好的想法吗?

I can't seem to come up with test data that causes the problem. Has anyone had this problem before? Any better ideas on how to merge?
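
As one possible alternative, here is a minimal sketch that tags clashing non-key columns with the position of their data frame before merging, so the merged result never relies on the ".x"/".y" suffixes. It assumes the keys are RID and VISCODE as in the question; the function name merge_by_keys and the toy data are hypothetical, and this is not a confirmed fix for the error above.

```r
# Hypothetical helper: merge a list of data frames on the key columns after
# making every clashing non-key column name unique per data frame.
merge_by_keys <- function(dfs, keys = c("RID", "VISCODE")) {
  # Non-key names that occur in more than one data frame
  nonkey <- unlist(lapply(dfs, function(d) setdiff(names(d), keys)))
  clashing <- unique(nonkey[duplicated(nonkey)])

  # Append ".<position>" to each clashing name, e.g. ID -> ID.1, ID.2, ...
  tagged <- lapply(seq_along(dfs), function(i) {
    d <- dfs[[i]]
    hit <- names(d) %in% clashing
    names(d)[hit] <- paste0(names(d)[hit], ".", i)
    d
  })

  # Full outer join across the whole list
  Reduce(function(x, y) merge(x, y, by = keys, all = TRUE), tagged)
}

# Toy data (assumed shape): five frames, each with an unmatched ID column
# and one RID that no other frame contains
dfs <- lapply(1:5, function(i) {
  d <- data.frame(ID = i, RID = c(1, i + 1), VISCODE = "bl")
  d[[paste0("value", i)]] <- i
  d
})

merged <- merge_by_keys(dfs)
print(names(merged))
```

The position-based tag is only one choice; naming the list and using those names as tags would give more readable columns. The sketch does not guard against a data frame that already has a column called, say, ID.1.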

Recommended answer

Not sure I can help unfortunately but thought I would post as I found this searching for help on this error. What I effectively had was:

a <- cbind(b,c)
d <- merge(a,e)

And I got that same error. Using a <- data.frame(b,c) fixed the problem, but I can't work out why.

object.size(a)   # 1248124200 bytes
object.size(c)   # 1248124032 bytes

So something is different. All classes are the same, str() reveals nothing. I'm stumped.
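
One documented difference that may be relevant, assuming b and c share a column name (the answer does not say whether they do): the cbind method for data frames calls data.frame(..., check.names = FALSE), so duplicated column names are kept, whereas a plain data.frame(b, c) call makes the names unique. The toy data frames below are hypothetical, and the second one is called c2 only to avoid masking base::c.

```r
# Hypothetical inputs: both data frames have a column named "id"
b  <- data.frame(id = 1:3, x = c(10, 20, 30))
c2 <- data.frame(id = 4:6, y = c("a", "b", "c"))

# cbind() on data frames keeps the duplicated name as-is
names_cbind <- names(cbind(b, c2))

# data.frame() uses check.names = TRUE by default and de-duplicates names
names_df <- names(data.frame(b, c2))

print(names_cbind)
print(names_df)
```

Comparing anyDuplicated(names(a)) for the two versions of a is a quick way to see whether this applies to the real data.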

Hopefully that aids someone else in the know.
