Merge (join) data frames - too many rows in result


Problem description

I have two data frames (df1 and df2). I want to join them using the merge function.

df1 has 3903 rows and df2 has 351 rows.

I want to left join df2 to df1 on a common column (column1), using the merge function.

My code is as follows:

dfjoin <- merge(df1, df2, by = "column1", all.x = TRUE)

So I expect dfjoin to have 3903 rows, the same as df1. However, it returns 4010 rows.

Why does it return more rows than expected? I would be very glad for any help. Thanks a lot.

Recommended answer

This is probably because the values of column1 in df2 are not unique, so the mapping between the two data frames is not 1-1. A single value of column1 may appear in several rows of df2 (for example, with different values in another column such as column2). With all.x=TRUE, each row of df1 is returned once for every matching row in df2, so repeated keys in df2 would account for the extra rows (4010 - 3903 = 107). You can check this with table(df2$column1). If any value of column1 has a count > 1, that is the reason.
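To make the check above concrete, here is a minimal sketch on toy data. The values, the column x, and the helper names key_counts and dup_keys are made up for illustration; only df1, df2, column1, column2 and dfjoin come from the question.

```r
# Toy data (hypothetical values): df1 has unique keys, df2 repeats key "b".
df1 <- data.frame(column1 = c("a", "b", "c"), x = 1:3)
df2 <- data.frame(column1 = c("a", "b", "b"), column2 = c(10, 20, 30))

# Same call as in the question.
dfjoin <- merge(df1, df2, by = "column1", all.x = TRUE)
nrow(df1)     # 3
nrow(dfjoin)  # 4: the df1 row with key "b" matches two rows of df2

# Count how often each key occurs in df2 and keep the keys seen more than once.
key_counts <- table(df2$column1)
dup_keys <- names(key_counts[key_counts > 1])
dup_keys      # "b"

# Show the rows of df2 that carry a repeated key, for inspection.
df2[df2$column1 %in% dup_keys, ]
```

If dup_keys comes back empty on the real data, duplicate keys in df2 are not the cause and something else should be investigated.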

Also, if you are more comfortable with SQL, I would recommend an alternative: a very nice library called sqldf, which lets you run SQL-like queries on your data frames.
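As a rough sketch of what that could look like, the same left join and duplicate check can be written as SQL queries. This assumes the sqldf package is installed and uses its default SQLite backend; the toy data, the column x, and the names dfjoin_sql and dup_sql are hypothetical.

```r
# Assumes sqldf is installed, e.g. via install.packages("sqldf").
library(sqldf)

# Toy data (hypothetical values): df2 repeats key "b".
df1 <- data.frame(column1 = c("a", "b", "c"), x = 1:3)
df2 <- data.frame(column1 = c("a", "b", "b"), column2 = c(10, 20, 30))

# Left join of df2 onto df1 by column1.
dfjoin_sql <- sqldf("
  SELECT df1.column1, df1.x, df2.column2
  FROM df1
  LEFT JOIN df2 ON df1.column1 = df2.column1
")

# Keys that occur more than once in df2.
dup_sql <- sqldf("
  SELECT column1, COUNT(*) AS n
  FROM df2
  GROUP BY column1
  HAVING COUNT(*) > 1
")
```

Note that the SQL left join also returns one row per match, so switching to sqldf does not by itself reduce the row count. It only offers a different syntax for the same join.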
