Pandas join/merge/concat two dataframes

Question

I am having issues with joins in pandas and I am trying to figure out what is wrong. Say I have a dataframe x:

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1941 entries, 2004-10-19 00:00:00 to 2012-07-23 00:00:00
Data columns:
close    1941  non-null values
high     1941  non-null values
low      1941  non-null values
open     1941  non-null values
dtypes: float64(4)

Should I be able to join it with y on the index using a simple join command? Here y is identical to x except that each column name has a 2 appended:

 <class 'pandas.core.frame.DataFrame'>
 DatetimeIndex: 1941 entries, 2004-10-19 00:00:00 to 2012-07-23 00:00:00
 Data columns:
 close2    1941  non-null values
 high2     1941  non-null values
 low2      1941  non-null values
 open2     1941  non-null values
 dtypes: float64(4)

 This is what y.join(x) or pandas.DataFrame.join(y, x) gives:
 <class 'pandas.core.frame.DataFrame'>
 DatetimeIndex: 34879 entries, 2004-12-16 00:00:00 to 2012-07-12 00:00:00
 Data columns:
 close2    34879  non-null values
 high2     34879  non-null values
 low2      34879  non-null values
 open2     34879  non-null values
 close     34879  non-null values
 high      34879  non-null values
 low       34879  non-null values
 open      34879  non-null values
 dtypes: float64(8)

I expect the final result to have 1941 non-null values in every column. I tried merge as well, but I get the same issue.

I had thought the right answer was pandas.concat([x,y]), but this does not do what I intend either.

In [83]: pandas.concat([x,y]) 
Out[83]: <class 'pandas.core.frame.DataFrame'> 
DatetimeIndex: 3882 entries, 2004-10-19 00:00:00 to 2012-07-23 00:00:00 
Data columns: 
close2 3882 non-null values 
high2 3882 non-null values 
low2 3882 non-null values 
open2 3882 non-null values 
dtypes: float64(4) 

Edit: If you are having issues with join, read Wes's answer below. I had one timestamp that was duplicated.

Recommended answer

Does your index have duplicates? Check x.index.is_unique. If it returns False, that would explain the behavior you're seeing, because every duplicated index value on the left is matched with every duplicate on the right:

In [16]: left
Out[16]: 
            a
2000-01-01  1
2000-01-01  1
2000-01-01  1
2000-01-02  2
2000-01-02  2
2000-01-02  2

In [17]: right
Out[17]: 
            b
2000-01-01  3
2000-01-01  3
2000-01-01  3
2000-01-02  4
2000-01-02  4
2000-01-02  4

In [18]: left.join(right)
Out[18]: 
            a  b
2000-01-01  1  3
2000-01-01  1  3
2000-01-01  1  3
2000-01-01  1  3
2000-01-01  1  3
2000-01-01  1  3
2000-01-01  1  3
2000-01-01  1  3
2000-01-01  1  3
2000-01-02  2  4
2000-01-02  2  4
2000-01-02  2  4
2000-01-02  2  4
2000-01-02  2  4
2000-01-02  2  4
2000-01-02  2  4
2000-01-02  2  4
2000-01-02  2  4
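
To check for this in your own data, something like the following may help. It is only a sketch: the frames x and y here are small made-up stand-ins for the ones in the question, and keeping the first row per timestamp is an assumption about what you want (you may prefer to aggregate or inspect the duplicates instead).

```python
import pandas as pd

# Hypothetical stand-in data: the timestamp 2000-01-02 appears twice.
idx = pd.to_datetime(["2000-01-01", "2000-01-02", "2000-01-02", "2000-01-03"])
x = pd.DataFrame({"close": [1.0, 2.0, 2.5, 3.0]}, index=idx)
y = x.add_suffix("2")  # same data, column renamed to close2

# Diagnose: is the index unique, and which timestamps repeat?
print(x.index.is_unique)
print(x.index[x.index.duplicated()])

# Each duplicated timestamp matches every duplicate on the other side,
# so the join has more rows than either input.
joined = y.join(x)
print(len(x), len(joined))

# One possible fix (an assumption, not the only option):
# keep the first row for each timestamp before joining.
x_unique = x[~x.index.duplicated(keep="first")]
y_unique = y[~y.index.duplicated(keep="first")]
fixed = y_unique.join(x_unique)
print(len(fixed))
```

Once both indexes are unique, the join returns one row per timestamp, which is the result the question expected.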
