Pandas join/merge/concat two dataframes
Question
I am having issues with joins in pandas and I am trying to figure out what is wrong.
Say I have a dataframe
x:
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1941 entries, 2004-10-19 00:00:00 to 2012-07-23 00:00:00
Data columns:
close 1941 non-null values
high 1941 non-null values
low 1941 non-null values
open 1941 non-null values
dtypes: float64(4)
Should I be able to join it with y on the index with a simple join command, where y is the same as x except that the column names have a 2 appended?
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1941 entries, 2004-10-19 00:00:00 to 2012-07-23 00:00:00
Data columns:
close2 1941 non-null values
high2 1941 non-null values
low2 1941 non-null values
open2 1941 non-null values
dtypes: float64(4)
y.join(x) or pandas.DataFrame.join(y,x):
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 34879 entries, 2004-12-16 00:00:00 to 2012-07-12 00:00:00
Data columns:
close2 34879 non-null values
high2 34879 non-null values
low2 34879 non-null values
open2 34879 non-null values
close 34879 non-null values
high 34879 non-null values
low 34879 non-null values
open 34879 non-null values
dtypes: float64(8)
I expect the final result to have 1941 non-null values in every column. I tried merge as well, but I have the same issue.
I had thought the right answer was pandas.concat([x,y]), but this does not do what I intend either.
In [83]: pandas.concat([x,y])
Out[83]: <class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 3882 entries, 2004-10-19 00:00:00 to 2012-07-23 00:00:00
Data columns:
close2 3882 non-null values
high2 3882 non-null values
low2 3882 non-null values
open2 3882 non-null values
dtypes: float64(4)
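For context on the row count above: pandas.concat stacks frames along the rows by default (axis=0), which is why the result has 1941 + 1941 = 3882 entries. A minimal sketch of the difference, using small made-up frames (the data and the 5-row index here are assumptions for illustration, not the asker's data) and a unique index:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for x and y: 5 rows, unique daily index.
idx = pd.date_range("2004-10-19", periods=5, freq="D")
x = pd.DataFrame(
    np.arange(20.0).reshape(5, 4),
    index=idx,
    columns=["close", "high", "low", "open"],
)
y = x.add_suffix("2")  # same values, columns close2, high2, low2, open2

# Default axis=0 appends rows, so the row count doubles.
stacked = pd.concat([x, y])

# axis=1 places the frames side by side, aligned on the index.
side_by_side = pd.concat([x, y], axis=1)

print(stacked.shape)       # (10, 8)
print(side_by_side.shape)  # (5, 8)
```

With a unique index, axis=1 gives one row per timestamp and all eight columns filled, which is the shape the question expects.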
edit: If you are having issues with join, read Wes's answer below. I had one time stamp that was duplicated.
Recommended answer
Does your index have duplicates (check x.index.is_unique)? If so, that would explain the behavior you're seeing:
In [16]: left
Out[16]:
a
2000-01-01 1
2000-01-01 1
2000-01-01 1
2000-01-02 2
2000-01-02 2
2000-01-02 2
In [17]: right
Out[17]:
b
2000-01-01 3
2000-01-01 3
2000-01-01 3
2000-01-02 4
2000-01-02 4
2000-01-02 4
In [18]: left.join(right)
Out[18]:
a b
2000-01-01 1 3
2000-01-01 1 3
2000-01-01 1 3
2000-01-01 1 3
2000-01-01 1 3
2000-01-01 1 3
2000-01-01 1 3
2000-01-01 1 3
2000-01-01 1 3
2000-01-02 2 4
2000-01-02 2 4
2000-01-02 2 4
2000-01-02 2 4
2000-01-02 2 4
2000-01-02 2 4
2000-01-02 2 4
2000-01-02 2 4
2000-01-02 2 4
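The output above can be reproduced with the sketch below. It rebuilds left and right with the same values, checks the index for duplicates, and then shows one possible remedy: dropping repeated timestamps before joining. Keeping the first row per timestamp is an assumption made for illustration; whether that is appropriate depends on why the duplicates exist in your data.

```python
import pandas as pd

# Same shape as the example above: each timestamp appears three times.
idx = pd.to_datetime(["2000-01-01"] * 3 + ["2000-01-02"] * 3)
left = pd.DataFrame({"a": [1, 1, 1, 2, 2, 2]}, index=idx)
right = pd.DataFrame({"b": [3, 3, 3, 4, 4, 4]}, index=idx)

print(left.index.is_unique)  # False

# Every left row pairs with every right row sharing its label:
# 3 * 3 rows for each of the two timestamps.
joined = left.join(right)
print(len(joined))  # 18

# One option (an assumption, not the only fix): keep the first row
# per timestamp so that both indexes are unique before joining.
left_unique = left[~left.index.duplicated(keep="first")]
right_unique = right[~right.index.duplicated(keep="first")]
clean = left_unique.join(right_unique)
print(len(clean))  # 2
```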