Match and rearrange values in a pandas DataFrame

Problem description

I have a DataFrame like the following:

    A Country  price1 A Country  price2 B Country  price1 B Country  price2 C Country  price1
0   19-12-04   0.0    19-12-05   1.7    19-12-05   2.6    19-12-06   3.2    19-12-05   0.1
1   19-12-03   1.5    19-12-04   1.7    19-12-04   2.6    19-12-05   3.2    19-12-04   0.1 
2   19-12-02   1.5    19-12-03   1.7    19-12-03   2.6    19-12-04   3.1    19-12-03   0.1
3   19-12-01   1.5    19-12-02   1.8    19-12-02   2.7    19-12-03   3.2    19-12-02   0.1
4   19-11-29   1.5    19-12-01   1.7    19-11-29   2.6    19-12-02   3.2    19-12-01   0.1
5   19-11-28   1.6    19-11-29   1.7    19-11-28   2.6    19-11-29   3.1    19-11-29   0.1
6   19-11-27   1.6    19-11-28   1.7    19-11-27   2.6    19-11-28   3.2    19-11-28   0.1
7   19-11-26   1.6    19-11-27   1.7    19-11-26   2.6    19-11-27   3.2    19-11-27   0.2
8   19-11-25   1.5    19-11-26   1.7    19-11-25   2.6    19-11-26   3.2    19-11-26   0.2
9   19-11-24   1.5    19-11-25   1.7    19-11-22   2.6    19-11-25   3.2    19-11-25   0.2
10  19-11-22   1.5    19-11-24   1.7    19-11-21   2.6    19-11-22   3.1    19-11-24   0.2

Each country's columns cover a different set of dates. I want to match and rearrange the values by date, and fill the blanks with a "?" mark. The result I want looks like this:

    A Country  price1 A Country  price2 B Country  price1 B Country  price2 C Country  price1
0   19-12-06   ?      19-12-06   ?      19-12-06   ?      19-12-06   3.2    19-12-06   ?  
1   19-12-05   ?      19-12-05   1.7    19-12-05   2.6    19-12-05   3.2    19-12-05   0.1
2   19-12-04   0.0    19-12-04   1.7    19-12-04   2.6    19-12-04   3.1    19-12-04   0.1
3   19-12-03   1.5    19-12-03   1.7    19-12-03   2.6    19-12-03   3.2    19-12-03   0.1 
4   19-12-02   1.5    19-12-02   1.8    19-12-02   2.7    19-12-02   3.2    19-12-02   0.1
5   19-12-01   1.5    19-12-01   1.7    19-12-01   ?      19-12-01   ?      19-12-01   0.1
6   19-11-29   1.5    19-11-29   1.7    19-11-29   2.6    19-11-29   3.1    19-11-29   0.1
7   19-11-28   1.6    19-11-28   1.7    19-11-28   2.6    19-11-28   3.2    19-11-28   0.1
8   19-11-27   1.6    19-11-27   1.7    19-11-27   2.6    19-11-27   3.2    19-11-27   0.2
9   19-11-26   1.6    19-11-26   1.7    19-11-26   2.6    19-11-26   3.2    19-11-26   0.2
10  19-11-25   1.5    19-11-25   1.7    19-11-25   2.6    19-11-25   3.2    19-11-25   0.2
11  19-11-24   1.5    19-11-24   1.7    19-11-24   ?      19-11-24   ?      19-11-24   0.2
12  19-11-23   ?      19-11-23   ?      19-11-23   ?      19-11-23   ?      19-11-23   ?
13  19-11-22   1.5    19-11-22   ?      19-11-22   2.6    19-11-22   3.1    19-11-22   ?
14  19-11-21   ?      19-11-21   ?      19-11-21   2.6    19-11-21   ?      19-11-21   ?

Sorry, I am a complete novice at coding. The column names are not important to me, so an alternative result I would accept is:

    A Country  price1 price2 price1 price2 price1
0   19-12-06   ?      ?      ?      3.2    ?  
1   19-12-05   ?      1.7    2.6    3.2    0.1
2   19-12-04   0.0    1.7    2.6    3.1    0.1
3   19-12-03   1.5    1.7    2.6    3.2    0.1 
4   19-12-02   1.5    1.8    2.7    3.2    0.1
5   19-12-01   1.5    1.7    ?      ?      0.1
6   19-11-29   1.5    1.7    2.6    3.1    0.1
7   19-11-28   1.6    1.7    2.6    3.2    0.1
8   19-11-27   1.6    1.7    2.6    3.2    0.2
9   19-11-26   1.6    1.7    2.6    3.2    0.2
10  19-11-25   1.5    1.7    2.6    3.2    0.2
11  19-11-24   1.5    1.7    ?      ?      0.2
12  19-11-23   ?      ?      ?      ?      ?
13  19-11-22   1.5    ?      2.6    3.1    ?
14  19-11-21   ?      ?      2.6    ?      ?

How can I achieve this?

Recommended answer

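The idea is to zip each date column with the price column that follows it, build one price Series per pair in a list comprehension with that pair's dates as the index, join the Series with concat so they align on the dates, and finally convert the index to a DatetimeIndex and sort it. This relies on the column labels being unique, which the price1.1, price2.1 labels in the output suggest is the case here.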

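import pandas as pd

# Date columns sit at even positions, price columns at odd positions.
a = df.columns[::2]
b = df.columns[1::2]
# One price Series per (date, price) pair, indexed by that pair's own dates.
dfs = [df.set_index(d)[p] for d, p in zip(a, b)]
# concat aligns the Series on the union of dates; gaps become NaN, then '?'.
df = pd.concat(dfs, axis=1, sort=False).fillna('?')
df.index = pd.to_datetime(df.index, format='%y-%m-%d')
# Ascending order; pass ascending=False for newest-first as in the question.
df = df.sort_index()
print(df)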

           price1 price2 price1.1 price2.1 price1.2
2019-11-21      ?      ?      2.6        ?        ?
2019-11-22    1.5      ?      2.6      3.1        ?
2019-11-24    1.5    1.7        ?        ?      0.2
2019-11-25    1.5    1.7      2.6      3.2      0.2
2019-11-26    1.6    1.7      2.6      3.2      0.2
2019-11-27    1.6    1.7      2.6      3.2      0.2
2019-11-28    1.6    1.7      2.6      3.2      0.1
2019-11-29    1.5    1.7      2.6      3.1      0.1
2019-12-01    1.5    1.7        ?        ?      0.1
2019-12-02    1.5    1.8      2.7      3.2      0.1
2019-12-03    1.5    1.7      2.6      3.2      0.1
2019-12-04      0    1.7      2.6      3.1      0.1
2019-12-05      ?    1.7      2.6      3.2      0.1
2019-12-06      ?      ?        ?      3.2        ?
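For readers who want to try the approach without the original file, here is a minimal self-contained sketch. It uses a three-row excerpt of the question's table and finishes with newest-first order and a single date column, which is closer to the alternative layout asked for. The sample frame, the ".1" column suffixes (mimicking how pandas.read_csv renames duplicate headers), and the names date_cols, price_cols, aligned, and result are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

# Hypothetical three-row excerpt of the question's table; the ".1" suffixes
# are an assumption about how the duplicate headers were made unique.
df = pd.DataFrame({
    "A Country":   ["19-12-04", "19-12-03", "19-12-02"],
    "price1":      [0.0, 1.5, 1.5],
    "A Country.1": ["19-12-05", "19-12-04", "19-12-03"],
    "price2":      [1.7, 1.7, 1.7],
    "B Country":   ["19-12-05", "19-12-04", "19-12-03"],
    "price1.1":    [2.6, 2.6, 2.6],
})

date_cols = df.columns[::2]    # every even-positioned column holds dates
price_cols = df.columns[1::2]  # every odd-positioned column holds prices

# One price Series per pair, indexed by that pair's own date column.
series = [df.set_index(d)[p] for d, p in zip(date_cols, price_cols)]

# Align on the union of all dates; dates missing from a pair become NaN.
aligned = pd.concat(series, axis=1, sort=False)
aligned.index = pd.to_datetime(aligned.index, format="%y-%m-%d")

# Newest first, '?' for gaps, then dates back to the original string format.
result = aligned.sort_index(ascending=False).fillna("?")
result.index = result.index.strftime("%y-%m-%d")
result = result.rename_axis("date").reset_index()
print(result)
```

Note that filling with '?' turns the price columns into object dtype, so it is usually best applied as the last step, after any numeric work on the aligned values is done.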
