通过比较python中的行值将值从一个数据帧复制到另一数据帧(不同长度) [英] copy values from one dataframe to another dataframe(different length) by comparing row values in python

查看:64
本文介绍了通过比较python中的行值将值从一个数据帧复制到另一数据帧(不同长度)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我是python的新手,正在使用数据框。我有两个数据框,一个包含几个月的数据,另一个包含那几个月的几天的数据。我希望将每月数据框中的数据作为每日数据框中的一列,重复该月中的天数。谢谢,我尝试提供以下插图。

I am new to python and working with dataframes. I have two dataframes, one with data for months and another with data for days in those months. I want the data from the monthly dataframe as a column in the daily dataframe, repeated for the number of days in that month. Thanks, I have tried to Provide with an illustration below.

Monthly DF
    Val  Date        Year   Month
0   0.00 2016-01-31  2016   1
1   0.10 2016-02-29  2016   2
2   0.07 2016-03-31  2016   3
3   0.01 2016-04-30  2016   4
4   0.28 2016-05-31  2016   5

DailyDF
         Date  Year  Month  Val
0  2016-01-01  2016      1    0
1  2016-01-02  2016      1    0
2  2016-01-03  2016      1    0
3  2016-01-04  2016      1    0
4  2016-01-05  2016      1    0
5  2016-01-06  2016      1    0
6  2016-01-07  2016      1    0
7  2016-01-08  2016      1    0
8  2016-01-09  2016      1    0
9  2016-01-10  2016      1    0
10 2016-01-11  2016      1    0
11 2016-01-12  2016      1    0
12 2016-01-13  2016      1    0
13 2016-01-14  2016      1    0
14 2016-01-15  2016      1    0
15 2016-01-16  2016      1    0
16 2016-01-17  2016      1    0
17 2016-01-18  2016      1    0
18 2016-01-19  2016      1    0
19 2016-01-20  2016      1    0
20 2016-01-21  2016      1    0
21 2016-01-22  2016      1    0
22 2016-01-23  2016      1    0
23 2016-01-24  2016      1    0
24 2016-01-25  2016      1    0
25 2016-01-26  2016      1    0
26 2016-01-27  2016      1    0
27 2016-01-28  2016      1    0
28 2016-01-29  2016      1    0
29 2016-01-30  2016      1    0
..        ...   ...    ...  ...
31 2016-02-01  2016      2    0
32 2016-02-02  2016      2    0
33 2016-02-03  2016      2    0
34 2016-02-04  2016      2    0
35 2016-02-05  2016      2    0
36 2016-02-06  2016      2    0
37 2016-02-07  2016      2    0
38 2016-02-08  2016      2    0
39 2016-02-09  2016      2    0
40 2016-02-10  2016      2    0
41 2016-02-11  2016      2    0
42 2016-02-12  2016      2    0
43 2016-02-13  2016      2    0
44 2016-02-14  2016      2    0
45 2016-02-15  2016      2    0
46 2016-02-16  2016      2    0
47 2016-02-17  2016      2    0
48 2016-02-18  2016      2    0
49 2016-02-19  2016      2    0
50 2016-02-20  2016      2    0
51 2016-02-21  2016      2    0
52 2016-02-22  2016      2    0
53 2016-02-23  2016      2    0
54 2016-02-24  2016      2    0
55 2016-02-25  2016      2    0
56 2016-02-26  2016      2    0
57 2016-02-27  2016      2    0
58 2016-02-28  2016      2    0
59 2016-02-29  2016      2    0
60 2016-03-01  2016      3    0

所以在每日数据框的 Val列我希望在该月的天数中重复每月数据框的 Val。

So in the 'Val' column of Daily Dataframe I want the "Val" from the Monthly Dataframe to be repeated for the number of days in that month.

Expected Output
         Date  Year  Month  Val
0  2016-01-01  2016      1    0
1  2016-01-02  2016      1    0
2  2016-01-03  2016      1    0
3  2016-01-04  2016      1    0
4  2016-01-05  2016      1    0
5  2016-01-06  2016      1    0
6  2016-01-07  2016      1    0
7  2016-01-08  2016      1    0
8  2016-01-09  2016      1    0
..        ...   ...    ...  ...

10 2016-01-11  2016      1    0
11 2016-01-12  2016      1    0
12 2016-01-13  2016      1    0
13 2016-01-14  2016      1    0
14 2016-01-15  2016      1    0
15 2016-01-16  2016      1    0
16 2016-01-17  2016      1    0
17 2016-01-18  2016      1    0
18 2016-01-19  2016      1    0
19 2016-01-20  2016      1    0
20 2016-01-21  2016      1    0
21 2016-01-22  2016      1    0
22 2016-01-23  2016      1    0
23 2016-01-24  2016      1    0
24 2016-01-25  2016      1    0
25 2016-01-26  2016      1    0
26 2016-01-27  2016      1    0
27 2016-01-28  2016      1    0
28 2016-01-29  2016      1    0
29 2016-01-30  2016      1    0
..        ...   ...    ...  ...

41 2016-02-11  2016      2    0.10
42 2016-02-12  2016      2    0.10
43 2016-02-13  2016      2    0.10
44 2016-02-14  2016      2    0.10
45 2016-02-15  2016      2    0.10
46 2016-02-16  2016      2    0.10
47 2016-02-17  2016      2    0.10
..        ...   ...    ...  ...
49 2016-03-19  2016      3    0.07
50 2016-03-20  2016      3    0.07
51 2016-03-21  2016      3    0.07
52 2016-03-22  2016      3    0.07
53 2016-03-23  2016      3    0.07
54 2016-03-24  2016      3    0.07


推荐答案

@Merlin已经提到加入(使用 pd.merge()方法)应该非常简单:

As @Merlin has already mentioned joining (using pd.merge() method) should be pretty straightforward:

In [126]: pd.merge(daily.drop('Val', 1), monthly.drop('Date', 1), on=['Year','Month'])
Out[126]:
          Date  Year  Month  val   Val
0   2016-01-01  2016      1    0  0.00
1   2016-01-02  2016      1    0  0.00
2   2016-01-03  2016      1    0  0.00
3   2016-01-04  2016      1    0  0.00
4   2016-01-05  2016      1    0  0.00
5   2016-01-06  2016      1    0  0.00
6   2016-01-07  2016      1    0  0.00
7   2016-01-08  2016      1    0  0.00
8   2016-01-09  2016      1    0  0.00
9   2016-01-10  2016      1    0  0.00
10  2016-01-11  2016      1    0  0.00
11  2016-01-12  2016      1    0  0.00
12  2016-01-13  2016      1    0  0.00
13  2016-01-14  2016      1    0  0.00
14  2016-01-15  2016      1    0  0.00
..         ...   ...    ...  ...   ...
137 2016-05-17  2016      5    0  0.28
138 2016-05-18  2016      5    0  0.28
139 2016-05-19  2016      5    0  0.28
140 2016-05-20  2016      5    0  0.28
141 2016-05-21  2016      5    0  0.28
142 2016-05-22  2016      5    0  0.28
143 2016-05-23  2016      5    0  0.28
144 2016-05-24  2016      5    0  0.28
145 2016-05-25  2016      5    0  0.28
146 2016-05-26  2016      5    0  0.28
147 2016-05-27  2016      5    0  0.28
148 2016-05-28  2016      5    0  0.28
149 2016-05-29  2016      5    0  0.28
150 2016-05-30  2016      5    0  0.28
151 2016-05-31  2016      5    0  0.28

[152 rows x 5 columns]

我想为您提供更具挑战性的任务-生成您想要的DF仅来自每月DF:

In [108]: df
Out[108]:
    Val       Date  Year  Month
0  0.00 2016-01-31  2016      1
1  0.10 2016-02-28  2016      2
2  0.07 2016-03-31  2016      3
3  0.01 2016-04-30  2016      4
4  0.28 2016-05-31  2016      5

In [117]: df.set_index('Date').resample('MS').mean().append(x.iloc[[-1]]).resample('D').pad().reset_index()
Out[117]:
          Date   Val  Year  Month
0   2016-01-01  0.00  2016      1
1   2016-01-02  0.00  2016      1
2   2016-01-03  0.00  2016      1
3   2016-01-04  0.00  2016      1
4   2016-01-05  0.00  2016      1
5   2016-01-06  0.00  2016      1
6   2016-01-07  0.00  2016      1
7   2016-01-08  0.00  2016      1
8   2016-01-09  0.00  2016      1
9   2016-01-10  0.00  2016      1
10  2016-01-11  0.00  2016      1
11  2016-01-12  0.00  2016      1
12  2016-01-13  0.00  2016      1
13  2016-01-14  0.00  2016      1
14  2016-01-15  0.00  2016      1
..         ...   ...   ...    ...
137 2016-05-17  0.28  2016      5
138 2016-05-18  0.28  2016      5
139 2016-05-19  0.28  2016      5
140 2016-05-20  0.28  2016      5
141 2016-05-21  0.28  2016      5
142 2016-05-22  0.28  2016      5
143 2016-05-23  0.28  2016      5
144 2016-05-24  0.28  2016      5
145 2016-05-25  0.28  2016      5
146 2016-05-26  0.28  2016      5
147 2016-05-27  0.28  2016      5
148 2016-05-28  0.28  2016      5
149 2016-05-29  0.28  2016      5
150 2016-05-30  0.28  2016      5
151 2016-05-31  0.28  2016      5

[152 rows x 4 columns]

说明:

重新采样 MonthlyDF 到月初

In [112]: df.set_index('Date').resample('MS').mean()
Out[112]:
             Val  Year  Month
Date
2016-01-01  0.00  2016      1
2016-02-01  0.10  2016      2
2016-03-01  0.07  2016      3
2016-04-01  0.01  2016      4
2016-05-01  0.28  2016      5

从原始 MonthlyDF

In [113]: df.set_index('Date').resample('MS').mean().append(x.iloc[[-1]])
Out[113]:
             Val  Year  Month
Date
2016-01-01  0.00  2016      1
2016-02-01  0.10  2016      2
2016-03-01  0.07  2016      3
2016-04-01  0.01  2016      4
2016-05-01  0.28  2016      5
2016-05-31  0.28  2016      5

之后,我们可以轻松地重新采样它使用每天规则: D

after that we can easily resample it using daily rule: D

这篇关于通过比较python中的行值将值从一个数据帧复制到另一数据帧(不同长度)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
相关文章
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆