通过比较python中的行值将值从一个数据帧复制到另一数据帧(不同长度) [英] copy values from one dataframe to another dataframe(different length) by comparing row values in python
问题描述
我是python的新手,正在使用数据框。我有两个数据框,一个包含几个月的数据,另一个包含那几个月的几天的数据。我希望将每月数据框中的数据作为每日数据框中的一列,重复该月中的天数。谢谢,我尝试提供以下插图。
I am new to python and working with dataframes. I have two dataframes, one with data for months and another with data for days in those months. I want the data from the monthly dataframe as a column in the daily dataframe, repeated for the number of days in that month. Thanks, I have tried to Provide with an illustration below.
Monthly DF
Val Date Year Month
0 0.00 2016-01-31 2016 1
1 0.10 2016-02-29 2016 2
2 0.07 2016-03-31 2016 3
3 0.01 2016-04-30 2016 4
4 0.28 2016-05-31 2016 5
DailyDF
Date Year Month Val
0 2016-01-01 2016 1 0
1 2016-01-02 2016 1 0
2 2016-01-03 2016 1 0
3 2016-01-04 2016 1 0
4 2016-01-05 2016 1 0
5 2016-01-06 2016 1 0
6 2016-01-07 2016 1 0
7 2016-01-08 2016 1 0
8 2016-01-09 2016 1 0
9 2016-01-10 2016 1 0
10 2016-01-11 2016 1 0
11 2016-01-12 2016 1 0
12 2016-01-13 2016 1 0
13 2016-01-14 2016 1 0
14 2016-01-15 2016 1 0
15 2016-01-16 2016 1 0
16 2016-01-17 2016 1 0
17 2016-01-18 2016 1 0
18 2016-01-19 2016 1 0
19 2016-01-20 2016 1 0
20 2016-01-21 2016 1 0
21 2016-01-22 2016 1 0
22 2016-01-23 2016 1 0
23 2016-01-24 2016 1 0
24 2016-01-25 2016 1 0
25 2016-01-26 2016 1 0
26 2016-01-27 2016 1 0
27 2016-01-28 2016 1 0
28 2016-01-29 2016 1 0
29 2016-01-30 2016 1 0
.. ... ... ... ...
31 2016-02-01 2016 2 0
32 2016-02-02 2016 2 0
33 2016-02-03 2016 2 0
34 2016-02-04 2016 2 0
35 2016-02-05 2016 2 0
36 2016-02-06 2016 2 0
37 2016-02-07 2016 2 0
38 2016-02-08 2016 2 0
39 2016-02-09 2016 2 0
40 2016-02-10 2016 2 0
41 2016-02-11 2016 2 0
42 2016-02-12 2016 2 0
43 2016-02-13 2016 2 0
44 2016-02-14 2016 2 0
45 2016-02-15 2016 2 0
46 2016-02-16 2016 2 0
47 2016-02-17 2016 2 0
48 2016-02-18 2016 2 0
49 2016-02-19 2016 2 0
50 2016-02-20 2016 2 0
51 2016-02-21 2016 2 0
52 2016-02-22 2016 2 0
53 2016-02-23 2016 2 0
54 2016-02-24 2016 2 0
55 2016-02-25 2016 2 0
56 2016-02-26 2016 2 0
57 2016-02-27 2016 2 0
58 2016-02-28 2016 2 0
59 2016-02-29 2016 2 0
60 2016-03-01 2016 3 0
所以在每日数据框的 Val列我希望在该月的天数中重复每月数据框的 Val。
So in the 'Val' column of Daily Dataframe I want the "Val" from the Monthly Dataframe to be repeated for the number of days in that month.
Expected Output
Date Year Month Val
0 2016-01-01 2016 1 0
1 2016-01-02 2016 1 0
2 2016-01-03 2016 1 0
3 2016-01-04 2016 1 0
4 2016-01-05 2016 1 0
5 2016-01-06 2016 1 0
6 2016-01-07 2016 1 0
7 2016-01-08 2016 1 0
8 2016-01-09 2016 1 0
.. ... ... ... ...
10 2016-01-11 2016 1 0
11 2016-01-12 2016 1 0
12 2016-01-13 2016 1 0
13 2016-01-14 2016 1 0
14 2016-01-15 2016 1 0
15 2016-01-16 2016 1 0
16 2016-01-17 2016 1 0
17 2016-01-18 2016 1 0
18 2016-01-19 2016 1 0
19 2016-01-20 2016 1 0
20 2016-01-21 2016 1 0
21 2016-01-22 2016 1 0
22 2016-01-23 2016 1 0
23 2016-01-24 2016 1 0
24 2016-01-25 2016 1 0
25 2016-01-26 2016 1 0
26 2016-01-27 2016 1 0
27 2016-01-28 2016 1 0
28 2016-01-29 2016 1 0
29 2016-01-30 2016 1 0
.. ... ... ... ...
41 2016-02-11 2016 2 0.10
42 2016-02-12 2016 2 0.10
43 2016-02-13 2016 2 0.10
44 2016-02-14 2016 2 0.10
45 2016-02-15 2016 2 0.10
46 2016-02-16 2016 2 0.10
47 2016-02-17 2016 2 0.10
.. ... ... ... ...
49 2016-03-19 2016 3 0.07
50 2016-03-20 2016 3 0.07
51 2016-03-21 2016 3 0.07
52 2016-03-22 2016 3 0.07
53 2016-03-23 2016 3 0.07
54 2016-03-24 2016 3 0.07
推荐答案
为 @Merlin已经提到加入(使用 pd.merge()方法)应该非常简单:
As @Merlin has already mentioned joining (using pd.merge() method) should be pretty straightforward:
In [126]: pd.merge(daily.drop('Val', 1), monthly.drop('Date', 1), on=['Year','Month'])
Out[126]:
Date Year Month val Val
0 2016-01-01 2016 1 0 0.00
1 2016-01-02 2016 1 0 0.00
2 2016-01-03 2016 1 0 0.00
3 2016-01-04 2016 1 0 0.00
4 2016-01-05 2016 1 0 0.00
5 2016-01-06 2016 1 0 0.00
6 2016-01-07 2016 1 0 0.00
7 2016-01-08 2016 1 0 0.00
8 2016-01-09 2016 1 0 0.00
9 2016-01-10 2016 1 0 0.00
10 2016-01-11 2016 1 0 0.00
11 2016-01-12 2016 1 0 0.00
12 2016-01-13 2016 1 0 0.00
13 2016-01-14 2016 1 0 0.00
14 2016-01-15 2016 1 0 0.00
.. ... ... ... ... ...
137 2016-05-17 2016 5 0 0.28
138 2016-05-18 2016 5 0 0.28
139 2016-05-19 2016 5 0 0.28
140 2016-05-20 2016 5 0 0.28
141 2016-05-21 2016 5 0 0.28
142 2016-05-22 2016 5 0 0.28
143 2016-05-23 2016 5 0 0.28
144 2016-05-24 2016 5 0 0.28
145 2016-05-25 2016 5 0 0.28
146 2016-05-26 2016 5 0 0.28
147 2016-05-27 2016 5 0 0.28
148 2016-05-28 2016 5 0 0.28
149 2016-05-29 2016 5 0 0.28
150 2016-05-30 2016 5 0 0.28
151 2016-05-31 2016 5 0 0.28
[152 rows x 5 columns]
我想为您提供更具挑战性的任务-生成您想要的DF仅来自每月DF:
In [108]: df
Out[108]:
Val Date Year Month
0 0.00 2016-01-31 2016 1
1 0.10 2016-02-28 2016 2
2 0.07 2016-03-31 2016 3
3 0.01 2016-04-30 2016 4
4 0.28 2016-05-31 2016 5
In [117]: df.set_index('Date').resample('MS').mean().append(x.iloc[[-1]]).resample('D').pad().reset_index()
Out[117]:
Date Val Year Month
0 2016-01-01 0.00 2016 1
1 2016-01-02 0.00 2016 1
2 2016-01-03 0.00 2016 1
3 2016-01-04 0.00 2016 1
4 2016-01-05 0.00 2016 1
5 2016-01-06 0.00 2016 1
6 2016-01-07 0.00 2016 1
7 2016-01-08 0.00 2016 1
8 2016-01-09 0.00 2016 1
9 2016-01-10 0.00 2016 1
10 2016-01-11 0.00 2016 1
11 2016-01-12 0.00 2016 1
12 2016-01-13 0.00 2016 1
13 2016-01-14 0.00 2016 1
14 2016-01-15 0.00 2016 1
.. ... ... ... ...
137 2016-05-17 0.28 2016 5
138 2016-05-18 0.28 2016 5
139 2016-05-19 0.28 2016 5
140 2016-05-20 0.28 2016 5
141 2016-05-21 0.28 2016 5
142 2016-05-22 0.28 2016 5
143 2016-05-23 0.28 2016 5
144 2016-05-24 0.28 2016 5
145 2016-05-25 0.28 2016 5
146 2016-05-26 0.28 2016 5
147 2016-05-27 0.28 2016 5
148 2016-05-28 0.28 2016 5
149 2016-05-29 0.28 2016 5
150 2016-05-30 0.28 2016 5
151 2016-05-31 0.28 2016 5
[152 rows x 4 columns]
说明:
重新采样 MonthlyDF
到月初
In [112]: df.set_index('Date').resample('MS').mean()
Out[112]:
Val Year Month
Date
2016-01-01 0.00 2016 1
2016-02-01 0.10 2016 2
2016-03-01 0.07 2016 3
2016-04-01 0.01 2016 4
2016-05-01 0.28 2016 5
从原始 MonthlyDF
:
In [113]: df.set_index('Date').resample('MS').mean().append(x.iloc[[-1]])
Out[113]:
Val Year Month
Date
2016-01-01 0.00 2016 1
2016-02-01 0.10 2016 2
2016-03-01 0.07 2016 3
2016-04-01 0.01 2016 4
2016-05-01 0.28 2016 5
2016-05-31 0.28 2016 5
之后,我们可以轻松地重新采样它使用每天
规则: D
after that we can easily resample it using daily
rule: D
这篇关于通过比较python中的行值将值从一个数据帧复制到另一数据帧(不同长度)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!