Looping over dates and applying a function to a pandas DataFrame
Question
I'm trying to detect the first date on which an event occurs. In my DataFrame, product A (see the pivot table) has 20 items stored for the first time on 2017-03-31.
So I want to create a new variable called new_var_2017-03-31 that stores the increment. On the next date, 2017-04-03, I don't mind that the count is now 50 instead of 20; I only want to store the first event.
My attempt raises several errors. I would like to know at least whether the overall logic makes sense and is "pythonic", or whether I'm completely on the wrong track.
import numpy as np
import pandas as pd

raw_data = {'name': ['B', 'A', 'A', 'B'],
            'date': pd.to_datetime(pd.Series(['2017-03-30', '2017-03-31', '2017-04-03', '2017-04-04'])),
            'age': [10, 20, 50, 30]}
df1 = pd.DataFrame(raw_data, columns=['date', 'name', 'age'])
table = pd.pivot_table(df1, index=['name'], columns=['date'], values=['age'], aggfunc='sum')
table
I put the dates into a list:
dates=df1['date'].values.tolist()
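One thing worth checking here: `.values` hands the column to NumPy first, and with nanosecond-resolution datetimes `tolist()` on that array can return plain integers rather than dates. A minimal sketch of two alternatives, assuming the same `date` column as above; `labels` is an illustrative name for the intended `new_var_...` column names:

```python
import pandas as pd

df1 = pd.DataFrame({
    "date": pd.to_datetime(["2017-03-30", "2017-03-31", "2017-04-03", "2017-04-04"]),
})

# Calling tolist() on the Series itself keeps pandas Timestamp objects
dates = df1["date"].tolist()

# Build the column labels as strings instead of doing arithmetic on dates
labels = df1["date"].dt.strftime("new_var_%Y-%m-%d").tolist()
```

Working with strings for the labels avoids expressions like `new_var + i`, which are undefined in the pseudocode below.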
I want to loop backwards through my list "dates" and create a variable whenever an event occurs. Pseudocode (by i-1 I mean the item before i in the list):
def my_fun(x, list):
    for i in reversed(list):
        if (x[i] - x[i-1]) > 0:
            x[new_var+i] = x[i] - x[i-1]
        else:
            x[new_var+i] = 0
    return x

print(df.apply(lambda x: my_fun(x, dates), axis=1))
Desired output:
raw_data2 = {'new_var': ['new_var_2017-03-30','new_var_2017-03-31','new_var_2017-04-03','new_var_2017-04-04'],'result_a': [np.nan,20,np.nan,np.nan],'result_b': [10,np.nan,np.nan,np.nan]}
df2= pd.DataFrame(raw_data2, columns = ['new_var','result_a','result_b'])
df2.T
Recommended answer
Let's try this:
df1['age'] = df1.groupby('name')['age'].transform(lambda x: (x==x.min())*x)
df1.pivot_table(index='name', columns='date', values='age').replace(0,np.nan)
date  2017-03-30  2017-03-31  2017-04-03  2017-04-04
name
A            NaN        20.0         NaN         NaN
B           10.0         NaN         NaN         NaN
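Note that `x == x.min()` keeps each product's smallest value, which coincides with the first event here only because the first stored quantity is also the lowest. A minimal sketch that keys on the earliest date instead, assuming one row per (name, date) pair so that `pivot` sees no duplicates; `is_first`, `first_only` and `result` are illustrative names, not part of the original answer:

```python
import pandas as pd

df1 = pd.DataFrame({
    "date": pd.to_datetime(["2017-03-30", "2017-03-31", "2017-04-03", "2017-04-04"]),
    "name": ["B", "A", "A", "B"],
    "age": [10, 20, 50, 30],
})

# True only on the row that holds each product's earliest date
is_first = df1["date"] == df1.groupby("name")["date"].transform("min")

# Blank out every later observation, leaving NaN in its place
first_only = df1.assign(age=df1["age"].where(is_first))

# pivot (unlike pivot_table) keeps the all-NaN date columns
result = first_only.pivot(index="name", columns="date", values="age")
print(result)
```

On the sample data this should give the same table as above, but it would still pick the first date if a later quantity were lower than the first one.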