Conditionally perform calculation on pandas dataframe
Question
time_period total_cost total_revenue
7days 150 250
14days 350 600
30days 900 750
7days 180 400
14days 430 620
Given this data, I want to convert the total_cost and total_revenue columns into per-day averages for their given time period. I thought this would work:
df[['total_cost','total_revenue']][df.time_period=="7days"]=df[['total_cost','total_revenue']][df.time_period=="7days"]/7
But it leaves the dataframe unchanged.
Recommended answer
I believe you are operating on copies of the dataframe: the chained indexing df[[...]][mask] returns a new object, so the assignment never reaches df itself. I think you should use apply:
from io import StringIO  # Python 3 location; the Python 2 module was StringIO
import pandas
datastring = StringIO("""\
time_period total_cost total_revenue
7days 150 250
14days 350 600
30days 900 750
7days 180 400
14days 430 620
""")
# Columns are separated by whitespace; a raw string avoids an invalid-escape warning
data = pandas.read_table(datastring, sep=r'\s+')
# Strip the trailing "days" from time_period and divide the cost by the day count
data['total_cost_avg'] = data.apply(
    lambda row: row['total_cost'] / float(row['time_period'][:-4]),
    axis=1
)
This gives me:
  time_period  total_cost  total_revenue  total_cost_avg
0       7days         150            250       21.428571
1      14days         350            600       25.000000
2      30days         900            750       30.000000
3       7days         180            400       25.714286
4      14days         430            620       30.714286
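As a complement to the apply approach, here is a minimal sketch of two alternatives that are not part of the original answer: a vectorized division covering both columns, and the questioner's single-period update rewritten with one .loc call so the assignment reaches the dataframe itself. The names df, cols, days, averages and mask are illustrative, and the sketch assumes every time_period value has the form "<number>days".

```python
from io import StringIO

import pandas as pd

raw = StringIO("""\
time_period total_cost total_revenue
7days 150 250
14days 350 600
30days 900 750
7days 180 400
14days 430 620
""")
df = pd.read_csv(raw, sep=r"\s+")

cols = ["total_cost", "total_revenue"]

# Day count per row, parsed from strings such as "7days" -> 7.0
# (assumes every value ends in the literal suffix "days")
days = df["time_period"].str.replace("days", "", regex=False).astype(float)

# Vectorized alternative to apply: divide both columns row-wise by the day count
averages = df[cols].div(days, axis=0)

# Cast to float first so the divided values fit the column dtype
df = df.astype({"total_cost": float, "total_revenue": float})

# One .loc call selects rows and columns together, so the assignment
# modifies df itself rather than a temporary copy
mask = df["time_period"] == "7days"
df.loc[mask, cols] = df.loc[mask, cols] / 7
```

Here averages holds the per-day values for every row without touching df, while the .loc assignment changes only the 7days rows in place. Which one fits depends on whether the original totals still need to be kept.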