Using the values of a previous "row" in a pandas series
Question
I have a CSV that looks like this (and when brought into a pandas DataFrame with read_csv(), it looks the same).
I want to update the values in the ad_requests column according to the following logic:
For a given row, if ad_requests has a value, leave it alone. Otherwise, set it to the previous row's ad_requests value minus the previous row's impressions value. So in the first example, we would like to end up with:
I'm partway there:
df["ad_requests"] = [i if not pd.isnull(i) else ??? for i in df["ad_requests"]]
And this is where I get stuck. After the else, I want to "go back" and access the previous "row", though I know that this is not how pandas is meant to be used.
Another thing to note is that the rows will always be grouped in threes, by the ad_tag_name column. If I call df.groupby("ad_tag_name"), I can then turn the result into a list and start slicing and indexing, but again, I think there must be a better way to do this in pandas (as there is for many things).
Python: 2.7.10
pandas: 0.18.0
Recommended answer
You'll want to do something like this:
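pd.options.mode.chained_assignment = None  # suppresses "SettingWithCopyWarning"
# Assumes the default 0..n-1 index and a non-null first row.
for index, elem in enumerate(df['ad_requests']):
    if pd.isnull(elem):
        # previous row's ad_requests minus previous row's impressions
        df['ad_requests'][index] = df['ad_requests'][index-1] - df['impressions'][index-1]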
The warning appears because the chained indexing df['ad_requests'][index] = ... assigns through an intermediate object that may be either a view or a copy of the dataframe. With the versions noted below it is a view, so the original dataframe is updated. That is what we wish to do, however, so the warning doesn't really concern us.
(Python 2.7.12 and pandas 0.19.0)
Changing the last line of code from
df['ad_requests'][index] = df['ad_requests'][index-1] - df['impressions'][index-1]
to
df.at[index, 'ad_requests'] = df.at[index-1, 'ad_requests'] - df.at[index-1, 'impressions']
removes the need to suppress any warnings:
for index, elem in enumerate(df['ad_requests']):
    if pd.isnull(elem):
        df.at[index, 'ad_requests'] = df.at[index-1, 'ad_requests'] - df.at[index-1, 'impressions']
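The loop above can be tried end to end with a small self-contained sketch. The sample values are made up, since the original CSV is not shown, and the function names fill_ad_requests and fill_vectorized are hypothetical. The second function is not part of the answer. It is one possible loop-free variant, assuming each non-null ad_requests value starts a new run of rows, and is included only as a cross-check.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the original CSV is not shown, so these values are invented.
df = pd.DataFrame({
    "ad_tag_name": ["a", "a", "a", "b", "b", "b"],
    "ad_requests": [100.0, np.nan, np.nan, 50.0, np.nan, np.nan],
    "impressions": [30, 20, 10, 5, 15, 10],
})


def fill_ad_requests(frame):
    """Row-by-row fill using .at, as in the answer's second loop."""
    # reset_index gives labels 0..n-1, so index - 1 is always the previous row.
    out = frame.reset_index(drop=True)
    # Start at 1: the first row has no previous row to read from.
    for index in range(1, len(out)):
        if pd.isnull(out.at[index, "ad_requests"]):
            out.at[index, "ad_requests"] = (
                out.at[index - 1, "ad_requests"] - out.at[index - 1, "impressions"]
            )
    return out


def fill_vectorized(frame):
    """Loop-free variant (an assumption, not from the answer)."""
    out = frame.reset_index(drop=True)
    # Every non-null ad_requests value opens a new run of rows.
    run = out["ad_requests"].notnull().cumsum().rename("run")
    start = out["ad_requests"].ffill()
    # Impressions already consumed by earlier rows of the same run.
    used = out.groupby(run)["impressions"].cumsum() - out["impressions"]
    out["ad_requests"] = start - used
    return out


result = fill_ad_requests(df)
print(result)
print(fill_vectorized(df)["ad_requests"].equals(result["ad_requests"]))
```

Both functions work on a copy returned by reset_index, so df itself keeps its missing values. Filled values feed into later rows, so two consecutive missing rows are handled in order.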