Pandas: Conditionally replace values based on another column's values
Question
I have a dataframe (df) that looks like this:
                    environment    event
time
2017-04-28 13:08:22         NaN   add_rd
2017-04-28 08:58:40         NaN   add_rd
2017-05-03 07:59:35        test  add_env
2017-05-03 08:05:14        prod  add_env
...
My goal is that, for each add_rd in the event column, the associated NaN value in the environment column is replaced with the string RD:
                    environment    event
time
2017-04-28 13:08:22          RD   add_rd
2017-04-28 08:58:40          RD   add_rd
2017-05-03 07:59:35        test  add_env
2017-05-03 08:05:14        prod  add_env
...
What I have tried so far
I stumbled across df['environment'] = df['environment'].fillna('RD'), which replaces every NaN (not what I am looking for); pd.isnull(df['environment']), which detects missing values; and np.where(df['environment'], x, y), which seems to be what I want but isn't working. I also tried this:
import pandas as pd
for env in df['environment']:
    if pd.isnull(env) and df['event'] == 'add_rd':
        env = 'RD'
This lacks an index, or some kind of iterator, to access the corresponding value in the event column.
I also tried this:
df['environment'] = np.where(pd.isnull(df['environment']), df['environment'] = 'RD', df['environment'])
SyntaxError: keyword can't be an expression
This obviously did not work.
I looked at several related questions (Black's question, Simon's question, szli's question, Jan Willems Tulp's question) but couldn't build on the suggestions in the answers.
So, how do I replace a value in one column based on another column's values?
Recommended answer
Now my goal is for each add_rd in the event column, the associated NaN-value in the environment column should be replaced with a string RD.
As per @Zero's comment, use pd.DataFrame.loc with Boolean indexing:
df.loc[df['event'].eq('add_rd') & df['environment'].isnull(), 'environment'] = 'RD'
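For anyone who wants to try this end to end, here is a minimal sketch. It rebuilds the four rows shown in the question and adds one hypothetical fifth row (a NaN environment with event add_env, not part of the original data) to show that NaN values outside the add_rd rows are left alone. The names mask, result and alt are only for illustration. The second half shows one way the np.where attempt from the question could be written, with the combined condition as the first argument and plain values as the other two; treat it as an alternative sketch, not as part of the original answer.

```python
import numpy as np
import pandas as pd

# Rows 1-4 come from the question; row 5 is a hypothetical addition
# to show that a NaN in a non-add_rd row is not replaced.
df = pd.DataFrame(
    {
        "environment": [np.nan, np.nan, "test", "prod", np.nan],
        "event": ["add_rd", "add_rd", "add_env", "add_env", "add_env"],
    },
    index=pd.to_datetime(
        [
            "2017-04-28 13:08:22",
            "2017-04-28 08:58:40",
            "2017-05-03 07:59:35",
            "2017-05-03 08:05:14",
            "2017-05-03 09:00:00",  # hypothetical timestamp
        ]
    ),
)
df.index.name = "time"

# True only where event is add_rd AND environment is missing.
mask = df["event"].eq("add_rd") & df["environment"].isnull()

# Approach from the answer: assign through .loc with the Boolean mask.
result = df.copy()
result.loc[mask, "environment"] = "RD"

# Alternative sketch: np.where(condition, value_if_true, value_if_false).
alt = df.copy()
alt["environment"] = np.where(mask, "RD", alt["environment"])

print(result)
```

Both approaches use the same Boolean mask. The .loc assignment modifies only the selected cells in place, while np.where builds a whole new column.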