Pandas/Python: Set value of one column based on value in another column
Question
I need to set the value of one column based on the value of another column in a Pandas dataframe. This is the logic:
if df['c1'] == 'Value':
    df['c2'] = 10
else:
    df['c2'] = df['c3']
I can't get this to do what I want, which is simply to create a column with the new values (or change the values of an existing column: either one works for me).
If I run the code above, or write it as a function and use the apply method, I get the following:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Recommended answer
One way to do this is to use indexing with .loc.
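As background, here is a minimal sketch of why the `if` statement in the question fails. The three-row frame is made up for illustration. The comparison produces one boolean per row, and `if` cannot reduce a multi-row Series to a single True/False:

```python
import pandas as pd

# Hypothetical three-row frame, for illustration only
df = pd.DataFrame({'c1': ['a', 'Value', 'b']})

# The comparison is evaluated row by row and returns a boolean Series
mask = df['c1'] == 'Value'
print(mask.tolist())

# `if` needs a single True/False, so a multi-row Series raises ValueError
try:
    if mask:
        pass
    raised = False
except ValueError:
    raised = True

print(raised)
```

That same boolean Series is what gets passed to .loc as a row selector in the approach below.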
Example
In the absence of an example dataframe, I'll make one up here:
import numpy as np
import pandas as pd
df = pd.DataFrame({'c1': list('abcdefg')})
df.loc[5, 'c1'] = 'Value'
>>> df
      c1
0      a
1      b
2      c
3      d
4      e
5  Value
6      g
Suppose you want to create a new column c2 that is equal to c1, except where c1 is 'Value', in which case c2 should be 10.
First, create the new column c2 and set it equal to c1, using one of the following two lines (they essentially do the same thing):
df = df.assign(c2 = df['c1'])
# OR:
df['c2'] = df['c1']
Then use .loc to select all the rows where c1 is equal to 'Value', and assign your desired value to c2 in those rows:
df.loc[df['c1'] == 'Value', 'c2'] = 10
And you end up with this:
>>> df
      c1  c2
0      a   a
1      b   b
2      c   c
3      d   d
4      e   e
5  Value  10
6      g   g
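The same two steps also cover the exact logic in the question, where the fallback value comes from a third column c3. This is only a sketch: the question gives no sample data, so the c3 values here are invented.

```python
import pandas as pd

# Hypothetical data: c3 is made up here, since the question gives no sample
df = pd.DataFrame({'c1': ['a', 'Value', 'b'], 'c3': [1, 2, 3]})

# Start from the "else" branch: c2 takes its value from c3
df['c2'] = df['c3']

# Then overwrite only the rows where c1 == 'Value' (the "if" branch)
df.loc[df['c1'] == 'Value', 'c2'] = 10

print(df)
```

Writing the default first and then overwriting the matching rows keeps both steps vectorized, so no row-by-row `if` is needed.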
If, as you suggested in your question, you would sometimes rather replace the values in the column you already have than create a new one, skip the column creation and do the following:
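# Index the dataframe itself with .loc. The chained form
# df['c1'].loc[df['c1'] == 'Value'] = 10 can trigger chained-assignment
# warnings and may not modify df when copy-on-write is enabled.
df.loc[df['c1'] == 'Value', 'c1'] = 10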
This gives you:
>>> df
      c1
0      a
1      b
2      c
3      d
4      e
5     10
6      g
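As an alternative to .loc (a different technique from the one used above), numpy.where can express the question's if/else in a single step. This is a sketch under the same assumption as before, namely that a column c3 exists; its values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: c3 is made up for illustration
df = pd.DataFrame({'c1': ['a', 'Value', 'b'], 'c3': [1, 2, 3]})

# Where the condition holds take 10, otherwise take the value from c3
df['c2'] = np.where(df['c1'] == 'Value', 10, df['c3'])

print(df)
```

np.where returns a plain NumPy array, which pandas then stores as the new column; the .loc version may be easier to extend when only some rows need changing.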