Pandas variable creation using multiple if-else

Views: 1637 | Published: 2018/7/17 9:14:40 | Tags: python, if-statement, pandas, where


Question

I need help with multiple if-else statements in Pandas. I have a test dataset (Titanic) as follows:

ID  Survived  Pclass  Name                    Sex     Age
1   0         3       Braund                  male    22
2   1         1       Cumings, Mrs.           female  38
3   1         3       Heikkinen, Miss. Laina  female  26
4   1         1       Futrelle, Mrs.          female  35
5   0         3       Allen, Mr.              male    35
6   0         3       Moran, Mr.              male
7   0         1       McCarthy, Mr.           male    54
8   0         3       Palsson, Master         male    2

where ID is the passenger ID. I want to create a new flag variable in this data frame that follows this rule:

if Sex=="female" or (Pclass==1 and Age <18) then 1 else 0.

I tried a few approaches. This was my first attempt:

import pandas as pd

df = pd.read_csv('data.csv')
for passenger_index, passenger in df.iterrows():
    if passenger['Sex'] == "female" or (passenger['Pclass'] == 1 and passenger['Age'] < 18):
        df['Prediction'] = 1
    else:
        df['Prediction'] = 0

The problem with the above code is that it creates a Prediction column in df, but every value is 0.

However, if I use the same logic but write the result to a dictionary instead, it gives the right answer:

import pandas as pd

prediction = {}
df = pd.read_csv('data.csv')
for passenger_index, passenger in df.iterrows():
    if passenger['Sex'] == "female" or (passenger['Pclass'] == 1 and passenger['Age'] < 18):
        prediction[passenger['ID']] = 1
    else:
        prediction[passenger['ID']] = 0

This gives a dict, prediction, with the IDs as keys and 1 or 0 as values, based on the above logic.

So why does the df version give the wrong result? I also tried defining a function first and then calling it. It gave the same result as the first attempt.

So how can we do this in pandas?

Secondly, I guess the same can be done with nested if-else style expressions. I know about np.where, but it does not let me add an 'and' condition. Here is what I tried:

df['Prediction'] = np.where(df['Sex'] == "female", 1, np.where((df['Pclass'] == 1 and df['Age'] < 18), 1, 0))

The above gave an error for the 'and' keyword inside np.where.

Can someone help? Solutions using several approaches would be really appreciated: np.where (simple if-else style), some function (applymap etc.), or modifications to what I wrote earlier.

Also, how do we do the same using applymap or the apply/map methods of df?

Recommended answer

Instead of looping through the rows with df.iterrows (which is relatively slow), you can assign the desired values to the Prediction column in one assignment:

In [27]: df['Prediction'] = ((df['Sex']=='female') | ((df['Pclass']==1) & (df['Age']<18))).astype('int')

In [29]: df['Prediction']
Out[29]: 
0    0
1    1
2    1
3    1
4    0
5    0
6    0
7    0
Name: Prediction, dtype: int32


For your first approach, remember that df['Prediction'] represents an entire column of df, so df['Prediction']=1 assigns the value 1 to every row in that column. Since df['Prediction']=0 was the last assignment executed, the entire column ended up filled with zeros.
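As a rough sketch of how the original loop could be modified rather than replaced: write to a single cell with df.at instead of assigning to the whole column. The small frame below is a hand-built stand-in for data.csv (an assumption, mirroring the sample rows in the question), and creating the column before the loop is an illustrative choice.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for data.csv, mirroring the sample rows in the question.
df = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 6, 7, 8],
    'Pclass': [3, 1, 3, 1, 3, 3, 1, 3],
    'Sex': ['male', 'female', 'female', 'female', 'male', 'male', 'male', 'male'],
    'Age': [22, 38, 26, 35, 35, np.nan, 54, 2],
})

df['Prediction'] = 0  # create the column once, before the loop
for idx, passenger in df.iterrows():
    if passenger['Sex'] == 'female' or (passenger['Pclass'] == 1 and passenger['Age'] < 18):
        df.at[idx, 'Prediction'] = 1  # sets one cell, not the whole column

print(df['Prediction'].tolist())
```

This keeps the row-by-row logic of the question intact, but it is still a Python-level loop, so the vectorized assignment is likely the better choice for larger frames.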

For your second approach, note that you need to use &, not and, to perform an elementwise logical-and operation on two NumPy arrays or Pandas NDFrames. Thus, you could use

In [32]: np.where(df['Sex']=='female', 1, np.where((df['Pclass']==1)&(df['Age']<18), 1, 0))
Out[32]: array([0, 1, 1, 1, 0, 0, 0, 0])
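To illustrate why the and keyword fails here, the sketch below uses a tiny made-up frame (the three rows are an assumption for illustration only). Python's and asks each Series for a single truth value, which pandas refuses to give, while & combines the two boolean Series element by element.

```python
import numpy as np
import pandas as pd

# Illustrative data only; not taken from the question's dataset.
df = pd.DataFrame({'Pclass': [1, 3, 1], 'Age': [10.0, 2.0, np.nan]})

try:
    # `and` calls bool() on the left-hand Series, which pandas rejects.
    mask = (df['Pclass'] == 1) and (df['Age'] < 18)
except ValueError as exc:
    error_message = str(exc)
    print(error_message)

# `&` works elementwise; the parentheses matter because & binds tighter than ==.
mask = (df['Pclass'] == 1) & (df['Age'] < 18)
print(mask.tolist())
```

Note that a missing Age compares as False, so rows with no age recorded never satisfy the Age < 18 condition.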

though I think it is much simpler to just use | for logical-or and & for logical-and:

In [34]: ((df['Sex']=='female') | ((df['Pclass']==1) & (df['Age']<18)))
Out[34]: 
0    False
1     True
2     True
3     True
4    False
5    False
6    False
7    False
dtype: bool
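The question also asked about apply. One possible sketch, assuming the same hand-built stand-in for data.csv and a helper named predict (a hypothetical name, not from the original answer), passes each row to a plain Python function with df.apply(..., axis=1):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for data.csv, mirroring the sample rows in the question.
df = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 6, 7, 8],
    'Pclass': [3, 1, 3, 1, 3, 3, 1, 3],
    'Sex': ['male', 'female', 'female', 'female', 'male', 'male', 'male', 'male'],
    'Age': [22, 38, 26, 35, 35, np.nan, 54, 2],
})

def predict(passenger):
    """Return 1 for females and for first-class passengers under 18, else 0."""
    if passenger['Sex'] == 'female' or (passenger['Pclass'] == 1 and passenger['Age'] < 18):
        return 1
    return 0

# axis=1 hands each row to predict as a Series, so plain `or`/`and` work on scalars.
df['Prediction'] = df.apply(predict, axis=1)
print(df['Prediction'].tolist())
```

Inside predict the values are scalars, which is why the ordinary or and and keywords are fine there. Row-wise apply is generally slower than the boolean-mask expression above, so it is mainly useful when the rule is too awkward to vectorize.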
