Pandas conditional creation of a dataframe column: based on multiple conditions
Question
I have a df:
   col1  col2  col3
0     1     2     3
1     2     3     1
2     3     3     3
3     4     3     2
I want to add a new column based on the following conditions:
- if col1 > col2 > col3 -----> 2
- elif col1 > col2 -----> 1
- elif col1 < col2 < col3 -----> -2
- elif col1 < col2 -----> -1
- else -----> 0
The result should look like this:
   col1  col2  col3  new
0     1     2     3   -2
1     2     3     1   -1
2     3     3     3    0
3     4     3     2    2
I followed the method from this post by unutbu. With a single greater-than or less-than comparison per condition it works fine, but in my case, with more than one comparison chained together, building the conditions raises an error:
conditions = [
(df['col1'] > df['col2'] > df['col3']),
(df['col1'] > df['col2']),
(df['col1'] < df['col2'] < df['col3']),
(df['col1'] < df['col2'])]
choices = [2,1,-2,-1]
df['new'] = np.select(conditions, choices, default=0)
Traceback (most recent call last):
File "<ipython-input-43-768a4c0ecf9f>", line 2, in <module>
(df['col1'] > df['col2'] > df['col3']),
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py", line 1478, in __nonzero__
.format(self.__class__.__name__))
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
What should I do?
Recommended answer
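Python evaluates a chained comparison such as df['col1'] > df['col2'] > df['col3'] as (df['col1'] > df['col2']) and (df['col2'] > df['col3']). The `and` needs a single truth value from the first Series, which is what raises the ValueError. Compare each pair separately and combine the results with the element-wise & operator, so change the code to: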
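import numpy as np

# Combine the element-wise comparisons with & (each one wrapped in parentheses)
conditions = [
    (df['col1'] > df['col2']) & (df['col2'] > df['col3']),
    (df['col1'] > df['col2']),
    (df['col1'] < df['col2']) & (df['col2'] < df['col3']),
    (df['col1'] < df['col2'])]
choices = [2, 1, -2, -1]
# np.select uses the first matching condition per row; unmatched rows get the default
df['new'] = np.select(conditions, choices, default=0)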
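For anyone who wants to try this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question (the DataFrame construction and the `chained_failed` flag are illustrative additions, not part of the original answer), shows that the chained comparison fails, and then applies the corrected conditions:

```python
import numpy as np
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({
    "col1": [1, 2, 3, 4],
    "col2": [2, 3, 3, 3],
    "col3": [3, 1, 3, 2],
})

# A chained comparison is evaluated as `(a > b) and (b > c)`.
# `and` needs one truth value, so a Series raises ValueError.
# `chained_failed` is a hypothetical flag used only for this demo.
try:
    df["col1"] > df["col2"] > df["col3"]
    chained_failed = False
except ValueError:
    chained_failed = True

# Element-wise version: compare each pair, then combine with &
conditions = [
    (df["col1"] > df["col2"]) & (df["col2"] > df["col3"]),
    df["col1"] > df["col2"],
    (df["col1"] < df["col2"]) & (df["col2"] < df["col3"]),
    df["col1"] < df["col2"],
]
choices = [2, 1, -2, -1]

# For each row, np.select takes the choice of the first True condition
df["new"] = np.select(conditions, choices, default=0)
print(df)
```

Because np.select takes the first matching condition for each row, the stricter two-step conditions have to come before the looser single comparisons, which is the same ordering as the if/elif list in the question.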