python: pandas np.where vs. df.loc with multiple conditions
Question
np.where has been giving me a lot of errors, so I am looking for a solution with df.loc instead.
This is the np.where error I have been getting:
C:\Users\xxx\AppData\Local\Continuum\Anaconda2\lib\site-packages\ipykernel\__main__.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
if __name__ == '__main__':
I am working with the following dataframe df:
df = pd.DataFrame({'Column_A': ['AAA', 'AAA', 'ABC', 'CDE'],
                   'checked': ['0', '0', '1', '0'],
                   'duplicate': ['True', 'True', 'False', 'False']})
  Column_A checked duplicate
0      AAA       0      True
1      AAA       0      True
2      ABC       1     False
3      CDE       0     False
I want to create an additional flag if checked is 0 and duplicate is True.
I tried this and it didn't work:
df['flag'] = (np.where((df['checked'] == 'Y') &(df['duplicate'] == 'True'), 'Y', '0'))
TypeError: invalid type comparison
I tried it with df.loc:
df['flag'] = (df.loc[df['checked'] == 'Y']& df.loc[df['duplicate'] == 'True'], 'Y','0')
TypeError: invalid type comparison
I get the same error!
Recommended answer
I think the values in duplicate should be booleans rather than strings, so remove the quotes (') around True and False:
import numpy as np
import pandas as pd

df = pd.DataFrame({'Column_A': ['AAA', 'AAA', 'ABC', 'CDE'],
                   'checked': ['0', '0', '1', '0'],
                   'duplicate': [True, True, False, False]})
# 'checked' never equals 'Y' in this data, so every row gets '0'
df['flag'] = np.where((df['checked'] == 'Y') & (df['duplicate'] == True), 'Y', '0')
print(df)
  Column_A checked  duplicate flag
0      AAA       0       True    0
1      AAA       0       True    0
2      ABC       1      False    0
3      CDE       0      False    0
Or, when comparing against a boolean column, == True can be omitted:
df['flag'] = np.where((df['checked'] == 'Y') & df['duplicate'], 'Y', '0')
print(df)
  Column_A checked  duplicate flag
0      AAA       0       True    0
1      AAA       0       True    0
2      ABC       1      False    0
3      CDE       0      False    0
Also, the checked column holds strings, so the value you compare it against needs quotes ('):
df['flag'] = np.where((df['checked'] == '0') & (df['duplicate'] == True), 'Y', '0')
print(df)
  Column_A checked  duplicate flag
0      AAA       0       True    Y
1      AAA       0       True    Y
2      ABC       1      False    0
3      CDE       0      False    0
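If the duplicate column already holds the strings 'True' and 'False', as in the question's DataFrame, the quotes cannot simply be deleted from the source. One possible approach, sketched here under the assumption that those are the only two values in the column, is to convert it with Series.map before building the condition:

```python
import numpy as np
import pandas as pd

# Same data as in the question: 'duplicate' holds strings, not booleans
df = pd.DataFrame({'Column_A': ['AAA', 'AAA', 'ABC', 'CDE'],
                   'checked': ['0', '0', '1', '0'],
                   'duplicate': ['True', 'True', 'False', 'False']})

# Translate the two string values into real booleans.
# Assumption: no other values occur; anything else would map to NaN.
df['duplicate'] = df['duplicate'].map({'True': True, 'False': False})

# 'checked' is still a string column, so it is compared against '0'
df['flag'] = np.where((df['checked'] == '0') & df['duplicate'], 'Y', '0')
print(df)
```

After the conversion, the column can be used directly as a boolean mask, exactly as in the examples above.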
Solution with loc:
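# Start with the default value for every row
df['flag'] = '0'
# Rows where checked is '0' and duplicate is True
mask = (df['checked'] == '0') & df['duplicate']
# Overwrite the flag only for the matching rows
df.loc[mask, 'flag'] = 'Y'
print(df)
  Column_A checked  duplicate flag
0      AAA       0       True    Y
1      AAA       0       True    Y
2      ABC       1      False    0
3      CDE       0      False    0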
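The SettingWithCopyWarning quoted in the question is usually raised when df was itself produced by slicing another DataFrame. The question does not show that step, so the following is only a sketch under that assumption; the parent frame raw and its extra 'XYZ' row are hypothetical. Taking an explicit copy of the slice before adding the column is one common way to avoid the warning:

```python
import numpy as np
import pandas as pd

# Hypothetical parent frame; the question does not show how df was created
raw = pd.DataFrame({'Column_A': ['AAA', 'AAA', 'ABC', 'CDE', 'XYZ'],
                    'checked': ['0', '0', '1', '0', '1'],
                    'duplicate': [True, True, False, False, True]})

# Take an explicit copy of the filtered rows, so that later assignments
# modify an independent DataFrame rather than a slice of raw
df = raw[raw['Column_A'] != 'XYZ'].copy()

df['flag'] = np.where((df['checked'] == '0') & df['duplicate'], 'Y', '0')
print(df)
```

The new column is added only to the copy; raw is left unchanged.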