Pandas drop_duplicates method not working
Question
I am trying to use the drop_duplicates method on my DataFrame, but I get an error:
TypeError: unhashable type: 'list'
The code I am using:
df = db.drop_duplicates()
My DataFrame (db) is huge and contains strings, floats, dates, NaNs, booleans, integers... Any help is appreciated.
Recommended answer
drop_duplicates does not work when the DataFrame contains lists, as the error message implies: it needs to hash each value, and lists are not hashable. However, you can drop duplicates on a copy of the DataFrame cast to str, then use the index of that result to select the matching rows from the original df.
Setup
import pandas as pd

df = pd.DataFrame({'Keyword': {0: 'apply', 1: 'apply', 2: 'apply', 3: 'terms', 4: 'terms'},
                   'X': {0: [1, 2], 1: [1, 2], 2: 'xy', 3: 'xx', 4: 'yy'},
                   'Y': {0: 'yy', 1: 'yy', 2: 'yx', 3: 'ix', 4: 'xi'}})
# Calling drop_duplicates directly raises the same error
df.drop_duplicates()
Traceback (most recent call last):
...
TypeError: unhashable type: 'list'
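Because the DataFrame in the question is large, it may help to first find out which columns actually hold lists. The following is a minimal sketch; `unhashable_columns` is a hypothetical helper name, not a pandas function, and it assumes the offending values are lists, dicts, or sets.

```python
import pandas as pd

def unhashable_columns(frame: pd.DataFrame) -> list:
    """Return the names of columns that contain at least one list, dict, or set.

    Hypothetical helper for illustration; not part of pandas.
    """
    found = []
    for name, column in frame.items():
        # map() checks every cell; NaN and scalars simply yield False.
        if column.map(lambda value: isinstance(value, (list, dict, set))).any():
            found.append(name)
    return found

# Same data as the setup above.
df = pd.DataFrame({'Keyword': {0: 'apply', 1: 'apply', 2: 'apply', 3: 'terms', 4: 'terms'},
                   'X': {0: [1, 2], 1: [1, 2], 2: 'xy', 3: 'xx', 4: 'yy'},
                   'Y': {0: 'yy', 1: 'yy', 2: 'yx', 3: 'ix', 4: 'xi'}})

columns_with_lists = unhashable_columns(df)
print(columns_with_lists)
```

On the setup data only column X contains lists, so it is the column that triggers the TypeError.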
Solution
# Convert the df to str type, drop duplicates, then select the surviving rows from the original df.
df.loc[df.astype(str).drop_duplicates().index]
Out[205]:
  Keyword       X   Y
0   apply  [1, 2]  yy
2   apply      xy  yx
3   terms      xx  ix
4   terms      yy  xi
# The list elements are still lists in the final result.
df.loc[df.astype(str).drop_duplicates().index].loc[0,'X']
Out[207]: [1, 2]
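A closely related variant, shown here only as a sketch, builds a boolean mask with duplicated() on the string view instead of selecting by index labels. This is a swapped-in technique rather than the answer's exact code; `drop_duplicates_unhashable` is a hypothetical name. Note that both approaches compare cells by their string form, so values such as 1 and '1' are treated as equal.

```python
import pandas as pd

def drop_duplicates_unhashable(frame: pd.DataFrame) -> pd.DataFrame:
    """Drop duplicate rows by comparing the string form of every cell.

    Hypothetical helper for illustration; not part of pandas.
    """
    # duplicated() marks each repeat of an earlier row as True.
    is_repeat = frame.astype(str).duplicated()
    # A plain NumPy mask selects rows by position, so repeated or
    # non-default index labels cannot pull in extra rows.
    return frame[~is_repeat.to_numpy()]

# Same data as the setup above.
df = pd.DataFrame({'Keyword': {0: 'apply', 1: 'apply', 2: 'apply', 3: 'terms', 4: 'terms'},
                   'X': {0: [1, 2], 1: [1, 2], 2: 'xy', 3: 'xx', 4: 'yy'},
                   'Y': {0: 'yy', 1: 'yy', 2: 'yx', 3: 'ix', 4: 'xi'}})

result = drop_duplicates_unhashable(df)
print(result)
```

The rows kept are the same as with the loc-based expression, and the X column still holds the original list objects rather than strings.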
Edit: replaced iloc with loc. In this particular example both work because the index labels happen to match the row positions, but that does not hold in general, so loc is the correct choice.
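To see why loc is the safer choice, here is a small sketch using the same data with an assumed non-default index (10, 20, ...), chosen purely for illustration so that labels and positions no longer coincide.

```python
import pandas as pd

# Same data as the setup, but with hypothetical index labels 10..50.
df = pd.DataFrame(
    {
        'Keyword': ['apply', 'apply', 'apply', 'terms', 'terms'],
        'X': [[1, 2], [1, 2], 'xy', 'xx', 'yy'],
        'Y': ['yy', 'yy', 'yx', 'ix', 'xi'],
    },
    index=[10, 20, 30, 40, 50],
)

# Index labels of the rows that survive de-duplication on the string view.
kept_labels = df.astype(str).drop_duplicates().index

# loc selects by label, so it returns the intended rows.
deduped = df.loc[kept_labels]
print(deduped)

# iloc would interpret the labels as row positions, which are out of bounds here.
iloc_failed = False
try:
    df.iloc[list(kept_labels)]
except IndexError as exc:
    iloc_failed = True
    print('iloc failed:', exc)
```

With a default RangeIndex the two calls return the same rows, which is why the difference is easy to miss.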