Pandas drop_duplicates method not working


Question

I am trying to use the drop_duplicates method on my DataFrame, but I am getting an error. See the following:

error: TypeError: unhashable type: 'list'

The code I am using:

df = db.drop_duplicates()

My DB is huge and contains strings, floats, dates, NaNs, booleans, integers... Any help is appreciated.

Recommended answer

drop_duplicates won't work when your DataFrame contains lists, as the error message implies: finding duplicates requires hashing the values, and lists are not hashable. However, you can drop duplicates on a copy of the DataFrame cast to str, and then select the rows from the original df using the index of the result.

Setup

import pandas as pd

df = pd.DataFrame({'Keyword': {0: 'apply', 1: 'apply', 2: 'apply', 3: 'terms', 4: 'terms'},
 'X': {0: [1, 2], 1: [1, 2], 2: 'xy', 3: 'xx', 4: 'yy'},
 'Y': {0: 'yy', 1: 'yy', 2: 'yx', 3: 'ix', 4: 'xi'}})

# Calling drop_duplicates directly raises the same error
df.drop_duplicates()
Traceback (most recent call last):
...
TypeError: unhashable type: 'list'
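
On a large DataFrame it may not be obvious which columns hold the lists. The sketch below is one way to find them, assuming the offending values are plain Python lists. The helper name columns_with_lists is hypothetical and is not part of pandas.

```python
import pandas as pd


def columns_with_lists(frame):
    """Return the names of columns that hold at least one list value.

    Hypothetical helper for illustration; it is not a pandas function.
    """
    return [
        col
        for col in frame.columns
        if frame[col].map(lambda value: isinstance(value, list)).any()
    ]


df = pd.DataFrame({
    "Keyword": ["apply", "apply", "apply", "terms", "terms"],
    "X": [[1, 2], [1, 2], "xy", "xx", "yy"],
    "Y": ["yy", "yy", "yx", "ix", "xi"],
})

# Only column X contains lists in this sample frame.
print(columns_with_lists(df))
```

This scans every value once, so on a very large frame it could be slow. Limiting the scan to object-dtype columns would be a reasonable refinement.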

Solution

# Convert the df to str, drop duplicates, and then select the matching rows from the original df.

df.loc[df.astype(str).drop_duplicates().index]
Out[205]: 
  Keyword       X   Y
0   apply  [1, 2]  yy
2   apply      xy  yx
3   terms      xx  ix
4   terms      yy  xi

# The list elements are still lists in the final result.
df.loc[df.astype(str).drop_duplicates().index].loc[0,'X']
Out[207]: [1, 2]
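
One caveat of comparing string representations is that values of different types can collide. For example, the integer 1 and the string '1' both become '1'. If that matters for your data, a possible alternative is to convert only the lists to tuples, which are hashable, and deduplicate on that copy. This is a sketch rather than part of the original answer. It assumes the lists are flat (no nested lists) and live in column X, and the helper lists_to_tuples is a hypothetical name.

```python
import pandas as pd

df = pd.DataFrame({
    "Keyword": ["apply", "apply", "apply", "terms", "terms"],
    "X": [[1, 2], [1, 2], "xy", "xx", "yy"],
    "Y": ["yy", "yy", "yx", "ix", "xi"],
})


def lists_to_tuples(value):
    """Turn a flat list into a tuple; leave every other value unchanged."""
    return tuple(value) if isinstance(value, list) else value


# Work on a copy so the original df keeps its list values.
hashable = df.copy()
hashable["X"] = hashable["X"].map(lists_to_tuples)

# Deduplicate the hashable copy, then pick the surviving rows from df by label.
deduped = df.loc[hashable.drop_duplicates().index]
print(deduped)
```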

Edit: replaced iloc with loc. In this particular case both work because the index labels match the positional index, but that does not hold in general.
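
To illustrate the difference, the sketch below rebuilds the sample frame with a non-default index (the labels 10 to 50 are an assumption made for the demonstration). It also shows a boolean-mask variant based on DataFrame.duplicated, which avoids the index lookup altogether.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Keyword": ["apply", "apply", "apply", "terms", "terms"],
        "X": [[1, 2], [1, 2], "xy", "xx", "yy"],
        "Y": ["yy", "yy", "yx", "ix", "xi"],
    },
    index=[10, 20, 30, 40, 50],  # labels no longer equal positions
)

# Labels of the rows that survive deduplication on the str copy.
kept_labels = df.astype(str).drop_duplicates().index

# loc selects by label, so it works for any unique index.
by_label = df.loc[kept_labels]

# iloc selects by position; positions 10..50 do not exist in a 5-row frame,
# so df.iloc[kept_labels] would raise IndexError here.

# Alternative: a boolean mask of non-duplicated rows needs no index lookup.
by_mask = df[~df.astype(str).duplicated()]

print(by_label)
print(by_mask)
```

The mask form may also be the safer choice when the index contains repeated labels, because loc would then return every row sharing a kept label.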
