Remove entries with NaN values in a Python dictionary

Question

I have the following dictionary in Python:

OrderedDict([(30, ('A1', 55.0)), (31, ('A2', 125.0)), (32, ('A3', 180.0)), (43, ('A4', nan))])

Is there a way to remove the entries where any of the values is NaN? I tried this:

{k: dict_cg[k] for k in dict_cg.values() if not np.isnan(k)}

It would be great if the solution works for both Python 2 and Python 3.

Recommended answer

Since you have pandas, you can use pandas' pd.Series.notnull method here (notna is an alias for it), which works with mixed dtypes.

>>> import pandas as pd
>>> {k: v for k, v in dict_cg.items() if pd.Series(v).notna().all()}
{30: ('A1', 55.0), 31: ('A2', 125.0), 32: ('A3', 180.0)}
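If pandas is not available, a plain-Python check is also possible. The sketch below is an alternative to the answer above, not part of it: the helper names has_nan and drop_nan_entries are hypothetical, and it assumes each value is a tuple whose NaN entries, if any, are Python floats (or a subclass such as numpy.float64). It relies only on math.isnan and collections.OrderedDict, so it should behave the same on Python 2 and Python 3, and it keeps the original key order.

```python
import math
from collections import OrderedDict


def has_nan(values):
    """Return True if any float in `values` is NaN (non-floats are skipped)."""
    return any(isinstance(v, float) and math.isnan(v) for v in values)


def drop_nan_entries(d):
    """Return a new OrderedDict without the entries whose tuple contains NaN."""
    return OrderedDict((k, v) for k, v in d.items() if not has_nan(v))


nan = float("nan")
dict_cg = OrderedDict([
    (30, ("A1", 55.0)),
    (31, ("A2", 125.0)),
    (32, ("A3", 180.0)),
    (43, ("A4", nan)),
])

cleaned = drop_nan_entries(dict_cg)
print(list(cleaned.items()))
# [(30, ('A1', 55.0)), (31, ('A2', 125.0)), (32, ('A3', 180.0))]
```

The isinstance guard is there because math.isnan raises a TypeError when given a string such as 'A1'.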


This is not part of the answer, but it may help you understand how I arrived at the solution. I came across some odd behaviour when trying to solve this with pd.notnull directly.

Take dict_cg[43]:

>>> dict_cg[43]
('A4', nan)

pd.notnull does not work on it:

>>> pd.notnull(dict_cg[43])
True

It treats the tuple as a single value (rather than an iterable of values). Furthermore, converting the tuple to a list and then testing also gives an incorrect answer.

>>> pd.notnull(list(dict_cg[43]))
array([ True,  True])

Since the second value is nan, the result I'm looking for is [True, False]. It finally works when you convert to a Series first:

>>> pd.Series(dict_cg[43]).notnull() 
0     True
1    False
dtype: bool

So, the solution is to convert each value to a Series and then test its elements.

Along similar lines, another (admittedly roundabout) solution is to convert to an object-dtype NumPy array first, after which pd.notnull works directly:

>>> import numpy as np
>>> pd.notnull(np.array(dict_cg[43], dtype=object))
array([ True, False])

I imagine that pd.notnull converts the list form of dict_cg[43] to a string array under the covers, turning the NaN into the string "nan", so it is no longer a "null" value.
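That guess can be partly checked with NumPy alone. The sketch below only shows what np.asarray does to a mixed list of a string and a float; whether pd.notnull performs exactly this conversion internally is an assumption and may depend on the pandas version, so treat it as an illustration of the hypothesis rather than a description of pandas internals.

```python
import numpy as np

row = ["A4", float("nan")]

# Without an explicit dtype, NumPy picks a common string dtype,
# so the float NaN is stored as the text 'nan'.
as_strings = np.asarray(row)
print(as_strings.dtype)
print(str(as_strings[1]) == "nan")

# With dtype=object the original Python objects are kept,
# so the second element is still a float NaN.
as_objects = np.asarray(row, dtype=object)
print(type(as_objects[1]))
```

A string array has no missing values to detect, whereas the object array still holds a real NaN, which is consistent with the dtype=object workaround shown earlier.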
