Remove entries with NaN values in a Python dictionary
Problem description
I have the following dictionary in Python:
OrderedDict([(30, ('A1', 55.0)), (31, ('A2', 125.0)), (32, ('A3', 180.0)), (43, ('A4', nan))])
Is there a way to remove the entries where any of the values is NaN? I tried this:
{k: dict_cg[k] for k in dict_cg.values() if not np.isnan(k)}
It would be great if the solution works for both Python 2 and Python 3.
Recommended answer
Since you have pandas, you can leverage pandas' pd.Series.notnull function here (notna is an alias of it), which works with mixed dtypes.
>>> import pandas as pd
>>> {k: v for k, v in dict_cg.items() if pd.Series(v).notna().all()}
{30: ('A1', 55.0), 31: ('A2', 125.0), 32: ('A3', 180.0)}
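If pandas is not available, roughly the same filter can be sketched with the standard library alone, which should also run unchanged on Python 2.7 and Python 3. The helper name has_nan is made up for this illustration and is not part of the original answer. It only recognises Python floats (np.float64 counts, since it subclasses float); None, NaT and other "missing" markers that pandas understands are not treated as NaN here.

```python
import math
from collections import OrderedDict


def has_nan(values):
    """Return True if any float in `values` is NaN (hypothetical helper)."""
    # The isinstance check skips strings such as 'A4', which math.isnan rejects.
    return any(isinstance(v, float) and math.isnan(v) for v in values)


dict_cg = OrderedDict([
    (30, ('A1', 55.0)),
    (31, ('A2', 125.0)),
    (32, ('A3', 180.0)),
    (43, ('A4', float('nan'))),
])

# Rebuild the OrderedDict, keeping only entries whose tuple contains no NaN.
cleaned = OrderedDict((k, v) for k, v in dict_cg.items() if not has_nan(v))
print(cleaned)
```

Iterating over items() rather than values() is what lets the filter keep each key together with its tuple, which is where the attempt in the question went wrong.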
This is not part of the answer, but may help you understand how I arrived at the solution. I came across some weird behaviour when trying to solve this question using pd.notnull directly.
Take dict_cg[43].
>>> dict_cg[43]
('A4', nan)
pd.notnull does not work:
>>> pd.notnull(dict_cg[43])
True
It treats the tuple as a single value (rather than an iterable of values). Furthermore, converting this to a list and then testing also gives an incorrect answer.
>>> pd.notnull(list(dict_cg[43]))
array([ True, True])
Since the second value is nan, the result I'm looking for should be [True, False]. It finally works when you pre-convert to a Series:
>>> pd.Series(dict_cg[43]).notnull()
0 True
1 False
dtype: bool
So, the solution is to Series-ify it and then test the values.
Along similar lines, another (admittedly roundabout) solution is to pre-convert to an object dtype NumPy array, and pd.notnull will work directly:
>>> import numpy as np
>>> pd.notnull(np.array(dict_cg[43], dtype=object))
array([ True, False])
I imagine that pd.notnull directly converts dict_cg[43] to a string array under the covers, rendering the NaN as the string "nan", so it is no longer a "null" value.
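The coercion suspected above can be reproduced with NumPy on its own. This is only an illustration of how NumPy picks a dtype for a mixed str/float list. Whether pd.notnull actually goes through this path depends on the pandas version and is not verified here. The variable names are made up for the example.

```python
import numpy as np

pair = ['A4', float('nan')]

# With no dtype given, NumPy needs one common type for a str and a float,
# and settles on a string dtype: the NaN becomes the text 'nan'.
as_strings = np.asarray(pair)
print(as_strings.dtype.kind)
print(as_strings[1] == 'nan')

# With dtype=object the elements are stored unchanged, so the NaN stays a float
# and a null check can still recognise it.
as_objects = np.asarray(pair, dtype=object)
print(isinstance(as_objects[1], float))
```

This matches the observation that pre-converting to an object array (or a Series, which also ends up with object dtype for mixed values) keeps the NaN detectable.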