NumPy or Pandas: Keeping array type as integer while having a NaN value

Question

Is there a preferred way to keep the data type of a NumPy array fixed as int (or int64, or whatever), while still having an element inside listed as numpy.NaN?

In particular, I am converting an in-house data structure to a Pandas DataFrame. In our structure, we have integer-type columns that still contain NaNs (but the dtype of the column is int). Making this a DataFrame seems to recast everything as float, but we'd really like the columns to stay int.
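The recast described above can be reproduced with a minimal sketch. The column names `id` and `count` and their values are made up for illustration and are not from the in-house structure in the question.

```python
import numpy as np
import pandas as pd

# NaN is a floating-point value, so NumPy picks a float dtype for the whole array.
arr = np.array([1, 2, np.nan])
print(arr.dtype)

# Hypothetical columns: "id" has a missing entry, "count" does not.
df = pd.DataFrame({"id": [1, 2, None], "count": [10, 20, 30]})

# The column with the missing entry is upcast to float; the complete one stays integer.
print(df.dtypes)
```

Only the column that actually holds a missing value is upcast, which is why the effect can go unnoticed until the first gap appears in the data.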

Thoughts?

Things tried:

I tried using the from_records() function under pandas.DataFrame with coerce_float=False, and this did not help. I also tried using NumPy masked arrays with a NaN fill_value, which also did not work. All of these caused the column data type to become float.
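A rough reconstruction of the two attempts, assuming small made-up inputs (the `records` list, the column names, and the masked values are hypothetical), shows the same upcast in both cases. As far as the pandas documentation describes it, `coerce_float` only controls whether non-numeric objects such as `decimal.Decimal` are converted to float, so it does not affect how NaN is handled.

```python
import numpy as np
import pandas as pd

# Hypothetical records; the second row is missing its "qty" value.
records = [(1, 5), (2, np.nan)]
df = pd.DataFrame.from_records(records, columns=["id", "qty"], coerce_float=False)
print(df.dtypes)  # "qty" ends up as a float column despite coerce_float=False

# A masked integer array: the middle element is masked out.
masked = np.ma.masked_array([5, 7, 9], mask=[False, True, False])
s = pd.Series(masked)
print(s.dtype)  # the masked slot becomes NaN, so the Series is float as well
```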

Recommended answer

NaN can't be stored in an integer array. This is a known limitation of pandas at the moment; I have been waiting for progress to be made with NA values in NumPy (similar to NAs in R), but it seems it will be at least 6 months to a year before NumPy gets these features:

http://pandas.pydata.org/pandas-docs/stable/gotchas.html#support-for-integer-na
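To illustrate the limitation: NaN is a floating-point value, and an integer dtype has no bit pattern to represent it, so NumPy rejects the assignment outright. The second half of the sketch is not part of the answer above. It assumes a pandas release newer than the one the answer describes, where an opt-in nullable integer dtype named `"Int64"` (capital I) is available.

```python
import numpy as np
import pandas as pd

# Writing NaN into an existing integer array fails instead of silently upcasting.
ints = np.array([1, 2, 3], dtype=np.int64)
try:
    ints[0] = np.nan
    assignment_failed = False
except ValueError:
    assignment_failed = True
print(assignment_failed)

# Assumption: a later pandas version with the nullable "Int64" extension dtype.
# Missing entries are tracked in a separate mask, so the values stay integers.
nullable = pd.Series([1, None, 3], dtype="Int64")
print(nullable.dtype)
print(nullable.isna().tolist())
```

On pandas versions without this dtype, the options remain the ones implied by the answer: accept the float column, or use a sentinel integer to stand in for missing values.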
