Pandas gives incorrect result when asking if Timestamp column values have attr astype

Question

With a column containing Timestamp values, I am getting inconsistent results about whether the elements have the attribute astype:

In [30]: o.head().datetime.map(lambda x: hasattr(x, 'astype'))
Out[30]: 
0    False
1    False
2    False
3    False
4    False
Name: datetime, dtype: bool

In [31]: map(lambda x: hasattr(x, 'astype'), o.head().datetime.values)
Out[31]: [True, True, True, True, True]

In [32]: o.datetime.dtype
Out[32]: dtype('<M8[ns]')

In [33]: o.datetime.head()
Out[33]: 
0   2012-09-30 22:00:15.003000
1   2012-09-30 22:00:16.203000
2   2012-09-30 22:00:18.302000
3   2012-09-30 22:03:37.304000
4   2012-09-30 22:05:17.103000
Name: datetime, dtype: datetime64[ns]

If I pick off the first element (or any single element) and ask whether it has the attribute astype, I see that it does, and I can even convert it to other formats.

But if I try to do this to the entire column in one go, with Series.map, I get an error claiming that Timestamp objects do not have the attribute astype (though they clearly do).

How can I map the operation over the column with pandas? Is this a known bug?

Version: pandas 0.13.0, numpy 1.8

Added

It appears to be some sort of implicit casting on the part of either pandas or numpy:

In [50]: hasattr(o.head().datetime[0], 'astype')
Out[50]: False

In [51]: hasattr(o.head().datetime.values[0], 'astype')
Out[51]: True


Recommended answer

Timestamps do not have an astype method, but numpy.datetime64 objects do.

NDFrame.values returns a numpy array. o.head().datetime.values returns a numpy array of dtype numpy.datetime64, which is why

In [31]: map(lambda x: hasattr(x, 'astype'), o.head().datetime.values)
Out[31]: [True, True, True, True, True]
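The difference between the two access paths can be reproduced without the original DataFrame. The following is a minimal sketch, assuming a small Series built from two of the timestamps shown in the question; the variable names are illustrative and not from the original post, and it was written against a recent pandas rather than 0.13.0.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's `o.datetime` column.
s = pd.Series(pd.to_datetime(["2012-09-30 22:00:15.003",
                              "2012-09-30 22:00:16.203"]))

# Element taken from the underlying NumPy array: a numpy.datetime64 scalar.
from_values = s.values[0]

# Element taken through the Series itself: a pandas Timestamp.
from_series = s.iloc[0]

print(type(from_values), hasattr(from_values, "astype"))
print(type(from_series), hasattr(from_series, "astype"))
```

The two elements represent the same instant but are different Python types, which is why hasattr gives different answers depending on how the element was obtained.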






Note that Series.__iter__ is defined this way:

def __iter__(self):
    if com.is_categorical_dtype(self.dtype):
        return iter(self.values)
    elif np.issubdtype(self.dtype, np.datetime64):
        return (lib.Timestamp(x) for x in self.values)
    elif np.issubdtype(self.dtype, np.timedelta64):
        return (lib.Timedelta(x) for x in self.values)
    else:
        return iter(self.values)

So, when the dtype of the Series is np.datetime64, iteration over the Series returns Timestamps. This is where the implicit conversion takes place.
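As for mapping the operation over the column, two possible workarounds follow from the explanation above: call astype on the underlying NumPy array, or convert each Timestamp back to numpy.datetime64 inside the mapped function. This is a sketch under assumptions, not the answer's own code: the sample Series and the target unit datetime64[s] are chosen only for illustration, and it relies on Timestamp.to_datetime64() from recent pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's `o.datetime` column.
s = pd.Series(pd.to_datetime(["2012-09-30 22:00:15.003",
                              "2012-09-30 22:00:16.203"]))

# Iterating over a datetime64 Series yields pandas Timestamp objects.
element_types = {type(x).__name__ for x in s}

# Option 1: skip per-element mapping and cast the whole NumPy array at once.
as_seconds = s.values.astype("datetime64[s]")

# Option 2: inside map, turn each Timestamp back into numpy.datetime64
# before calling astype (here: whole seconds since the epoch as integers).
epoch_seconds = s.map(
    lambda ts: ts.to_datetime64().astype("datetime64[s]").astype("int64")
)

print(element_types)
print(as_seconds)
print(epoch_seconds.tolist())
```

Option 1 is usually preferable because it is vectorized; Option 2 is only useful when the rest of the mapped function needs to work element by element.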
