What are all the dtypes that pandas recognizes?
Question
For pandas, would anyone know whether any datatype apart from

(i) float64, int64 (and other variants of np.number, like float32, int8, etc.)
(ii) bool
(iii) datetime64, timedelta64

such as string columns, always has a dtype of object?

Alternatively, I want to know whether there is any datatype, apart from (i), (ii) and (iii) in the list above, that pandas does not give a dtype of object.
Recommended answer
Updated February 2020, following the pandas 1.0.0 release.
Pandas mostly uses NumPy arrays and dtypes for each Series (a DataFrame is a collection of Series, each of which can have its own dtype). NumPy's documentation further explains dtype, data types, and data type objects. In addition, the answer provided by @lcameron05 gives an excellent description of the NumPy dtypes. Furthermore, the pandas docs on dtypes have a lot of additional information.
The main types stored in pandas objects are float, int, bool, datetime64[ns], timedelta64[ns], and object. In addition, these dtypes have item sizes, e.g. int64 and int32.
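To make the list above concrete, here is a minimal sketch that builds one column per main type and prints the resulting dtypes. The frame and its column names (`f`, `i`, `b`, `dt`, `td`, `o`) are hypothetical, chosen only for illustration, and the exact datetime and timedelta resolution shown may vary with the pandas version.

```python
import pandas as pd

# Hypothetical sample frame: one column per "main" type.
df = pd.DataFrame({
    "f": [1.5, 2.5],                                        # float
    "i": [1, 2],                                            # int
    "b": [True, False],                                     # bool
    "dt": pd.to_datetime(["2020-01-01", "2020-02-01"]),     # datetime64
    "td": pd.to_timedelta(["1 days", "2 days"]),            # timedelta64
    "o": [1, "a"],                                          # mixed values fall back to object
})

# Each column carries its own dtype.
print(df.dtypes)
```

Mixed Python values, as in column `o`, are what typically end up with the object dtype.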
By default, integer types are int64 and float types are float64, regardless of platform (32-bit or 64-bit). For example, a DataFrame built from a list or dict of plain Python integers gets int64 columns.
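A minimal sketch of three constructions that should all yield int64, in the spirit of the pandas documentation on default dtypes. The column name `a` and the variable names are illustrative assumptions.

```python
import pandas as pd

# Three ways of building a one-column frame from plain Python integers.
r1 = pd.DataFrame([1, 2], columns=["a"]).dtypes
r2 = pd.DataFrame({"a": [1, 2]}).dtypes
r3 = pd.DataFrame({"a": 1}, index=list(range(2))).dtypes

# Each result is a Series mapping column name to dtype.
for r in (r1, r2, r3):
    print(r["a"])
```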
NumPy, however, will choose platform-dependent types when creating arrays. For instance, an integer array created without an explicit dtype will be int32 on a 32-bit platform.
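As a rough illustration of the difference, the sketch below compares the dtype NumPy picks for a plain integer array with the dtype pandas picks for the same values. The variable names are hypothetical, and the NumPy result depends on the platform and NumPy version, so no specific output is claimed for it.

```python
import numpy as np
import pandas as pd

# NumPy uses its platform default integer when no dtype is given.
arr = np.array([1, 2])
print(arr.dtype)             # platform dependent

# pandas picks int64 for the same Python integers.
ser = pd.Series([1, 2])
print(ser.dtype)

# Wrapping an existing array keeps whatever dtype the array already has.
wrapped = pd.Series(arr)
print(wrapped.dtype == arr.dtype)
```

Passing an explicit `dtype=` when creating the array is one way to avoid the platform dependence.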
One of the major changes in version 1.0.0 of pandas is the introduction of pd.NA to represent scalar missing values (rather than the previous values of np.nan, pd.NaT or None, depending on usage).
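A short sketch of how this shows up in practice, assuming the nullable `Int64` extension dtype is used for the comparison (that choice, and the variable names, are illustrative rather than taken from the answer).

```python
import numpy as np
import pandas as pd

# NumPy-backed column: a missing value forces a cast to float64 and is stored as NaN.
s_float = pd.Series([1, None])
print(s_float.dtype)

# Nullable integer column: values stay integers and the missing entry is pd.NA.
s_int = pd.Series([1, None], dtype="Int64")
print(s_int.dtype)
print(s_int.iloc[1] is pd.NA)

# pd.isna recognizes both kinds of missing marker.
print(pd.isna(s_float.iloc[1]), pd.isna(s_int.iloc[1]))
```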
Pandas extends NumPy's type system and also allows users to write their own extension types. The following lists all of pandas' extension types.
Kind of data: tz-aware datetime (note that NumPy does not support timezone-aware datetimes)
Data type: DatetimeTZDtype
Scalar: Timestamp
Array: arrays.DatetimeArray
String aliases: 'datetime64[ns, <tz>]'

Kind of data: categorical
Data type: CategoricalDtype
Scalar: (none)
Array: Categorical
String aliases: 'category'

Kind of data: period (time spans)
Data type: PeriodDtype
Scalar: Period
Array: arrays.PeriodArray
String aliases: 'period[<freq>]', 'Period[<freq>]'

Kind of data: sparse
Data type: SparseDtype
Scalar: (none)
Array: arrays.SparseArray
String aliases: 'Sparse', 'Sparse[int]', 'Sparse[float]'

Kind of data: intervals
Data type: IntervalDtype
Scalar: Interval
Array: arrays.IntervalArray
String aliases: 'interval', 'Interval', 'Interval[<numpy_dtype>]', 'Interval[datetime64[ns, <tz>]]', 'Interval[timedelta64[<freq>]]'

Kind of data: nullable integer
Data type: Int64Dtype, ...
Scalar: (none)
Array: arrays.IntegerArray
String aliases: 'Int8', 'Int16', 'Int32', 'Int64', 'UInt8', 'UInt16', 'UInt32', 'UInt64'

Kind of data: strings
Data type: StringDtype
Scalar: str
Array: arrays.StringArray
String aliases: 'string'

Kind of data: Boolean (with NA)
Data type: BooleanDtype
Scalar: bool
Array: arrays.BooleanArray
String aliases: 'boolean'
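The extension types listed above can be sketched as follows: one small Series per type, built either from a pandas constructor or from a string alias. The variable names and sample values are hypothetical. None of these Series should report the object dtype.

```python
import pandas as pd

# One illustrative Series per extension type.
tz = pd.Series(pd.date_range("2020-01-01", periods=2, tz="UTC"))   # tz-aware datetime
cat = pd.Series(["a", "b", "a"], dtype="category")                  # categorical
per = pd.Series(pd.period_range("2020-01", periods=2, freq="M"))    # period
sp = pd.Series([0, 0, 1], dtype="Sparse[int]")                      # sparse
itv = pd.Series(pd.interval_range(start=0, end=2))                  # intervals
nint = pd.Series([1, None], dtype="Int64")                          # nullable integer
txt = pd.Series(["x", None], dtype="string")                        # strings
flag = pd.Series([True, None], dtype="boolean")                     # Boolean (with NA)

examples = {"tz": tz, "cat": cat, "per": per, "sp": sp,
            "itv": itv, "nint": nint, "txt": txt, "flag": flag}

# Print the dtype each Series ended up with.
for name, s in examples.items():
    print(name, s.dtype)
```

Note that a string column only gets StringDtype when it is requested, as with `dtype="string"` here; whether an unannotated string column falls back to object depends on the pandas version.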