什么是dtype('O')? [英] What is dtype('O')?
问题描述
我在熊猫中有一个数据框,我试图弄清楚它的值的类型是什么.我不确定列'Test'
的类型.但是,当我运行myFrame['Test'].dtype
时,我得到了;
I have a dataframe in pandas and I'm trying to figure out what the types of its values are. I am unsure what the type is of column 'Test'
. However, when I run myFrame['Test'].dtype
, I get;
dtype('O')
这是什么意思?
推荐答案
当您在数据框内看到dtype('O')
时,表示熊猫字符串.
什么是dtype
?
When you see dtype('O')
inside dataframe this means Pandas string.
What is dtype
?
属于pandas
或numpy
或两者兼有的东西?如果我们检查熊猫代码:
Something that belongs to pandas
or numpy
, or both, or something else? If we examine pandas code:
df = pd.DataFrame({'float': [1.0],
'int': [1],
'datetime': [pd.Timestamp('20180310')],
'string': ['foo']})
print(df)
print(df['float'].dtype,df['int'].dtype,df['datetime'].dtype,df['string'].dtype)
df['string'].dtype
它将输出如下:
float int datetime string
0 1.0 1 2018-03-10 foo
---
float64 int64 datetime64[ns] object
---
dtype('O')
您可以将最后一个解释为Python类型的字符串Pandas dtype('O')
或Pandas对象,这对应于Numpy string_
或unicode_
类型.
You can interpret the last as Pandas dtype('O')
or Pandas object which is Python type string, and this corresponds to Numpy string_
, or unicode_
types.
Pandas dtype Python type NumPy type Usage
object str string_, unicode_ Text
就像唐吉x德(Don Quixote)在屁股上,熊猫(Pandas)在Numpy上一样,Numpy了解系统的基础架构,并使用类
Like Don Quixote is on ass, Pandas is on Numpy and Numpy understand the underlying architecture of your system and uses the class numpy.dtype
for that.
数据类型对象是numpy.dtype
类的实例,可以理解更精确的数据类型,包括:
Data type object is an instance of numpy.dtype
class that understand the data type more precise including:
- 数据类型(整数,浮点数,Python对象等)
- 数据大小(例如整数中的多少个字节)
- 数据的字节顺序(小端或大端)
- 如果数据类型是结构化的,则是其他数据类型的集合(例如,描述由整数和浮点数组成的数组项)
- 结构的字段"的名称是什么
- 每个字段的数据类型是什么
- 每个字段占用存储块的哪个部分
- 如果数据类型是子数组,它的形状和数据类型是什么
在此问题中,dtype
属于pands和numpy,尤其是dtype('O')
表示我们期望该字符串.
In the context of this question dtype
belongs to both pands and numpy and in particular dtype('O')
means we expect the string.
以下是一些测试代码,并附有说明: 如果我们将数据集作为字典
Here is some code for testing with explanation: If we have the dataset as dictionary
import pandas as pd
import numpy as np
from pandas import Timestamp
data={'id': {0: 1, 1: 2, 2: 3, 3: 4, 4: 5}, 'date': {0: Timestamp('2018-12-12 00:00:00'), 1: Timestamp('2018-12-12 00:00:00'), 2: Timestamp('2018-12-12 00:00:00'), 3: Timestamp('2018-12-12 00:00:00'), 4: Timestamp('2018-12-12 00:00:00')}, 'role': {0: 'Support', 1: 'Marketing', 2: 'Business Development', 3: 'Sales', 4: 'Engineering'}, 'num': {0: 123, 1: 234, 2: 345, 3: 456, 4: 567}, 'fnum': {0: 3.14, 1: 2.14, 2: -0.14, 3: 41.3, 4: 3.14}}
df = pd.DataFrame.from_dict(data) #now we have a dataframe
print(df)
print(df.dtypes)
最后一行将检查数据框并记录输出:
The last lines will examine the dataframe and note the output:
id date role num fnum
0 1 2018-12-12 Support 123 3.14
1 2 2018-12-12 Marketing 234 2.14
2 3 2018-12-12 Business Development 345 -0.14
3 4 2018-12-12 Sales 456 41.30
4 5 2018-12-12 Engineering 567 3.14
id int64
date datetime64[ns]
role object
num int64
fnum float64
dtype: object
各种各样的dtypes
df.iloc[1,:] = np.nan
df.iloc[2,:] = None
但是,如果我们尝试设置np.nan
或None
,则不会影响原始列dtype.输出将是这样的:
But if we try to set np.nan
or None
this will not affect the original column dtype. The output will be like this:
print(df)
print(df.dtypes)
id date role num fnum
0 1.0 2018-12-12 Support 123.0 3.14
1 NaN NaT NaN NaN NaN
2 NaN NaT None NaN NaN
3 4.0 2018-12-12 Sales 456.0 41.30
4 5.0 2018-12-12 Engineering 567.0 3.14
id float64
date datetime64[ns]
role object
num float64
fnum float64
dtype: object
因此,除非我们将所有列行都设置为np.nan
或None
,否则np.nan
或None
不会更改列dtype
.在这种情况下,列将分别变为float64
或object
.
So np.nan
or None
will not change the columns dtype
, unless we set the all column rows to np.nan
or None
. In that case column will become float64
or object
respectively.
您也可以尝试设置单行:
You may try also setting single rows:
df.iloc[3,:] = 0 # will convert datetime to object only
df.iloc[4,:] = '' # will convert all columns to object
在这里需要注意的是,如果我们在非字符串列中设置字符串,它将成为字符串或对象dtype
.
And to note here, if we set string inside a non string column it will become string or object dtype
.
这篇关于什么是dtype('O')?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!