Determining Pandas Column DataType


Question

Sometimes when data is imported into a Pandas DataFrame, every column comes in as type object. This is fine for most operations, but I am trying to create a custom export function, and my question is this:

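  • Is there a way to force Pandas to infer the data types of the input data?
  • If not, is there a way to infer the data types after the data has been loaded?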

I know I can tell Pandas that a column is of type int, str, etc., but I don't want to do that. I was hoping Pandas could be smart enough to work out all the data types when a user imports data or adds a column.

Edit: import example

import pandas as pd

a = ['a']
col = ['somename']
df = pd.DataFrame(a, columns=col)
print(df.dtypes)
# somename    object
# dtype: object

Shouldn't the type be string?

Recommended answer

This is only a partial answer, but you can count how often each Python type occurs among the elements of every column in the DataFrame as follows:

dtypeCount =[df.iloc[:,i].apply(type).value_counts() for i in range(df.shape[1])]

This returns

dtypeCount

[<class 'numpy.int32'>    4
 Name: a, dtype: int64,
 <class 'int'>    2
 <class 'str'>    2
 Name: b, dtype: int64,
 <class 'numpy.int32'>    4
 Name: c, dtype: int64]

The list doesn't print nicely, but you can pull out the counts for any column by position:

dtypeCount[1]

<class 'int'>    2
<class 'str'>    2
Name: b, dtype: int64

This should get you started in finding which data types are causing the issue and how many values of each type there are.
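As a more compact complement to counting Python types by hand, pandas also provides `pandas.api.types.infer_dtype`, which returns a single label describing what the values in a column look like. The sketch below applies it to the sample data from this answer; the variable name `inferred` is only illustrative, and the exact labels may differ between pandas versions.

```python
import pandas as pd
from pandas.api.types import infer_dtype

# Same sample data as used in this answer
df = pd.DataFrame({'a': range(4),
                   'b': [6, 'n', 7, 'g'],
                   'c': range(3, 7)})

# One descriptive label per column, e.g. 'integer' or 'mixed-integer'
inferred = {name: infer_dtype(df[name], skipna=True) for name in df.columns}
print(inferred)
```

A label such as 'mixed-integer' flags a column that holds integers alongside other kinds of values, which is a quick hint about where to look before running the per-element counts.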

You can then pull out the rows whose value in the second column is a str using

df[df.iloc[:,1].map(lambda x: type(x) == str)]

   a  b  c
1  1  n  4
3  3  g  6
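Once the stray values are located, one possible next step is to let pandas pick better dtypes itself. The sketch below is not part of the original answer: it assumes a pandas version that includes `DataFrame.convert_dtypes` (1.0 or later), uses `pd.to_numeric` with `errors='coerce'` to turn unparseable entries into NaN, and the names `b_numeric`, `names` and `converted` are made up for illustration.

```python
import pandas as pd

# Same sample data as used in this answer
df = pd.DataFrame({'a': range(4),
                   'b': [6, 'n', 7, 'g'],
                   'c': range(3, 7)})

# Entries that cannot be parsed as numbers become NaN
b_numeric = pd.to_numeric(df['b'], errors='coerce')
print(b_numeric.isna().sum())  # how many entries were not numeric

# For a clean column, ask pandas to choose a more specific dtype
names = pd.DataFrame(['a'], columns=['somename'])
converted = names.convert_dtypes()
print(converted.dtypes)
```

Coercion discards the original non-numeric values, so it is probably safest to inspect the offending rows first, as shown above, and convert only afterwards.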

Data

import pandas as pd

# Column 'b' deliberately mixes int and str values
df = pd.DataFrame({'a': range(4),
                   'b': [6, 'n', 7, 'g'],
                   'c': range(3, 7)})
