pd.read_csv by default treats integers like floats
Problem Description
I have a csv that looks like (headers = first row):
name,a,a1,b,b1
arnold,300311,arnld01,300311,arnld01
sam,300713,sam01,300713,sam01
When I run:
df = pd.read_csv('file.csv')
columns a and b have a .0 attached to the end like so:
df.head()
name,a,a1,b,b1
arnold,300311.0,arnld01,300311.0,arnld01
sam,300713.0,sam01,300713.0,sam01
Columns a and b are integers or blanks, so why does pd.read_csv() treat them like floats, and how do I ensure they are integers on the read?
Recommended Answer
As root mentioned in the comments, this is a limitation of Pandas (and Numpy). NaN
is a float and the empty values you have in your CSV are NaN.
This is listed in the pandas gotchas.
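You can confirm directly that NaN itself is a float, which is why any integer column containing one is upcast to float64:

```python
import numpy as np

# NaN is a Python float, so an integer column that
# contains it gets upcast to float64.
print(type(np.nan))                  # <class 'float'>
print(np.array([1, np.nan]).dtype)   # float64
```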
You can work around this in a few ways.
For the examples below I used the following to import the data - note that I added a row with an empty value in columns a and b:
import pandas as pd
from io import StringIO  # on Python 2 this was: from StringIO import StringIO
data = """name,a,a1,b,b1
arnold,300311,arnld01,300311,arnld01
sam,300713,sam01,300713,sam01
test,,test01,,test01"""
df = pd.read_csv(StringIO(data), sep=",")
Drop NaN rows
Your first option is to drop rows that contain this NaN value. The downside of this is that you lose the entire row. After getting your data into a dataframe, run this:
df.dropna(inplace=True)
df.a = df.a.astype(int)
df.b = df.b.astype(int)
This drops all NaN rows from the dataframe, then converts columns a and b to int:
>>> df.dtypes
name object
a int32
a1 object
b int32
b1 object
dtype: object
>>> df
name a a1 b b1
0 arnold 300311 arnld01 300311 arnld01
1 sam 300713 sam01 300713 sam01
Fill NaN with placeholder data
This option will replace all your NaN values with a throwaway value. That value is something you need to determine; for this test, I made it -999999. This allows you to keep the rest of the data, convert it to an int, and make it obvious which data is invalid. You'll be able to filter these rows out if you are making calculations based on the columns later.
df.fillna(-999999, inplace=True)
df.a = df.a.astype(int)
df.b = df.b.astype(int)
This produces a dataframe like so:
>>> df.dtypes
name object
a int32
a1 object
b int32
b1 object
dtype: object
>>> df
name a a1 b b1
0 arnold 300311 arnld01 300311 arnld01
1 sam 300713 sam01 300713 sam01
2 test -999999 test01 -999999 test01
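If you do need to exclude those placeholder rows later, you can filter on the sentinel value. A minimal sketch, reusing the sample data and the -999999 sentinel from above:

```python
import pandas as pd
from io import StringIO

data = """name,a,a1,b,b1
arnold,300311,arnld01,300311,arnld01
sam,300713,sam01,300713,sam01
test,,test01,,test01"""

df = pd.read_csv(StringIO(data), sep=",")
df.fillna(-999999, inplace=True)
df = df.astype({"a": int, "b": int})

# Exclude the sentinel rows before doing any arithmetic on a or b.
valid = df[(df.a != -999999) & (df.b != -999999)]
print(valid["a"].sum())  # 601024
```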
Leave the float values
Finally, another choice is to leave the float values (and NaN) and not worry about the non-integer data type.
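A minimal sketch of this approach, using the same sample data as above: the columns stay float64, and you cast to int only at the point where you actually need whole numbers:

```python
import pandas as pd
from io import StringIO

data = """name,a,a1,b,b1
arnold,300311,arnld01,300311,arnld01
sam,300713,sam01,300713,sam01
test,,test01,,test01"""

df = pd.read_csv(StringIO(data), sep=",")

# Columns a and b are float64 because of the NaN row.
print(df.dtypes)

# The underlying values are still whole numbers; cast only
# when needed, after dropping the NaN rows for that operation.
valid = df.dropna(subset=["a", "b"])
print(valid["a"].astype(int).tolist())  # [300311, 300713]
```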