Convert number strings with commas in pandas DataFrame to float


Problem description

I have a DataFrame that contains numbers as strings with commas for the thousands marker. I need to convert them to floats.

import locale
import pandas
a = [['1,200', '4,200'], ['7,000', '-0.03'], ['5', '0']]
df = pandas.DataFrame(a)

I am guessing I need to use locale.atof. Indeed

df[0].apply(locale.atof)

works as expected. I get a Series of floats.

But when I apply it to the DataFrame, I get an error.

df.apply(locale.atof)

TypeError: ("cannot convert the series to <type 'float'>", u'occurred at index 0')

and

df[0:1].apply(locale.atof)

gives another error:

ValueError: ('invalid literal for float(): 1,200', u'occurred at index 0')

So, how do I convert this DataFrame of strings to a DataFrame of floats?

Solution

If you're reading the data from a CSV file, you can use the thousands argument of read_csv:

df = pandas.read_csv('foo.tsv', sep='\t', thousands=',')

This method is likely to be more efficient than performing the operation as a separate step.
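
As a rough illustration of the thousands argument, the sketch below parses a small in-memory tab-separated table instead of a file. The sample text, the column names a and b, and the final astype(float) cast are assumptions made for the example, not part of the original answer.

```python
import io

import pandas as pd

# Hypothetical tab-separated data standing in for 'foo.tsv'.
tsv_text = "a\tb\n1,200\t4,200\n7,000\t-0.03\n5\t0\n"

# thousands=',' tells the parser to treat commas inside numbers as separators.
df = pd.read_csv(io.StringIO(tsv_text), sep='\t', thousands=',')

# Columns without a decimal point are parsed as integers, so cast if floats are required.
df = df.astype(float)
print(df)
```

Without the cast, column a in this sample would be parsed as integers, because none of its values contains a decimal point.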


Otherwise, to convert an existing DataFrame with locale.atof, you need to set the locale first:

In [ 9]: import locale

In [10]: from locale import atof

In [11]: locale.setlocale(locale.LC_NUMERIC, '')
Out[11]: 'en_GB.UTF-8'

In [12]: df.applymap(atof)
Out[12]:
      0        1
0  1200  4200.00
1  7000    -0.03
2     5     0.00
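
If depending on the system locale is undesirable, one possible alternative is to strip the separators with pandas string methods. This is only a sketch, assuming that a comma is always a thousands marker and never a decimal mark; the helper name strip_commas_to_float is made up for illustration. It relies on DataFrame.apply passing each column to the function as a Series, which is also why df.apply(locale.atof) fails while the element-wise applymap works. In pandas 2.1 and later, applymap is deprecated in favour of DataFrame.map.

```python
import pandas as pd

a = [['1,200', '4,200'], ['7,000', '-0.03'], ['5', '0']]
df = pd.DataFrame(a)


def strip_commas_to_float(column: pd.Series) -> pd.Series:
    """Remove thousands separators from a string column and cast it to float.

    Hypothetical helper; assumes ',' is only ever a thousands marker.
    """
    return column.str.replace(',', '', regex=False).astype(float)


# apply() calls the helper once per column, so each call sees a whole Series.
floats = df.apply(strip_commas_to_float)
print(floats.dtypes)
```

Because the conversion is done per column with vectorised string operations, it does not depend on locale.setlocale having been called.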
