Convert number strings with commas in pandas DataFrame to float
Question
I have a DataFrame that contains numbers as strings with commas for the thousands marker. I need to convert them to floats.
import locale
import pandas
a = [['1,200', '4,200'], ['7,000', '-0.03'], ['5', '0']]
df = pandas.DataFrame(a)
I am guessing I need to use locale.atof. Indeed
df[0].apply(locale.atof)
works as expected. I get a Series of floats.
But when I apply it to the DataFrame, I get an error.
df.apply(locale.atof)
TypeError: ("cannot convert the series to <type 'float'>", u'occurred at index 0')
and
df[0:1].apply(locale.atof)
gives another error:
ValueError: ('invalid literal for float(): 1,200', u'occurred at index 0')
So, how do I convert this DataFrame of strings to a DataFrame of floats?
Recommended answer
If you're reading in from csv then you can use the thousands arg:
df = pandas.read_csv('foo.tsv', sep='\t', thousands=',')
This method is likely to be more efficient than performing the operation as a separate step.
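As a minimal sketch of the thousands argument, the snippet below parses a small tab-separated table from an in-memory string rather than a file. The column names a and b and the sample values are made up for illustration; they stand in for whatever 'foo.tsv' would contain.

```python
import io

import pandas as pd

# Hypothetical tab-separated data standing in for 'foo.tsv'.
# The commas need no quoting because the field separator is a tab.
tsv = "a\tb\n1,200\t4,200\n7,000\t-0.03\n5\t0\n"

# thousands="," tells the parser to treat commas inside numbers
# as digit-group separators instead of leaving the cells as strings.
df = pd.read_csv(io.StringIO(tsv), sep="\t", thousands=",")

print(df.dtypes)
print(df)
```

Both columns should come back with numeric dtypes, so no separate conversion pass is needed afterwards.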
To convert an existing DataFrame with locale.atof, you need to set the locale first:
In [ 9]: import locale
In [10]: from locale import atof
In [11]: locale.setlocale(locale.LC_NUMERIC, '')
Out[11]: 'en_GB.UTF-8'
In [12]: df.applymap(atof)
Out[12]:
      0        1
0  1200  4200.00
1  7000    -0.03
2     5     0.00
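If changing the process locale is undesirable, or the machine's default locale does not use a comma as the thousands separator, a locale-independent alternative is to strip the commas and then cast. This is a sketch of a different technique from the locale.atof approach above, and it assumes every cell is a string in which a comma only ever acts as a thousands separator (the name floats is just illustrative).

```python
import pandas as pd

a = [['1,200', '4,200'], ['7,000', '-0.03'], ['5', '0']]
df = pd.DataFrame(a)

# regex=True makes replace() act on substrings, so every comma in
# every string cell is removed; astype(float) then converts all columns.
floats = df.replace(",", "", regex=True).astype(float)

print(floats.dtypes)
print(floats)
```

On pandas 2.1 and later, DataFrame.applymap is deprecated in favor of DataFrame.map, so df.map(atof) is the equivalent element-wise call there.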