pandas.read_html does not support decimal comma

Views: 108 Published: 2020/10/19 18:54:11 Tags: python pandas decimal xlm


Problem description

I am reading an xlm file with pandas.read_html, and it works almost perfectly. The problem is that the file uses commas as decimal separators instead of dots (the default in read_html).

I could easily replace the commas with dots in one file, but I have almost 200 files with that configuration. With pandas.read_csv you can define the decimal separator, but I don't know why pandas.read_html only lets you define the thousands separator.

Any guidance on this? Is there another way to automate the comma/dot replacement before the file is opened by pandas? Thanks in advance!

Recommended answer

Thanks @zhqiat. I think upgrading pandas to version 0.19 would solve the problem. Unfortunately, I couldn't find an easy way to do that: the tutorial I found for upgrading pandas is for Ubuntu, and I am a WinXP user.
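For readers who can upgrade, here is a minimal sketch of what the 0.19 route might look like. It relies on the `decimal` and `thousands` arguments of `pandas.read_html` (to my knowledge `decimal` was added in 0.19). The sample table, the `SAMPLE_HTML` constant and the `read_decimal_comma_tables` helper are made up for illustration, and `read_html` needs an HTML parser such as lxml installed.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample input: "." as thousands separator, "," as decimal separator.
SAMPLE_HTML = """
<table>
  <tr><th>item</th><th>price</th></tr>
  <tr><td>a</td><td>1.234,56</td></tr>
  <tr><td>b</td><td>7,5</td></tr>
</table>
"""


def read_decimal_comma_tables(source):
    """Parse every table in `source`, treating "," as the decimal mark.

    `source` may be a path, a URL, or a file-like object; it is passed
    straight to pandas.read_html, which returns a list of DataFrames.
    """
    return pd.read_html(source, decimal=",", thousands=".")


if __name__ == "__main__":
    # StringIO wraps the literal HTML so it is read like an open file.
    tables = read_decimal_comma_tables(StringIO(SAMPLE_HTML))
    print(tables[0])
    print(tables[0].dtypes)
```

For the roughly 200 files in the question, the same helper could presumably be called in a loop over the file paths, so no file has to be edited by hand.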

In the end I chose the workaround, using the method posted here: basically converting all columns, one by one, to a numeric type of pandas.Series.

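# col is a single column name; drop "." thousands separators, then turn the decimal "," into "." (literal matches, not regex)
result[col] = result[col].str.replace(".", "", regex=False).str.replace(",", ".", regex=False).astype(float)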

I know this solution isn't the best, but it works. Thanks.
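The column-by-column workaround described above can be sketched as a small helper. This is only an illustration under stated assumptions: the `comma_decimals_to_float` name, the sample DataFrame and the choice of `errors="coerce"` are mine, not part of the original answer, and the `regex=False` argument needs a reasonably recent pandas.

```python
import pandas as pd


def comma_decimals_to_float(df, columns):
    """Return a copy of `df` with the given text columns parsed as floats.

    Assumes "." is the thousands separator and "," the decimal separator.
    """
    out = df.copy()
    for col in columns:
        cleaned = (
            out[col]
            .astype(str)
            .str.replace(".", "", regex=False)   # drop thousands separators
            .str.replace(",", ".", regex=False)  # decimal comma -> decimal point
        )
        # errors="coerce" turns unparseable cells into NaN instead of raising.
        out[col] = pd.to_numeric(cleaned, errors="coerce")
    return out


# Hypothetical data shaped like what read_html returns for such a file.
result = pd.DataFrame({"item": ["a", "b"], "price": ["1.234,56", "7,5"]})
result = comma_decimals_to_float(result, ["price"])
print(result.dtypes)
```

Passing the column names explicitly keeps genuine text columns (such as `item` here) from being turned into NaN by the numeric conversion.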
