Adding a line-terminator in pandas ends up adding another \r

Problem description

I am able to load a CSV file into a pandas DataFrame with the pandas defaults:

df = pd.read_csv(file)

>>> df
   distance  recession_velocity
0   # not a row                 NaN
1         0.032               170.0
2         0.034               290.0
3         0.214              -130.0

However, as soon as I add the lineterminator argument, every line ends up with a trailing \r:

df = pd.read_csv(file, lineterminator='\n')
       distance recession_velocity\r
0   # not a row                   \r
1         0.032                170\r
2         0.034                290\r
3         0.214               -130\r

The file does indeed have a \n line separator:

>>> print(repr(open('/Users/david/example.csv').read()))
'distance,recession_velocity\n# not a row,\n0.032,170\n0.034,290\n0.214,-130\n0.263,

What is the issue here, and is there a way to fix it without having to trim all the column values?

Recommended answer

Python's file objects automatically translate \r\n to \n in text mode, which is why the repr() of a plain read shows only \n. read_csv uses its own file handling when given a path, so it sees the raw \r\n instead. If you pass lineterminator="\n", it splits on that one character only and leaves the \r at the end of each line as data.
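The newline translation can be seen without pandas at all. The following is a minimal sketch using only the standard library; the sample bytes and the temporary file name are made up for illustration and are not the asker's actual file.

```python
import os
import tempfile

# Hypothetical sample data with Windows-style (\r\n) line endings.
raw = b"distance,recession_velocity\r\n0.032,170\r\n0.034,290\r\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.csv")
    with open(path, "wb") as f:
        f.write(raw)

    # Text mode with the default newline=None: \r\n is translated to \n.
    with open(path) as f:
        text_view = f.read()

    # Binary mode: the bytes come back exactly as stored on disk.
    with open(path, "rb") as f:
        byte_view = f.read()

    # Text mode with newline="": translation is switched off.
    with open(path, newline="") as f:
        untranslated_view = f.read()

print(repr(text_view))
print(repr(byte_view))
print(repr(untranslated_view))
```

Checking the file with open(path, "rb") or newline="" is therefore a more reliable way to see which line endings it really contains than a plain text-mode read.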

If you don't pass the lineterminator parameter at all, read_csv will guess the line-ending style. You can also pass in a file object instead of a path. This may slow things down a bit, but it gives you the same translation behaviour that you see when you read the file directly.
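Both suggestions can be sketched as follows. This is an illustration under assumptions, not the asker's exact setup: the sample data and temporary path are hypothetical, and the third call is included only to reproduce the situation from the question.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data with Windows-style (\r\n) line endings.
raw = b"distance,recession_velocity\r\n0.032,170\r\n0.034,290\r\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.csv")
    with open(path, "wb") as f:
        f.write(raw)

    # Option 1: omit lineterminator and let pandas detect the line endings.
    df_default = pd.read_csv(path)

    # Option 2: pass a text-mode file object, so Python has already
    # translated \r\n to \n before pandas sees the data.
    with open(path) as f:
        df_handle = pd.read_csv(f, lineterminator="\n")

    # The case from the question: a path plus lineterminator="\n".
    df_path = pd.read_csv(path, lineterminator="\n")

print(list(df_default.columns))
print(list(df_handle.columns))
print(list(df_path.columns))
```

Comparing the three column lists shows whether a stray \r has ended up in the last column name.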
