Pandas escape carriage return in to_csv

Problem description

I have a string column that sometimes has carriage returns in the string:

import pandas as pd
from io import StringIO

datastring = StringIO("""\
country  metric           2011   2012
USA      GDP              7      4
USA      Pop.             2      3
GB       GDP              8      7
""")
df = pd.read_table(datastring, sep=r'\s\s+', engine='python')  # regex separator: two or more whitespace characters
df.metric = df.metric + '\r'  # append carriage return

print(df)
  country  metric  2011  2012
0     USA   GDP\r     7     4
1     USA  Pop.\r     2     3
2      GB   GDP\r     8     7

When writing to and reading from csv, the dataframe gets corrupted:

df.to_csv('data.csv', index=False)

print(pd.read_csv('data.csv'))
  country metric  2011  2012
0     USA    GDP   NaN   NaN
1     NaN      7     4   NaN
2     USA   Pop.   NaN   NaN
3     NaN      2     3   NaN
4      GB    GDP   NaN   NaN
5     NaN      8     7   NaN

Question

What's the best way to fix this? The one obvious method is to just clean the data first:

df.metric = df.metric.str.replace('\r', '')
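If several text columns might contain carriage returns, the same cleanup can be applied column by column. The following is only a sketch: `strip_carriage_returns` is a hypothetical helper name, and the sample frame is rebuilt inline so the snippet runs on its own.

```python
import pandas as pd


def strip_carriage_returns(frame, columns):
    # Hypothetical helper: return a copy of `frame` with carriage returns
    # removed from the named string columns. The input is left untouched.
    cleaned = frame.copy()
    for col in columns:
        # regex=False treats the pattern as a literal character
        cleaned[col] = cleaned[col].str.replace("\r", "", regex=False)
    return cleaned


sample = pd.DataFrame(
    {
        "country": ["USA", "USA", "GB"],
        "metric": ["GDP\r", "Pop.\r", "GDP\r"],
        "2011": [7, 2, 8],
        "2012": [4, 3, 7],
    }
)

cleaned = strip_carriage_returns(sample, ["country", "metric"])
print(cleaned["metric"].tolist())  # ['GDP', 'Pop.', 'GDP']
```

Note that this changes the data: the carriage returns are gone for good, which may or may not be acceptable depending on where the strings came from.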

Recommended answer

Specify the line terminator when reading the file back, so that only '\n' ends a row:

print(pd.read_csv('data.csv', lineterminator='\n'))

  country  metric  2011  2012
0     USA   GDP\r     7     4
1     USA  Pop.\r     2     3
2      GB   GDP\r     8     7
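To try this without touching the disk, the CSV text can be fed through an in-memory buffer. This is a minimal sketch, assuming a pandas version in which the `read_csv` argument is spelled `lineterminator` and the default C parser is in use; the `csv_text` literal is written by hand here to mimic a file whose rows end in '\n' and whose `metric` values carry an unquoted '\r'.

```python
from io import StringIO

import pandas as pd

# Hand-written stand-in for data.csv: rows end with "\n",
# and each metric value ends with an unquoted "\r".
csv_text = (
    "country,metric,2011,2012\n"
    "USA,GDP\r,7,4\n"
    "USA,Pop.\r,2,3\n"
    "GB,GDP\r,8,7\n"
)

# With an explicit terminator, "\r" is treated as ordinary field content.
restored = pd.read_csv(StringIO(csv_text), lineterminator="\n")

print(restored.shape)
print(restored["metric"].tolist())
```

With the terminator given explicitly, the frame should come back with three rows and the carriage returns intact, rather than the six broken rows shown in the question.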

Update:

In more recent versions of pandas, the argument is named lineterminator. The original answer is from 2015 and spelled it line_terminator.
