Get pandas.read_csv to read empty values as empty string instead of nan
Problem description
I'm using the pandas library to read in some CSV data. In my data, certain columns contain strings. The string "nan" is a possible value, as is an empty string. I managed to get pandas to read "nan" as a string, but I can't figure out how to get it not to read an empty value as NaN. Here's the sample data and output:
One,Two,Three
a,1,one
b,2,two
,3,three
d,4,nan
e,5,five
nan,6,
g,7,seven
>>> pandas.read_csv('test.csv', na_values={'One': [], "Three": []})
   One  Two  Three
0    a    1    one
1    b    2    two
2  NaN    3  three
3    d    4    nan
4    e    5   five
5  nan    6    NaN
6    g    7  seven
It correctly reads "nan" as the string "nan", but still reads the empty cells as NaN. I tried passing str in the converters argument to read_csv (with converters={'One': str}), but it still reads the empty cells as NaN.
I realize I can fill the values after reading, with fillna, but is there really no way to tell pandas that an empty cell in a particular CSV column should be read as an empty string instead of NaN?
Recommended answer
I added a ticket to add an option of some sort here:
https://github.com/pydata/pandas/issues/1450
In the meantime, result.fillna('') should do what you want.
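As a rough illustration of the fillna workaround, here is a minimal sketch that loads the sample data from the question out of an in-memory string (an assumption made so the example is self-contained; the question reads test.csv from disk) and fills only the two string columns. The names csv_text and string_cols are made up for this example.

```python
import io

import pandas as pd

# Sample data from the question, held in memory instead of test.csv
# (an assumption made only to keep the example self-contained).
csv_text = """One,Two,Three
a,1,one
b,2,two
,3,three
d,4,nan
e,5,five
nan,6,
g,7,seven
"""

# Default parsing: empty cells come back as NaN.
result = pd.read_csv(io.StringIO(csv_text))

# Fill NaN with '' only in the string columns, so the numeric
# column "Two" is left untouched.
string_cols = ["One", "Three"]  # hypothetical name for this sketch
result[string_cols] = result[string_cols].fillna("")
```

One caveat: with default settings, current pandas also treats the literal text nan as missing, so this sketch turns those cells into '' as well. On its own, fillna cannot tell an originally empty cell from an original "nan".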
In the development version (to be 0.8.0 final), if you specify an empty list of na_values, empty strings will stay empty strings in the result.
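For readers on a later pandas release, the sketch below shows one way to get this behavior at read time. It relies on the keep_default_na parameter of read_csv, which is not mentioned in the answer above, so treat it as an assumption about your installed version: when keep_default_na=False and no na_values are given, no strings are parsed as NaN. The sample data is again loaded from an in-memory string rather than test.csv.

```python
import io

import pandas as pd

# Sample data from the question, held in memory instead of test.csv.
csv_text = """One,Two,Three
a,1,one
b,2,two
,3,three
d,4,nan
e,5,five
nan,6,
g,7,seven
"""

# keep_default_na=False switches off the built-in list of NaN markers
# ('', 'nan', 'NA', ...). With no na_values supplied, nothing is
# converted to NaN, so empty cells and "nan" both stay as strings.
df = pd.read_csv(io.StringIO(csv_text), keep_default_na=False)

print(df["One"].tolist())
print(df["Three"].tolist())
```

Note that this setting applies to every column. If some numeric columns do contain blanks that should be treated as missing, they would need separate handling, for example by passing a per-column na_values dict alongside keep_default_na=False.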