Import pandas dataframe column as string not int
Problem description
I would like to import the ID column of the following CSV as strings, not as int64. pandas read_csv automatically converts it to int64, but I need this column as strings.
ID
00013007854817840016671868
00013007854817840016749251
00013007854817840016754630
00013007854817840016781876
00013007854817840017028824
00013007854817840017963235
00013007854817840018860166
from pandas import read_csv

df = read_csv('sample.csv')
df.ID
>>
0 -9223372036854775808
1 -9223372036854775808
2 -9223372036854775808
3 -9223372036854775808
4 -9223372036854775808
5 -9223372036854775808
6 -9223372036854775808
Name: ID
Unfortunately, using converters gives the same result.
df = read_csv('sample.csv', converters={'ID': str})
df.ID
>>
0 -9223372036854775808
1 -9223372036854775808
2 -9223372036854775808
3 -9223372036854775808
4 -9223372036854775808
5 -9223372036854775808
6 -9223372036854775808
Name: ID
Recommended answer
Just want to reiterate that this will work in pandas >= 0.9.1:
In [2]: read_csv('sample.csv', dtype={'ID': object})
Out[2]:
ID
0 00013007854817840016671868
1 00013007854817840016749251
2 00013007854817840016754630
3 00013007854817840016781876
4 00013007854817840017028824
5 00013007854817840017963235
6 00013007854817840018860166
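For anyone who wants to try this without creating sample.csv on disk, here is a minimal sketch of the same dtype approach. It assumes an in-memory copy of the first three IDs from the question; the names SAMPLE_CSV and read_ids_as_text are made up for illustration and are not part of pandas.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for sample.csv, using IDs from the question.
SAMPLE_CSV = """ID
00013007854817840016671868
00013007854817840016749251
00013007854817840016754630
"""


def read_ids_as_text(csv_text):
    """Parse the CSV while keeping the ID column as text instead of integers."""
    # dtype={'ID': object} is the option used in the answer above.
    return pd.read_csv(io.StringIO(csv_text), dtype={"ID": object})


df = read_ids_as_text(SAMPLE_CSV)
ids = df["ID"].tolist()
print(ids)
```

Because the column is never parsed as a number, the leading zeros and all 26 digits should come through unchanged.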
I'm also creating an issue about detecting integer overflows.
See the resolution here: https://github.com/pydata/pandas/issues/2247
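As a rough illustration of why overflow is the likely culprit, the sketch below uses only plain Python integers to check one ID from the question against the signed 64-bit range. The helper fits_in_int64 is a made-up name, and this says nothing about how pandas detects or handles the overflow internally.

```python
# Bounds of a signed 64-bit integer.
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def fits_in_int64(text):
    """Return True if the decimal string can be stored in a signed 64-bit integer."""
    value = int(text)  # Python ints are arbitrary precision, so this never overflows
    return INT64_MIN <= value <= INT64_MAX


sample_id = "00013007854817840016671868"  # first ID from the question

id_fits = fits_in_int64(sample_id)
# The value printed in the question's output is exactly the int64 minimum.
sentinel_is_int64_min = INT64_MIN == -9223372036854775808
# Even where the number fits, converting to int drops the leading zeros.
round_trips = str(int(sample_id)) == sample_id

print(id_fits, sentinel_is_int64_min, round_trips)
```

The ID does not fit in int64, and an integer round trip would lose the leading zeros anyway, which is another reason to read such identifiers as text.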