Pandas DataFrame: Finding Index Value for a Column


Problem description




I have a DataFrame that has columns such as ID, Name, Specification, Time.

I read the file like this:

mc = pd.read_csv("C:\\data.csv", sep = ",", header = 0, dtype = str)

When I checked the column names using

mc.columns.values

I found that ID had a strange character in front of it:

['\ufeffID', 'Name', 'Specification', 'Time']

After this I overwrote that column name with ID like this:

mc.columns.values[0] = "ID"

When I checked again using

mc.columns.values

I got this result:

array(['ID', 'Name', 'Specification', 'Time'])

Then I checked with

"ID" in mc.columns.values

and it gave me True.

Then I tried

mc["ID"]

and got this error:

KeyError: 'ID'

I want to get the values of the ID column and get rid of the strange character in front of the ID column name. Is there any way to solve this? Any help would be appreciated. Thank you in advance.

Solution

That's a UTF-16 BOM. Pass encoding='utf-16' to read_csv; see: https://en.wikipedia.org/wiki/Byte_order_mark#Representations_of_byte_order_marks_by_encoding

mc = pd.read_csv("C:\\data.csv", sep=",", header=0, dtype=str, encoding='utf-16')

The above should work. To be specific, FE FF is the BOM for UTF-16 big-endian.
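
To see how a BOM interacts with the choice of codec, here is a minimal sketch using only the standard library. The header string is a hypothetical stand-in for the first line of the CSV, not the asker's actual file. It also covers one case not mentioned above: if the file is really UTF-8 with a BOM, which would show up as '\ufeff' in the first column name too, Python's utf-8-sig codec is the one that strips it.

```python
import codecs

# Hypothetical CSV header, used only for illustration.
header = "ID,Name,Specification,Time\n"

# UTF-8 with a BOM: decoding as plain UTF-8 keeps U+FEFF in the text,
# while the "utf-8-sig" codec strips it.
utf8_bom_bytes = header.encode("utf-8-sig")
kept = utf8_bom_bytes.decode("utf-8")
stripped = utf8_bom_bytes.decode("utf-8-sig")
print(repr(kept.split(",")[0]))
print(repr(stripped.split(",")[0]))

# UTF-16 big-endian: the BOM is the byte pair FE FF, and the "utf-16"
# codec consumes it while detecting the byte order.
utf16_be_bytes = codecs.BOM_UTF16_BE + header.encode("utf-16-be")
decoded = utf16_be_bytes.decode("utf-16")
print(repr(decoded.split(",")[0]))
```

Whichever codec matches the file can be passed as the encoding argument of read_csv.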

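Also, you should use rename rather than trying to overwrite the NumPy array value. Writing into mc.columns.values changes the underlying array but can leave the index's internal lookup out of sync, which would explain why the membership test returns True while mc["ID"] raises a KeyError:

mc.rename(columns={mc.columns[0]: "ID"}, inplace=True)

This should work correctly.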
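
As a rough sketch of the rename approach, the following builds a small frame whose first column name carries a leading U+FEFF and then renames it. The frame contents are made up for illustration, and the frame is built directly rather than read from a file so the example does not depend on anything on disk.

```python
import pandas as pd

# Hypothetical frame whose first column name starts with U+FEFF.
mc = pd.DataFrame(
    {"\ufeffID": ["1", "2"], "Name": ["a", "b"]}
)

# Rename through the pandas API instead of writing into columns.values,
# so the column index is rebuilt with the new label.
mc = mc.rename(columns={mc.columns[0]: "ID"})

# Label-based access now finds the column.
ids = mc["ID"].tolist()
print(list(mc.columns))
print(ids)
```

This version assigns the result of rename back to mc instead of passing inplace=True; both forms go through the same API, and the reassignment is only a style choice here.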
