Pandas DataFrame: Finding Index Value for a Column
I have a DataFrame that has columns such as ID, Name, Specification, Time.
I read the file like this:
mc = pd.read_csv("C:\\data.csv", sep = ",", header = 0, dtype = str)
When I checked the column names using
mc.columns.values
I found that the ID column name had a strange character in front of it, like this:
['\ufeffID', 'Name', 'Specification', 'Time']
After this, I assigned ID to that column name like this:
mc.columns.values[0] = "ID"
When I checked this using
mc.columns.values
I got this result:
array(['ID', 'Name', 'Specification', 'Time'])
Then, I checked with,
"ID" in mc.columns.values
and it gave me True.
Then I tried,
mc["ID"]
I got an error like this:
KeyError: 'ID'
I want to get the values of the ID column and get rid of the strange character in front of the column name. Is there any way to solve this? Any help would be appreciated. Thank you in advance.
That's a UTF-16 BOM (byte order mark). Pass encoding='utf-16' to read_csv:
see: https://en.wikipedia.org/wiki/Byte_order_mark#Representations_of_byte_order_marks_by_encoding
mc = pd.read_csv("C:\\data.csv", sep=",", header=0, dtype=str, encoding='utf-16')
The above should work. FE FF is the BOM for UTF-16 big-endian, to be specific.
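To see how the BOM relates to the stray character in the column name, here is a minimal standard-library sketch. The header string is made up for illustration and is not the asker's actual file. It prints the BOM byte sequences for a few encodings and shows that the decoded BOM is the single character U+FEFF. As a side note, and as an assumption about the file rather than something stated in the question: if the file is in fact UTF-8 with a BOM, the codec that strips it is 'utf-8-sig' instead of 'utf-16'.

```python
import codecs

# Byte sequences that encode the BOM character U+FEFF in each encoding.
print(codecs.BOM_UTF16_BE)  # b'\xfe\xff'
print(codecs.BOM_UTF16_LE)  # b'\xff\xfe'
print(codecs.BOM_UTF8)      # b'\xef\xbb\xbf'

# Hypothetical header line, made up for illustration.
header = "ID,Name,Specification,Time"

# UTF-16 content: the "utf-16" codec writes a BOM on encode
# and consumes it again on decode.
utf16_bytes = header.encode("utf-16")
utf16_header = utf16_bytes.decode("utf-16")

# UTF-8 content with a BOM: plain "utf-8" keeps U+FEFF as a character,
# while "utf-8-sig" strips it.
utf8_bom_bytes = codecs.BOM_UTF8 + header.encode("utf-8")
kept = utf8_bom_bytes.decode("utf-8")
stripped = utf8_bom_bytes.decode("utf-8-sig")

print(repr(utf16_header.split(",")[0]))  # 'ID'
print(repr(kept.split(",")[0]))          # '\ufeffID'
print(repr(stripped.split(",")[0]))      # 'ID'
```

Whichever codec matches the file can be passed as the encoding argument of read_csv, so the BOM never reaches the column names.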
Also, you should use rename rather than trying to overwrite the NumPy array value:
mc.rename(columns={mc.columns[0]: "ID"}, inplace=True)
This should work correctly.
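As a rough illustration of the rename approach, the sketch below builds a small hypothetical DataFrame whose first column name starts with U+FEFF (the row values are invented) and renames it without inplace=True. rename gives the DataFrame a fresh column index, so label lookup matches the names that are displayed. Writing into columns.values, by contrast, edits the underlying array in place, which is likely why the lookup in the question still failed.

```python
import pandas as pd

# Hypothetical frame whose first column name starts with the BOM character.
mc = pd.DataFrame(
    [["1", "a", "x", "10:00"]],
    columns=["\ufeffID", "Name", "Specification", "Time"],
)

# rename returns a new DataFrame with a rebuilt column index;
# the original frame is left unchanged.
renamed = mc.rename(columns={mc.columns[0]: "ID"})

print(list(renamed.columns))   # ['ID', 'Name', 'Specification', 'Time']
print(renamed["ID"].tolist())  # ['1']
```

Returning a new frame instead of using inplace=True keeps the original around for comparison; either form renames the column.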