csv module returning a BOM for the first column
Question
I have a CSV file formatted like this:
type,type_mapping, style,style_mapping,Count
Residential,Residential,Antique,Antique,109
Antique,Residential,Antique,Antique,48
Apt/Garage,Commercial,Apt/Garage,Apartment,1
I am parsing it with the csv module in Python 3. Here is my code:
import os
import csv
typeXref = dict()
with open('xref.csv') as csvData:
    csvRead = csv.reader(csvData)
    headers = next(csvRead)
    for index, row in enumerate(csvRead):
        typeXref[index] = {key: value for key, value in zip(headers, row)}
print(typeXref)
For some reason, the first column name in the header keeps coming back with the byte order mark \ufeff prepended:
408: {'\ufefftype': 'Residential', 'type_mapping': 'Residential',
' style': 'Antique', 'style_mapping': 'Antique', 'Count': '109'}}
I assume this is due to the way I'm opening the file, reading its content with the csv module, or generating the file.
I can figure out how to strip the mark from that one field, but I would rather make sure that I'm generating the file correctly or using the right csv module option.
Recommended answer
You have to tell open() that you are reading a UTF-8 file with a BOM:
with open('xref.csv', encoding='utf-8-sig') as csvData:
    ....
The BOM is then stripped off when the file is read.
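To see the difference between the two encodings, here is a minimal sketch. It does not use the original xref.csv; instead it writes a small temporary file that starts with a UTF-8 BOM and reads the header row back twice. The helper name read_headers, the sample data, and the temporary file are assumptions made for illustration only.

```python
import csv
import os
import tempfile


def read_headers(path, encoding):
    """Return the header row of a CSV file opened with the given encoding."""
    # newline='' is the setting the csv documentation recommends for file objects.
    with open(path, encoding=encoding, newline='') as csv_data:
        return next(csv.reader(csv_data))


# Hypothetical sample data; writing with 'utf-8-sig' puts a BOM at the start of the file.
sample = "type,type_mapping\nResidential,Residential\n"
fd, sample_path = tempfile.mkstemp(suffix=".csv")
with os.fdopen(fd, "w", encoding="utf-8-sig", newline="") as handle:
    handle.write(sample)

# Plain 'utf-8' decodes the BOM as the character '\ufeff' and leaves it in the first header.
with_bom = read_headers(sample_path, "utf-8")
# 'utf-8-sig' drops a leading BOM if one is present.
without_bom = read_headers(sample_path, "utf-8-sig")

os.remove(sample_path)

print(with_bom)
print(without_bom)
```

The 'utf-8-sig' codec also reads UTF-8 files that have no BOM, so it is a reasonable choice when you cannot be sure how the file was exported.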