UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 7240: character maps to <undefined>
Problem description
I am a student working on my master's thesis. As part of the thesis, I am working with Python. I am reading a log file in .csv format and writing the extracted data to another .csv file in a well-formatted way. However, when the file is read, I get this error:
Traceback (most recent call last):
  File "C:\Users\SGADI\workspace\DAB_Trace\my_code\trace_parcer.py", line 19, in <module>
    for row in reader:
  File "C:\Users\SGADI\Desktop\Python-32bit-3.4.3.2\python-3.4.3\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 7240: character maps to <undefined>

import csv
import re
#import matplotlib
#import matplotlib.pyplot as plt
import datetime
#import pandas
#from dateutil.parser import parse

#def parse_csv_file():
timestamp = datetime.datetime.strptime('00:00:00.000', '%H:%M:%S.%f')
timestamp_list = []
snr_list = []
freq_list = []
rssi_list = []
dab_present_list = []
counter = 0
f = open("output.txt","w")
with open('test_log_20150325_gps.csv') as csvfile:
    reader = csv.reader(csvfile, delimiter=';')
    for row in reader:
        #timestamp = datetime.datetime.strptime(row[0], '%M:%S.%f')
        #timestamp.split(" ",1)
        timestamp = row[0]
        timestamp_list.append(timestamp)
        #timestamp = row[0]
        details = row[-1]
        counter += 1
        print (counter)
        #if(counter > 25000):
        #    break
        #timestamp = datetime.datetime.strptime(row[0], '%M:%S.%f')
        #timestamp_list.append(float(timestamp))

        #search for SNRLevel=\d+
        snr = re.findall(r'SNRLevel=(\d+)', details)
        if snr == []:
            snr = 0
        else:
            snr = snr[0]
        snr_list.append(int(snr))

        #search for Frequency=09ABC
        freq = re.findall(r'Frequency=([0-9a-fA-F]+)', details)
        if freq == []:
            freq = 0
        else:
            freq = int(freq[0], 16)
        freq_list.append(int(freq))

        #search for RSSI=\d+
        rssi = re.findall(r'RSSI=(\d+)', details)
        if rssi == []:
            rssi = 0
        else:
            rssi = rssi[0]
        rssi_list.append(int(rssi))

        #search for DABSignalPresent=\d+
        dab_present = re.findall(r'DABSignalPresent=(\d+)', details)
        if dab_present == []:
            dab_present = 0
        else:
            dab_present = dab_present[0]
        dab_present_list.append(int(dab_present))

        f.write(str(timestamp) + "\t")
        f.write(str(freq) + "\t")
        f.write(str(snr) + "\t")
        f.write(str(rssi) + "\t")
        f.write(str(dab_present) + "\n")
        print (timestamp, freq, snr, rssi, dab_present)
        #print (index+1)
        #print(timestamp,freq,snr)
        #print (counter)
        #print(timestamp_list,freq_list,snr_list,rssi_list)
        '''if snr != []:
            if freq != []:
                timestamp_list.append(timestamp)
                snr_list.append(snr)
                freq_list.append(freq)
                f.write(str(timestamp_list) + "\t")
                f.write(str(freq_list) + "\t")
                f.write(str(snr_list) + "\n")
                print(timestamp_list,freq_list,snr_list)'''
f.close()

I searched for the special character but did not find any. I searched the Internet, which suggested changing the encoding: I tried utf8, latin1 and a few other encodings, but I am still getting this error. Could you please also show me how to solve this with pandas? I tried pandas as well, but I still get the error.
I even removed a line from the log file, but then the error occurs on the next line.
Please help me find a solution, thank you.
Recommended answer
I have solved this issue. We can use this code:
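import codecs

# filename is the path to the CSV file being read
types_of_encoding = ["utf8", "cp1252"]
for encoding_type in types_of_encoding:
    # errors='replace' substitutes U+FFFD for every byte the codec cannot decode
    with codecs.open(filename, encoding=encoding_type, errors='replace') as csvfile:
        # your code goes here
        ...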
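
A note on the approach above: with errors='replace' the decoder never raises, so the loop does not really choose between utf8 and cp1252, and the undecodable bytes are silently replaced. If you would rather keep every byte, one alternative is to decode strictly and fall back to the next encoding only when decoding fails. The following is a minimal sketch of that idea, not the original poster's method. The helper names decode_with_fallback and read_rows, the candidate encoding list, and the demo file contents are all assumptions made for illustration. latin-1 is placed last because it maps every possible byte and therefore always succeeds.

```python
import csv
import io
import os
import tempfile


def decode_with_fallback(raw, encodings=("utf-8", "cp1252", "latin-1")):
    """Decode bytes with the first encoding that accepts every byte.

    Returns (text, encoding_name). Decoding is strict, so no byte is
    silently replaced; an encoding is skipped only if it raises.
    """
    for name in encodings:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")


def read_rows(path, delimiter=";"):
    """Read a delimited file into a list of rows, plus the encoding used."""
    with open(path, "rb") as fh:  # binary mode: no implicit cp1252 decoding
        raw = fh.read()
    text, used = decode_with_fallback(raw)
    # newline="" leaves line endings untouched so the csv module handles them.
    rows = list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))
    return rows, used


# Demo on a throwaway file that contains the byte from the traceback (0x8d).
# The line content is made up; it only mimics the shape of the log.
with tempfile.NamedTemporaryFile(delete=False, suffix=".csv") as tmp:
    tmp.write(b"00:00:01.000;SNRLevel=12 \x8d RSSI=40\r\n")
    demo_path = tmp.name
try:
    rows, used = read_rows(demo_path)
    print(used, rows)
finally:
    os.remove(demo_path)
```

Byte 0x8d is a lone continuation byte in UTF-8 and has no character assigned in Python's cp1252 table, which is why both of those codecs reject it. Since latin-1 cannot fail to decode, seeing this same error after "trying latin1" usually suggests the encoding argument never reached open(). For pandas, pandas.read_csv accepts an encoding argument (and sep=';' for this file), so the encoding name found here could be passed along in the same way.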