UnicodeDecodeError: 'charmap' codec can't encode character X at position Y: character maps to undefined


Problem description

To clarify: this question is not a duplicate of this one; I have already tried all the hints there and didn't get an answer.

I have a txt file with Unicode data in it, and I want to open the file as a string.

I tried

a = open('myfile.txt', 'r', encoding='utf-8')
print(a.read())

but there is an error saying:

UnicodeDecodeError: 'charmap' codec can't encode character '\ufeff' at position Y: character maps to undefined

Now my question is: I don't care about the non-ASCII characters at all, so is there a way to make Python simply drop or skip any such character it encounters instead of raising an exception? Also, to clarify, I have tried opening the file with the utf-8, utf-8-sig, and utf-16 encodings, among others.

I tried this as well, but with no luck.

a = open('myfile.txt', 'r', encoding='utf-8')
try:
    print(a.read())
except:
    pass

I also tried importing codecs and the code below:

import codecs
a = codecs.open('myfile.txt', 'r', encoding='utf-8')
print(a.read())

but the same error still appears.

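Solution

Correcting my answer to address the encoding problem in the print statement: avoid printing to stdout on Windows, because Python assumes that the CMD terminal can only handle a legacy code page such as Windows-1252 (Microsoft's variant of ISO Latin-1), and any character outside that code page raises the error. This is easily sidestepped by printing to stderr instead, which escapes characters it cannot encode (its error handler is backslashreplace) rather than raising: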

import sys
print('your text', file=sys.stderr)

On Linux there should be no issue with printing Unicode correctly.
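To see why stderr sidesteps the error, here is a minimal sketch that mimics what a text stream does when it writes a string. Note that "can't encode" is raised while encoding the output, which Python reports as a UnicodeEncodeError. The sample text, the `cp1252` code page, and the helper name `encode_for_console` are assumptions made for illustration; the actual console encoding depends on the system.

```python
# Hypothetical sample text: the BOM character from the error message, then ASCII.
text = "\ufeffhello"


def encode_for_console(s, encoding="cp1252", errors="strict"):
    """Encode s the way a text stream with this encoding and error handler would."""
    return s.encode(encoding, errors=errors)


# stdout typically uses the 'strict' handler: an unencodable character raises.
try:
    encode_for_console(text)
    strict_failed = False
except UnicodeEncodeError as exc:
    strict_failed = True
    print("strict:", exc)

# stderr uses 'backslashreplace': the character is written as an escape sequence.
escaped = encode_for_console(text, errors="backslashreplace")
print("backslashreplace:", escaped)
```

With the strict handler the encode step fails with the 'charmap' error, while backslashreplace produces plain ASCII bytes containing the escape text for the character.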

P.S.: for Python 2.x:

from __future__ import print_function
import sys
print('your text', file=sys.stderr)

P.P.S.: Original answer, for Python 3.x:

a=open('myfile.txt', 'r', encoding='utf-8', errors='ignore') 

See https://docs.python.org/3/library/codecs.html#error-handlers for a detailed list of your options.
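As a rough illustration of two of those error handlers when reading a file, the sketch below writes a small temporary file and reads it back with `errors='ignore'` and `errors='replace'`. The file contents (ASCII text around one byte that is not valid UTF-8) are made up for the example and stand in for `myfile.txt`.

```python
import os
import tempfile

# Hypothetical file contents: ASCII text around one byte (0xff) that is not valid UTF-8.
raw = b"abc\xffdef"

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(raw)
    path = tmp.name

try:
    # 'ignore' silently drops bytes that cannot be decoded.
    with open(path, "r", encoding="utf-8", errors="ignore") as f:
        ignored = f.read()
    # 'replace' substitutes U+FFFD (the replacement character) for each bad byte.
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        replaced = f.read()
finally:
    os.remove(path)

# ascii() keeps the output printable on any console encoding.
print(ascii(ignored))
print(ascii(replaced))
```

Note that these handlers only affect decoding while reading; they do not change how `print` encodes the resulting string for the console, which is why the stderr approach above may still be needed.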
