Python: Why am I getting a UnicodeDecodeError?
I have the following code that searches through files using regular expressions and, if any matches are found, moves the file into a different directory.
import os
import gzip
import re
import shutil

def regEx1():
    os.chdir("C:/Users/David/myfiles")
    files = os.listdir(".")
    os.mkdir("C:/Users/David/NewFiles")
    regex_txt = input("Please enter the string your are looking for:")
    for x in (files):
        inputFile = open((x), "r")
        content = inputFile.read()
        inputFile.close()
        regex = re.compile(regex_txt, re.IGNORECASE)
        if re.search(regex, content) is not None:
            shutil.copy(x, "C:/Users/David/NewFiles")
When I run it, I get the following error message:
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
  File "C:\Python33\Lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 367: character maps to <undefined>
Could someone please explain why this message appears?
In Python 3, when you open a file for reading in text mode ("r"), the contained text is decoded to Unicode.
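Since you didn't specify what encoding to use to read the file, the platform default (from locale.getpreferredencoding()) is being used, and that fails in this case. The traceback shows that the default here is cp1252, which has no character assigned to byte 0x9d, so the file is evidently not cp1252-encoded text.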
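The failure can be reproduced without any file at all. The snippet below is only a sketch: the sample bytes are made up for illustration, and the value of the platform default depends on the machine and Python version it runs on.

```python
import locale

# The encoding that open() falls back to in text mode when none is given.
# On the asker's Windows setup this was cp1252; elsewhere it may differ.
default_encoding = locale.getpreferredencoding(False)

# cp1252 leaves byte 0x9d undefined, so decoding it raises the same error
# as in the traceback. The surrounding bytes are hypothetical sample data.
sample = b"caf\x9d"
try:
    sample.decode("cp1252")
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

print(default_encoding)
print(error_message)
```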
You need to either specify an encoding that can decode the file contents, or open the file in binary mode instead (and use b'' bytes patterns for your regular expressions).
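Both options might look roughly like the sketch below. The function names are hypothetical, and "utf-8" is only a guess at the real encoding of the files, which the question does not state; substitute whatever encoding the files actually use.

```python
import re

def matches_text(path, pattern_text, encoding="utf-8"):
    """Option 1: text mode with an explicit encoding.

    The utf-8 default is an assumption; a wrong encoding will still
    raise UnicodeDecodeError or silently produce wrong characters.
    """
    regex = re.compile(pattern_text, re.IGNORECASE)
    with open(path, "r", encoding=encoding) as input_file:
        return regex.search(input_file.read()) is not None

def matches_bytes(path, pattern_text, encoding="utf-8"):
    """Option 2: binary mode with a bytes pattern, so nothing is decoded.

    The search string is encoded to bytes so it can be matched against
    the raw file contents.
    """
    regex = re.compile(pattern_text.encode(encoding), re.IGNORECASE)
    with open(path, "rb") as input_file:
        return regex.search(input_file.read()) is not None
```

In the loop from the question, either function could replace the open/read/search steps, for example `if matches_bytes(x, regex_txt): shutil.copy(x, "C:/Users/David/NewFiles")`. Note that with bytes patterns, re.IGNORECASE only folds case for ASCII letters.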
See the Python Unicode HOWTO for more information.