Python - Python 3.1 can't seem to handle UTF-16 encoded files?


Problem description

I'm trying to run some code to simply go through a bunch of files and write those that happen to be .txt files into the same file, removing all the spaces. Here's some simple code that should do the trick:

for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        if '.txt' in file:
            f = open(subdir+'/'+file, 'r')
            line = f.readline()
            while line:
                line2 = line.split()
                if line2:
                    output_file.write(" ".join(line2)+'\n')
                line = f.readline()
            f.close()

But instead, I get the following error:

  File "/usr/lib/python3.1/codecs.py", line 300, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xfe in position 0: unexpected code byte

It turns out these .txt files are all in UTF-16 (according to Firefox, at any rate). I thought Python 3.x was supposed to be able to handle any sort of character encoding?

Best, Georgina

Solution

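Use open(bla, 'r', encoding="utf-16"). Python 3 can decode UTF-16, but open() does not detect the encoding for you: without an explicit encoding argument it falls back to the platform default, which is UTF-8 here. The byte 0xfe at position 0 is the start of a UTF-16 byte order mark, and that byte is never valid in UTF-8, hence the error.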
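
For illustration, here is a minimal sketch of the loop from the question with that fix applied. It assumes every .txt file under the directory really is UTF-16 with a byte order mark. The names merge_txt_files and output_path are made up for this example and do not appear in the original post, and the sketch matches files with endswith('.txt') rather than the original substring test.

```python
import os


def merge_txt_files(rootdir, output_path, encoding="utf-16"):
    """Copy every .txt file under rootdir into one file, collapsing whitespace.

    Hypothetical helper for illustration; `encoding` is assumed to be the
    same for all input files.
    """
    with open(output_path, "w", encoding="utf-8") as output_file:
        for subdir, dirs, files in os.walk(rootdir):
            for name in files:
                if not name.endswith(".txt"):
                    continue
                path = os.path.join(subdir, name)
                # Skip the output file itself if it lives under rootdir.
                if os.path.abspath(path) == os.path.abspath(output_path):
                    continue
                # The "utf-16" codec reads the byte order mark to choose
                # the byte order and does not return it as text.
                with open(path, "r", encoding=encoding) as f:
                    for line in f:
                        words = line.split()
                        if words:
                            output_file.write(" ".join(words) + "\n")
```

Calling merge_txt_files(rootdir, 'merged.txt') would then write the cleaned lines as UTF-8. If some of the files turn out to use a different encoding, a single fixed encoding argument will not be enough and each file would need its encoding determined separately.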
