Why does my Python code print the extra characters "ï»¿" when reading from a text file?


Problem description

try:
    data=open('info.txt')
    for each_line in data:
        try:
            (role,line_spoken)=each_line.split(':',1)
            print(role,end='')
            print(' said: ',end='')
            print(line_spoken,end='')
        except ValueError:
            print(each_line)
    data.close()
except IOError:
    print("File is missing")

When I print the file line by line, the output starts with three unwanted characters, namely "ï»¿".

Actual output:

ï»¿Man said:  Is this the right room for an argument?
Other Man said:  I've told you once.
Man said:  No you haven't!
Other Man said:  Yes I have.

Expected output:

Man said:  Is this the right room for an argument?
Other Man said:  I've told you once.
Man said:  No you haven't!
Other Man said:  Yes I have.

Solution

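I can't find a duplicate of this for Python 3, which handles encodings differently from Python 2, so here's the answer. When you call open() without an encoding, Python 3 typically falls back to the locale's preferred encoding, which is platform-dependent (often cp1252 on Windows). In such a code page, the three bytes of the UTF-8 Byte Order Mark (BOM) at the start of the file decode to ï»¿. Open the file with 'utf-8-sig' instead: that codec decodes UTF-8 and strips a leading BOM if one is present.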

That is, instead of

data = open('info.txt')

Do

data = open('info.txt', encoding='utf-8-sig')
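
To see where the three characters come from, here is a minimal sketch. It assumes the file was saved as UTF-8 with a BOM and that the default encoding on the asker's machine was cp1252; both are guesses about the original setup, and the temporary file only stands in for info.txt.

```python
import codecs
import os
import tempfile

# Write a small file that starts with the UTF-8 byte order mark (EF BB BF).
# The temporary file is a stand-in for info.txt.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(codecs.BOM_UTF8 + b"Man: Is this the right room for an argument?\n")


def first_line(encoding):
    """Return the first line of the file, decoded with the given encoding."""
    with open(path, encoding=encoding) as f:
        return f.readline()


as_cp1252 = first_line("cp1252")        # each BOM byte becomes its own character
as_utf8 = first_line("utf-8")           # BOM becomes the single character U+FEFF
as_utf8_sig = first_line("utf-8-sig")   # BOM is recognised and dropped

print(repr(as_cp1252[:4]))
print(repr(as_utf8[:4]))
print(repr(as_utf8_sig[:4]))

os.remove(path)
```

Only the 'utf-8-sig' read starts directly with "Man". Plain 'utf-8' still keeps the BOM as an invisible U+FEFF character at the start of the first line, which is why 'utf-8-sig' is the safer choice here.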

Note that if you're on Python 2, you should see e.g. the questions "Python, Encoding output to UTF-8" and "Convert UTF-8 with BOM to UTF-8 with no BOM in Python". You'll need to do some shenanigans with codecs or with str.decode for this to work right in Python 2. But in Python 3, all you need to do is set the encoding= parameter when you open the file.
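
As a rough idea of what the codecs / decode route can look like, here is a sketch that reads the file as bytes and decodes it explicitly, which should behave the same on Python 2.6+ and Python 3. It is not taken from the linked questions; the function names are hypothetical, and the temporary file again stands in for info.txt.

```python
import codecs
import io
import os
import tempfile


def decode_stripping_bom(raw):
    """Decode UTF-8 bytes, dropping a leading BOM if present (codec route)."""
    return raw.decode("utf-8-sig")


def decode_stripping_bom_manually(raw):
    """Same result, but removing the BOM bytes by hand before decoding."""
    if raw.startswith(codecs.BOM_UTF8):
        raw = raw[len(codecs.BOM_UTF8):]
    return raw.decode("utf-8")


def read_text(path):
    """Read a whole file in binary mode and decode it without the BOM."""
    with io.open(path, "rb") as f:
        return decode_stripping_bom(f.read())


# Demo on a temporary stand-in for info.txt.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(codecs.BOM_UTF8 + b"Man: Is this the right room for an argument?\n")

text = read_text(demo_path)
os.remove(demo_path)
print(repr(text))
```

Reading in binary mode and decoding explicitly sidesteps the platform default encoding entirely, at the cost of loading the whole file into memory at once.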
