UTF-8 problem in Python when reading chars

Views: 66 | Published: 2020/7/13 3:44:29 | Tags: python, utf-8

Problem description

I'm using Python 2.5. What is going on here? What have I misunderstood? How can I fix it?

in.txt:

Stäckövérfløw

code.py:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
print """Content-Type: text/plain; charset="UTF-8"\n"""
f = open('in.txt','r')
for line in f:
    print line
    for i in line:
        print i,
f.close()

Output:

Stäckövérfløw

S t � � c k � � v � � r f l � � w

Recommended answer

for i in line:
    print i,

When you read the file with open(), each line comes back as a byte string (str in Python 2), not a Unicode string. The for loop therefore iterates over one byte at a time. That breaks UTF-8 encoded text, where every non-ASCII character is represented by two or more bytes: each ä, ö, é and ø in the input is split into two bytes, neither of which is valid UTF-8 on its own, so each one is displayed as a replacement character. Printing the whole line works because its bytes reach the output intact. If you want to work with Unicode objects, where characters are the basic units, open the file with codecs.open:

import codecs
f = codecs.open('in.txt', 'r', 'utf8')
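
To make the byte-versus-character distinction concrete, here is a minimal sketch. It is an illustration rather than part of the original answer: it assumes Python 3 (it should also run on Python 2.7, but not on 2.5 as written), and the temporary directory is a hypothetical stand-in for wherever in.txt actually lives.

```python
# -*- coding: utf-8 -*-
import codecs
import os
import tempfile

text = u"Stäckövérfløw"
raw = text.encode("utf-8")

# The text has 13 characters, but its UTF-8 encoding has 17 bytes,
# because each of ä, ö, é and ø takes two bytes.
print(len(text), len(raw))

# Write the bytes to a throwaway file (assumed location, for illustration only).
path = os.path.join(tempfile.mkdtemp(), "in.txt")
with open(path, "wb") as f:
    f.write(raw + b"\n")

# codecs.open decodes while reading, so iterating over a line
# yields whole characters instead of single bytes.
with codecs.open(path, "r", "utf8") as f:
    for line in f:
        chars = list(line.rstrip(u"\n"))
        print(u" ".join(chars))
```

Iterating over raw instead of the decoded line would visit 17 items, which is where the paired replacement characters in the question's output come from.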

If sys.stdout does not already have an appropriate encoding set, you may have to wrap it so that Unicode strings are encoded as UTF-8 when they are written:

import sys
sys.stdout = codecs.getwriter('utf8')(sys.stdout)
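
What the wrapper does can be seen without touching the real stdout. The sketch below is an assumption-laden illustration, not the answer's own code: it uses an io.BytesIO buffer as a hypothetical stand-in for a byte-oriented output stream and assumes Python 3 (or 2.7).

```python
# -*- coding: utf-8 -*-
import codecs
import io

# Hypothetical stand-in for a byte-oriented stdout.
buffer = io.BytesIO()

# codecs.getwriter returns a StreamWriter class; instantiating it around a
# byte stream gives an object that encodes Unicode text on every write.
writer = codecs.getwriter("utf8")(buffer)
writer.write(u"Stäckövérfløw\n")

encoded = buffer.getvalue()
print(len(encoded))  # number of bytes that reached the underlying stream
```

On Python 3, sys.stdout is already a text stream, so this kind of wrapper would go around the underlying byte stream (sys.stdout.buffer) rather than around sys.stdout itself.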
