Python file input string: how to handle escaped unicode characters?

Views: 196 | Published: 2020/10/19 19:56:50 | Tags: python, unicode, utf-8, decode

In a text file (test.txt), my string looks like this:

Gro\u00DFbritannien

When I read it, Python shows the backslash as escaped:

>>> file = open('test.txt', 'r')
>>> input = file.readline()
>>> input
'Gro\\u00DFbritannien'

How can I have this interpreted as Unicode? decode() and unicode() won't do the job.

The following code writes Gro\u00DFbritannien back to the file, but I want it to be Großbritannien:

>>> import codecs
>>> input.decode('latin-1')
u'Gro\\u00DFbritannien'
>>> out = codecs.open('out.txt', 'w', 'utf-8')
>>> out.write(input)

You want to use the unicode_escape codec (shown here in Python 2):

>>> x = 'Gro\\u00DFbritannien'
>>> y = unicode(x, 'unicode_escape')
>>> print y
Großbritannien
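
The snippet above is Python 2 (`unicode()` and the `print` statement). As a rough sketch of how the same idea might look in Python 3, the string can be encoded back to bytes and then decoded with the same codec. The helper names `decode_escapes` and `convert_file` are made up for illustration, and the sketch assumes the input is pure ASCII apart from the `\uXXXX` escapes, because `unicode_escape` treats any other byte as Latin-1:

```python
import os
import tempfile


def decode_escapes(text):
    # Hypothetical helper: turn literal \uXXXX sequences into real characters.
    # Assumption: text is ASCII-only; encode() raises UnicodeEncodeError otherwise.
    return text.encode("ascii").decode("unicode_escape")


def convert_file(src_path, dst_path):
    # Read the first line as in the question, decode it, and write it as UTF-8.
    with open(src_path, "r", encoding="ascii") as src:
        line = src.readline()
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(decode_escapes(line))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "test.txt")
        dst = os.path.join(tmp, "out.txt")
        # The raw string keeps the backslash literal, like the file in the question.
        with open(src, "w", encoding="ascii") as f:
            f.write(r"Gro\u00DFbritannien")
        convert_file(src, dst)
        with open(dst, "r", encoding="utf-8") as f:
            print(f.read())
```

If the file can also contain real non-ASCII characters next to the escapes, this approach is not sufficient as written and would need a different strategy.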

See the docs for the vast number of standard encodings that come as part of the Python standard library.
