Python 2.7: How to convert unicode escapes in a string into actual utf-8 characters

Views: 186 Published: 2020/7/12 18:42:26 Tags: python string utf-8 converter unicode-escapes


Question

I use Python 2.7 and I'm receiving a string from a server (a plain str, not unicode!). Inside that string I find text with Unicode escape sequences, for example:

<a href = "http://www.mypage.com/\u0441andmoretext">\u00b2<\a>

How do I convert those \uxxxx sequences to UTF-8? The answers I found either dealt with &# entities or required eval(), which is too slow for my purposes. I need a universal solution for any text containing such sequences.

<\a> is a typo, but I want to tolerate such typos as well. Only \u sequences should be converted.

Written as a proper Python string literal, the example text looks like this:

"<a href = \"http://www.mypage.com/\\u0441andmoretext\">\\u00b2<\\a>"

The desired output, written as a Python string literal, is:

"<a href = \"http://www.mypage.com/\xd1\x81andmoretext\">\xc2\xb2<\\a>"
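
As a quick check of where those bytes come from, the two code points in the example can be encoded on their own. This is only an illustrative sketch; the variable names are made up for the example:

```python
# U+0441 (Cyrillic small letter es) and U+00B2 (superscript two),
# each encoded to its UTF-8 byte sequence.
cyrillic_es = u"\u0441".encode("utf-8")      # two bytes: 0xD1 0x81
superscript_two = u"\u00b2".encode("utf-8")  # two bytes: 0xC2 0xB2
```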

Recommended answer

Try:

>>> s = "<a href = \"http://www.mypage.com/\\u0441andmoretext\">\\u00b2<\\a>"
>>> s.decode("raw_unicode_escape")
u'<a href = "http://www.mypage.com/\u0441andmoretext">\xb2<\\a>'

You can then encode the result to UTF-8 as usual.
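
A minimal sketch of the full round trip, assuming the server data is plain ASCII apart from the \uXXXX escapes. The raw_unicode_escape codec reads every other byte as Latin-1, so input that already contains UTF-8 bytes would need different handling. The variable names and sample bytes are illustrative, and the b prefix keeps the snippet valid on Python 2.7 as well as Python 3:

```python
# Byte string as it might arrive from the server (sample taken from the question).
raw = b'<a href = "http://www.mypage.com/\\u0441andmoretext">\\u00b2<\\a>'

# Step 1: turn \uXXXX escapes into real characters.
# Other backslashes, such as the one in <\a>, are left untouched.
text = raw.decode("raw_unicode_escape")

# Step 2: encode the unicode text as UTF-8 bytes.
utf8_bytes = text.encode("utf-8")
```

No eval() is involved here; both steps are ordinary codec calls.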
