Get escaped Unicode code from string
Question
I seem to be having the opposite problem from everyone else in the development world. I need to generate escaped characters from strings. For instance, say I have the word MESSAGE: and I need to generate:
\\u004D\\u0045\\u0053\\u0053\\u0041\\u0047\\u0045\\u003A\\u0053\\u0069\\u006D
The closest thing I could get using Python was:
u'MESSAGE:'.encode('utf16')
# output = '\xff\xfeM\x00E\x00S\x00S\x00A\x00G\x00E\x00:\x00'
My first thought was that I could replace \x with \u00 (or something to that effect), but I quickly realized that wouldn't work. What can I do to output the escaped (unescaped?) string in Python (preferably)?
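To see why the \x replacement falls short, here is a small sketch in Python 3 (the question's snippet is Python 2). It rebuilds the same bytes explicitly as BOM plus little-endian UTF-16, which is an assumption made so the result does not depend on the machine's byte order. The variable names are only illustrative.

```python
import codecs

# Same bytes as in the question: a byte-order mark followed by UTF-16-LE data.
data = codecs.BOM_UTF16_LE + 'MESSAGE:'.encode('utf-16-le')

# repr() shows printable bytes such as 0x4D as "M", so only the BOM and the
# zero bytes ever appear in \x.. form.
shown = repr(data)
print(shown)

# The naive replacement therefore rewrites the BOM and the zero bytes,
# and leaves the actual letters untouched.
naive = shown.replace('\\x', '\\u00')
print(naive)
```

The letters stay as plain M, E, S and so on, and the two bytes of each code unit are never joined into one four-digit value, so the replacement cannot produce the wanted escapes.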
Before everyone starts "answering" and downvoting: the escaped \u00... string is what my app is getting from another 3rd party app which I have no control over. I'm trying to generate my own test data so I don't have to rely on that 3rd party app.
Recommended answer
I think this (quick & dirty) code does what you want (Python 2):
''.join('\\u' + x.encode('utf_16_be').encode('hex') for x in u'MESSAGE:')
# output: '\\u004d\\u0045\\u0053\\u0053\\u0041\\u0047\\u0045\\u003a'
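The one-liner above relies on the Python 2 'hex' codec, which is not available through str.encode in Python 3. A rough Python 3 equivalent might look like the sketch below. The function name escape_utf16 and its upper parameter are made up for illustration, and encoding the whole string as big-endian UTF-16 is my choice here so that characters outside the Basic Multilingual Plane come out as surrogate pairs.

```python
def escape_utf16(text, upper=False):
    """Return text with every UTF-16 code unit written as a \\uXXXX escape."""
    data = text.encode('utf-16-be')  # big-endian, no byte-order mark
    fmt = '\\u{:04X}' if upper else '\\u{:04x}'
    # Walk the bytes two at a time: each pair is one 16-bit code unit.
    return ''.join(
        fmt.format(int.from_bytes(data[i:i + 2], 'big'))
        for i in range(0, len(data), 2)
    )


print(escape_utf16('MESSAGE:'))
# \u004d\u0045\u0053\u0053\u0041\u0047\u0045\u003a
```

Only the hex digits change case when upper=True. Calling .upper() on the finished string would also turn \u into \U, which is a different escape.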
Or if you want more '\':
''.join('\\\\u' + x.encode('utf_16_be').encode('hex') for x in u'MESSAGE:')
# output: '\\\\u004d\\\\u0045\\\\u0053\\\\u0053\\\\u0041\\\\u0047\\\\u0045\\\\u003a'
print _
# output: \\u004d\\u0045\\u0053\\u0053\\u0041\\u0047\\u0045\\u003a
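The doubled backslashes in the quoted outputs above come from how the interpreter displays a string, not from extra characters in it. A short Python 3 sketch (variable names are only illustrative) that makes the difference visible:

```python
single = '\\u004d'    # one real backslash, then u004d
double = '\\\\u004d'  # two real backslashes, then u004d

print(len(single), single)  # 6 \u004d
print(len(double), double)  # 7 \\u004d

# repr() re-escapes every backslash, which is what the interactive prompt shows.
print(repr(single))         # '\\u004d'
```

So the first variant is probably the one to use if the third-party app sends a single backslash per escape, and the second if it sends them already doubled.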
If you absolutely need upper-case for hexadecimal codes:
''.join('\\u' + x.encode('utf_16_be').encode('hex').upper() for x in u'MESSAGE:')
# output: '\\u004D\\u0045\\u0053\\u0053\\u0041\\u0047\\u0045\\u003A'
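Since the goal is test data, it may be worth checking that a generated string decodes back to the original text. One way to do that in Python 3, sketched here under the assumption that the data is pure ASCII and contains only \uXXXX escapes for BMP characters, is the built-in unicode_escape codec:

```python
escaped = '\\u004D\\u0045\\u0053\\u0053\\u0041\\u0047\\u0045\\u003A'

# unicode_escape decodes bytes, so go through ASCII first.
decoded = escaped.encode('ascii').decode('unicode_escape')
print(decoded)  # MESSAGE:
```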