Impossible to store json in python with single un-escaped backslash

Problem description

I am creating a json body for a REST payload body like so:

>>> import json
>>> j = json.loads('["foo", {"bar": ["to_be_replaced", 1.1, 1.0, 2]}]')
>>> text = "aaaa" + "\\" + "bbbbb" + "\\" + "cccc"
>>> j[1]["bar"][0] = text
>>> j
['foo', {'bar': ['aaaa\\bbbbb\\cccc', 1.1, 1.0, 2]}]

Annoyingly, the format expected on the other side is like so:

"aaaa\bbbbb\cccc"

A terrible idea, I know.

I have tried everything and am starting to believe it's simply impossible to store text in this format in a json object. Is there a way? Or do I need to get the developers of the webservice to choose a more sensible delimiter?

I know it's REALLY a single backslash, and if I print it I get a single backslash:

>>> print(text)
aaaa\bbbbb\cccc

But that doesn't help me get it into a json object.

Recommended answer

Yes, it is impossible -- by design.

A JSON serializer is, by nature, supposed to emit only valid JSON. From RFC 8259, emphasis mine:

7. Strings

The representation of strings is similar to conventions used in the C family of programming languages. A string begins and ends with quotation marks. All Unicode characters may be placed within the quotation marks, except for the characters that MUST be escaped: quotation mark, reverse solidus, and the control characters (U+0000 through U+001F).

Any character may be escaped. If the character is in the Basic Multilingual Plane (U+0000 through U+FFFF), then it may be represented as a six-character sequence: a reverse solidus, followed by the lowercase letter u, followed by four hexadecimal digits that encode the character's code point. The hexadecimal letters A through F can be uppercase or lowercase. So, for example, a string containing only a single reverse solidus character may be represented as "\u005C".

Alternatively, there are two-character sequence escape representations of some popular characters. So, for example, a string containing only a single reverse solidus character may be represented more compactly as "\\".
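
The two spellings the RFC mentions can be checked with Python's standard `json` module. This is only a small sketch; the variable names are made up for illustration, and raw string literals are used so that the JSON text is exactly what appears between the quotes.

```python
import json

# Two JSON spellings of a string that holds one backslash.
# Raw literals (r'...') keep the JSON text exactly as typed.
long_form = r'"\u005C"'   # six-character \uXXXX escape
short_form = r'"\\"'      # two-character escape

decoded_long = json.loads(long_form)
decoded_short = json.loads(short_form)

print(decoded_long == decoded_short)  # True: both decode to the same string
print(len(decoded_short))             # 1: a single backslash character
```

Either escape is only a spelling on the wire; after decoding, the receiver holds a one-character string.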


Note the phrase "MUST be escaped" -- "MUST" is a formally-defined term-of-art; something which does not comply with a MUST requirement from the JSON specification is not allowed to call itself JSON.

In summary: A string containing only a literal backslash in your data may be encoded in JSON as "\u005c", or "\\". It may not be encoded as "\" (including that character as an unescaped literal).
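
As a rough illustration using the data from the question, the sketch below serializes the object with `json.dumps`, decodes it again, and then tries to parse the unescaped form. The names `wire`, `received`, and `rejected` are hypothetical, and the behaviour of the real web service on the other side is not known here.

```python
import json

j = json.loads('["foo", {"bar": ["to_be_replaced", 1.1, 1.0, 2]}]')
text = "aaaa" + "\\" + "bbbbb" + "\\" + "cccc"  # single backslashes in memory
j[1]["bar"][0] = text

# The JSON text that would go over the wire: each backslash is escaped as \\
wire = json.dumps(j)
print(wire)

# A conforming receiver gets the single backslashes back after decoding.
received = json.loads(wire)[1]["bar"][0]
print(received)          # aaaa\bbbbb\cccc
print(received == text)  # True

# The unescaped spelling is not valid JSON: \b would be read as a backspace
# escape, and \c is not a defined escape, so Python's decoder refuses it.
try:
    json.loads(r'["aaaa\bbbbb\cccc"]')
    rejected = False
except json.JSONDecodeError:
    rejected = True
print(rejected)          # True
```

So the doubled backslashes exist only in the serialized text (and in Python's `repr` echo); a receiver that parses the body as JSON ends up with `aaaa\bbbbb\cccc`. If the service truly requires a bare backslash inside the quotes, it is not consuming JSON.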
