How should I escape strings in JSON?

Question

When creating JSON data manually, how should I escape string fields? Should I use something like Apache Commons Lang's StringEscapeUtils.escapeHtml or StringEscapeUtils.escapeXml, or should I use java.net.URLEncoder?

The problem is that StringEscapeUtils.escapeHtml doesn't escape quotes, and when I wrap the whole string in a pair of ' characters, the generated JSON is malformed.

Recommended answer

Ideally, find a JSON library in your language that you can feed some appropriate data structure to, and let it worry about how to escape things. It'll keep you much saner. If for whatever reason you don't have a library in your language, you don't want to use one (I wouldn't suggest this¹), or you're writing a JSON library, read on.

Escape it according to the RFC. JSON is pretty liberal: the only characters you must escape are \, ", and control codes (anything less than U+0020).

This structure of escaping is specific to JSON, so you'll need a JSON-specific function. All of the escapes can be written as \uXXXX, where XXXX is the UTF-16 code unit¹ for that character. There are a few shortcuts, such as \\, which work as well. (And they result in smaller and clearer output.)
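As a rough illustration of these rules, here is a minimal sketch of a JSON-specific escaping function in Java. The class name JsonStrings and the method name quote are made up for this example and do not come from any library. The sketch uses the short escapes where the RFC defines them and falls back to the generic \uXXXX form for the remaining control codes.

```java
// Hypothetical helper for illustration only; not part of any library.
final class JsonStrings {
    private JsonStrings() {}

    /** Returns s as a JSON string literal, including the surrounding double quotes. */
    static String quote(String s) {
        StringBuilder out = new StringBuilder(s.length() + 2);
        out.append('"');
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                // The two printable characters that must always be escaped.
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                // Short escapes defined for common control codes.
                case '\b': out.append("\\b"); break;
                case '\f': out.append("\\f"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Any other control code: generic four-hex-digit escape.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        // Everything else may appear in the output unescaped.
                        out.append(c);
                    }
            }
        }
        out.append('"');
        return out.toString();
    }
}
```

Under these assumptions, calling JsonStrings.quote on the text He said "hi" produces a literal in which each inner quote is preceded by a backslash. A real JSON library remains the safer choice for anything beyond a quick experiment.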

For full details, see the RFC.

¹JSON's escaping is built on JS, so it uses \uXXXX, where XXXX is a UTF-16 code unit. For code points outside the BMP, this means encoding surrogate pairs, which can get a bit hairy. (Or, you can just output the character directly, since JSON's encoded form is Unicode text, and allows these particular characters.)
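To make the surrogate-pair point concrete, here is a small Java sketch that writes a single code point as one or two \uXXXX escapes. The class JsonCodePoints and its method escapeCodePoint are hypothetical names chosen for this example. Character.toChars is the standard library call that splits a code point into its UTF-16 code units.

```java
// Hypothetical helper for illustration only; not part of any library.
final class JsonCodePoints {
    private JsonCodePoints() {}

    /** Writes one Unicode code point as one or two four-hex-digit JSON escapes. */
    static String escapeCodePoint(int codePoint) {
        StringBuilder out = new StringBuilder();
        // Character.toChars yields one UTF-16 code unit for BMP code points
        // and a high/low surrogate pair for everything above U+FFFF.
        for (char unit : Character.toChars(codePoint)) {
            out.append(String.format("\\u%04x", (int) unit));
        }
        return out.toString();
    }
}
```

For U+1F600, which lies outside the BMP, this yields the two escapes \ud83d\ude00. Since Java strings are already sequences of UTF-16 code units, a function that iterates over a String char by char gets the surrogate halves for free.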
