How to convert escape characters in HTML tags?


Problem description

How can we directly convert "\u003chtml\u003e" to "<html>"? Converting "<html>" to "\u003chtml\u003e" is quite easy using json.Marshal(), but going back with json.Unmarshal() is quite lengthy and cumbersome. Is there a direct way to do that in Go?

Recommended answer

You can use strconv.Unquote() to do the conversion.

One thing you should be aware of is that strconv.Unquote() can only unquote strings that are in quotes (e.g. that start and end with a double quote char " or a backtick char `), so we have to add the surrounding quotes manually.

Example:

// Important to use backtick ` (raw string literal)
// else the compiler will unquote it (interpreted string literal)!

s := `\u003chtml\u003e`
fmt.Println(s)
s2, err := strconv.Unquote(`"` + s + `"`)
if err != nil {
    panic(err)
}
fmt.Println(s2)

Output (try it on the Go Playground):

\u003chtml\u003e
<html>
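
The snippet above can be wrapped into a complete program. The following is a minimal sketch, not the answerer's own code: the helper names unquoteEscapes and unquoteViaJSON are hypothetical, and the JSON variant is included only for comparison with the json.Unmarshal() route mentioned in the question. It also shows that the quote-wrapping trick returns an error when the input itself contains an unescaped double quote.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// unquoteEscapes is a hypothetical helper: it wraps s in double quotes and
// lets strconv.Unquote decode Go-style escapes such as \u003c.
// It returns an error if s contains an unescaped double quote or a newline.
func unquoteEscapes(s string) (string, error) {
	return strconv.Unquote(`"` + s + `"`)
}

// unquoteViaJSON is a hypothetical helper that does the same job by treating
// the wrapped text as a JSON string literal.
func unquoteViaJSON(s string) (string, error) {
	var out string
	err := json.Unmarshal([]byte(`"`+s+`"`), &out)
	return out, err
}

func main() {
	in := `\u003chtml\u003e` // raw string literal: backslashes are kept as-is

	a, err := unquoteEscapes(in)
	if err != nil {
		panic(err)
	}
	fmt.Println(a) // <html>

	b, err := unquoteViaJSON(in)
	if err != nil {
		panic(err)
	}
	fmt.Println(b) // <html>

	// An unescaped double quote makes the wrapped text an invalid literal.
	_, err = unquoteEscapes(`say "hi"`)
	fmt.Println(err != nil) // true
}
```

If the input may contain literal double quotes, they would need to be escaped before wrapping, or the input should already arrive as a properly quoted literal.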


Note: To do HTML text escaping and unescaping, you can use the html package. Quoting its doc:

Package html provides functions for escaping and unescaping HTML text.

But the html package (specifically html.UnescapeString()) does not decode Unicode escape sequences of the form \uxxxx, only HTML character references such as &#decimal; or &#xHH; (and named entities like &lt;).

Example:

fmt.Println(html.UnescapeString(`\u003chtml\u003e`)) // wrong: \u escapes are left as-is
fmt.Println(html.UnescapeString(`&#60;html&#62;`))   // good
fmt.Println(html.UnescapeString(`&#x3c;html&#x3e;`)) // good

Output (try it on the Go Playground):

\u003chtml\u003e
<html>
<html>


Note #2:

You should also note that if you write code like this:

s := "\u003chtml\u003e"

This string will be unquoted by the compiler itself because it is an interpreted string literal, so you can't really test the conversion with it. To put the escaped text into the source verbatim, you may use backticks to specify a raw string literal, or you may use an interpreted string literal with the backslashes doubled (\\u):

s := "\u003chtml\u003e" // Interpreted string literal (unquoted by the compiler!)
fmt.Println(s)

s2 := `\u003chtml\u003e` // Raw string literal (no unquoting will take place)
fmt.Println(s2)

s3 := "\\u003chtml\\u003e" // Interpreted string literal with escaped backslashes
                           // (the compiler turns each \\ into a single \)
fmt.Println(s3)

Output:

<html>
\u003chtml\u003e
\u003chtml\u003e
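
One way to see the difference between the three literal forms is to compare their byte lengths. This is a small illustrative sketch; the constant names are chosen here and do not come from the original answer.

```go
package main

import "fmt"

const (
	interpreted = "\u003chtml\u003e"   // compiler decodes \u003c and \u003e
	raw         = `\u003chtml\u003e`   // backslashes are kept verbatim
	escaped     = "\\u003chtml\\u003e" // each \\ becomes one backslash
)

func main() {
	fmt.Println(len(interpreted)) // 6: the text is already <html>
	fmt.Println(len(raw))         // 16: two 6-byte escapes plus "html"
	fmt.Println(raw == escaped)   // true: both hold the same bytes
}
```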
