How to convert escape characters in HTML tags?


Question

How can we directly convert "\u003chtml\u003e" to "<html>"? Converting "<html>" to "\u003chtml\u003e" is easy with json.Marshal(), but going back with json.Unmarshal() is lengthy and cumbersome. Is there a direct way to do that in Go?

Solution

You can use strconv.Unquote() to do the conversion.

One thing you should be aware of is that strconv.Unquote() can only unquote strings that are themselves quoted (that is, they start and end with a double quote char " or a back quote char `), so we have to add the surrounding quotes manually.

Example:

// Important to use backtick ` (raw string literal)
// else the compiler will unquote it (interpreted string literal)!

s := `\u003chtml\u003e`
fmt.Println(s)
s2, err := strconv.Unquote(`"` + s + `"`)
if err != nil {
    panic(err)
}
fmt.Println(s2)

Output:

\u003chtml\u003e
<html>
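
The approach above can be wrapped in a small function. The following is only a sketch: unescapeUnicode is a hypothetical helper name, not something from the standard library. It also illustrates that the output of json.Marshal() for a simple string such as "<html>" already carries the surrounding double quotes, so strconv.Unquote() can be applied to it directly. That shortcut is an assumption that holds for this input; JSON and Go string syntax are not identical in general.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// unescapeUnicode is a hypothetical helper (not part of the standard library).
// It wraps s in double quotes and lets strconv.Unquote decode Go-style
// escapes such as \u003c. It returns an error when s contains an unescaped
// double quote or a raw newline, because the wrapped text is then not a
// valid Go string literal.
func unescapeUnicode(s string) (string, error) {
	return strconv.Unquote(`"` + s + `"`)
}

func main() {
	out, err := unescapeUnicode(`\u003chtml\u003e`)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // <html>

	// json.Marshal escapes < and > by default and adds the quotes itself.
	b, err := json.Marshal("<html>")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b)) // "\u003chtml\u003e"

	// The marshaled bytes are already quoted, so no wrapping is needed here.
	back, err := strconv.Unquote(string(b))
	if err != nil {
		panic(err)
	}
	fmt.Println(back) // <html>

	// An unescaped double quote inside the input makes Unquote fail.
	_, err = unescapeUnicode(`say "hi"`)
	fmt.Println(err != nil) // true
}
```

If the input may contain unescaped double quotes, this wrapper is not enough on its own and the input would need to be escaped or validated first.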


Note: To do HTML text escaping and unescaping, you can use the html package. Quoting its doc:

Package html provides functions for escaping and unescaping HTML text.

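But the html package (specifically html.UnescapeString()) does not decode Unicode escape sequences of the form \uxxxx. It decodes only HTML entities: named ones such as &lt;, and numeric character references of the form &#decimal; or &#xHH;.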

Example:

fmt.Println(html.UnescapeString(`\u003chtml\u003e`)) // wrong: \u escapes are left unchanged
fmt.Println(html.UnescapeString(`&#60;html&#62;`))   // good: decimal references are decoded
fmt.Println(html.UnescapeString(`&#x3c;html&#x3e;`)) // good: hex references are decoded

Output:

\u003chtml\u003e
<html>
<html>
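
When the input might mix both kinds of escaping, the two steps can be chained. This is a minimal sketch under that assumption; unescapeAll is a hypothetical name introduced here for illustration, and the order (Go-style escapes first, HTML entities second) is a design choice rather than something the answer prescribes.

```go
package main

import (
	"fmt"
	"html"
	"strconv"
)

// unescapeAll is a hypothetical helper, not a standard library function.
// It first decodes Go-style escapes (\uXXXX) with strconv.Unquote and then
// decodes HTML entities with html.UnescapeString.
func unescapeAll(s string) (string, error) {
	unquoted, err := strconv.Unquote(`"` + s + `"`)
	if err != nil {
		return "", err
	}
	return html.UnescapeString(unquoted), nil
}

func main() {
	// html.EscapeString is the counterpart of html.UnescapeString.
	fmt.Println(html.EscapeString("<html>")) // &lt;html&gt;

	out, err := unescapeAll(`\u003chtml\u003e &amp; &#60;b&#62;`)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // <html> & <b>
}
```

Decoding in this order means that an entity produced by the first step (for example a \u0026 that becomes & in front of "lt;") would also be decoded by the second step, which may or may not be what you want.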


Note #2:

You should also note that if you write code like this:

s := "\u003chtml\u003e"

This string will be unquoted by the compiler itself because it is an interpreted string literal, so you can't really test the conversion with it. To put the escaped form itself into the source, either use backticks to write a raw string literal, or use a double-quoted interpreted string literal with each backslash doubled:

s := "\u003chtml\u003e" // Interpreted string literal (unquoted by the compiler!)
fmt.Println(s)

s2 := `\u003chtml\u003e` // Raw string literal (no unquoting will take place)
fmt.Println(s2)

s3 := "\\u003chtml\\u003e" // Interpreted string literal with doubled backslashes
                           // (the compiler turns each \\ into a single \)
fmt.Println(s3)

Output:

<html>
\u003chtml\u003e
\u003chtml\u003e
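
One way to check what a literal really contains after compilation is to print its length and its %q form. The sketch below uses constant names (interpreted, raw, doubled) that are chosen here for illustration only.

```go
package main

import "fmt"

const (
	interpreted = "\u003chtml\u003e"   // the compiler decodes the escapes
	raw         = `\u003chtml\u003e`   // backslashes are kept literally
	doubled     = "\\u003chtml\\u003e" // each \\ becomes a single backslash
)

func main() {
	// %q prints the value as a Go string literal, so remaining
	// backslashes show up doubled.
	fmt.Printf("%d %q\n", len(interpreted), interpreted) // 6 "<html>"
	fmt.Printf("%d %q\n", len(raw), raw)                 // 16 "\\u003chtml\\u003e"
	fmt.Println(raw == doubled)                          // true
}
```

The raw and doubled forms hold the same 16 bytes, so either one is suitable as test input for strconv.Unquote().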
