Unescape HTML entities containing newline in JavaScript?

Question

If you have a string containing HTML entities and want to unescape it, this solution (or variants thereof) is suggested multiple times:

function htmlDecode(input){
  var e = document.createElement('div');
  e.innerHTML = input;
  return e.childNodes.length === 0 ? "" : e.childNodes[0].nodeValue;
}

htmlDecode("&lt;img src='myimage.jpg'&gt;");
// returns "<img src='myimage.jpg'>"

(See, for example, this answer: https://stackoverflow.com/a/1912522/1199564)

This works fine unless the string contains a newline and the code is running on Internet Explorer before version 10 (tested on versions 9 and 8).

If the string contains a newline, IE 8 and 9 will replace it with a space character instead of leaving it unchanged (as it is on Chrome, Safari, Firefox and IE 10).

htmlDecode("Hello\nWorld"); 
// returns "Hello World" on IE 8 and 9

Any suggestions for a solution that works with IE before version 10?

Recommended answer

The simplest, though probably not the most efficient, solution is to have htmlDecode() act only on character and entity references:

var s = "foo\n&amp;\nbar";
s = s.replace(/(&[^;]+;)+/g, htmlDecode);
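
To see why this keeps the newlines intact, it helps to look at what the regular expression actually hands to the callback. The sketch below runs without a DOM; `fakeDecode` is a hypothetical stand-in for htmlDecode() that only marks the matched text, so the example shows which parts of the string would pass through innerHTML and which would not.

```javascript
// Same pattern as above: one or more consecutive "&...;" references.
var entityRuns = /(&[^;]+;)+/g;

var s = "foo\n&amp;\nbar";

// Only the entity reference is matched; the newlines are outside every match.
var matches = s.match(entityRuns); // ["&amp;"]

// Adjacent references are matched as a single run.
var runMatches = "a&lt;&gt;b".match(entityRuns); // ["&lt;&gt;"]

// Hypothetical stand-in for htmlDecode(): wraps whatever it receives in brackets.
// String.prototype.replace passes the matched text as the first argument.
function fakeDecode(match) {
  return "[" + match + "]";
}

var marked = s.replace(entityRuns, fakeDecode); // "foo\n[&amp;]\nbar"
```

Because the text around the matches is copied through by replace() itself, it never reaches the DOM, and so IE 8 and 9 have no chance to turn its newlines into spaces.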

A more efficient option is an optimized rewrite of htmlDecode() that is called only once per input, acts only on character and entity references, and reuses the DOM element object:

function htmlDecode (input)
{
  var e = document.createElement("span");

  var result = input.replace(/(&[^;]+;)+/g, function (match) {
    e.innerHTML = match;
    return e.firstChild.nodeValue;
  });

  return result;
}

/* returns "foo\n&\nbar" */
htmlDecode("foo\n&amp;\nbar");

Wladimir Palant has pointed out an XSS issue with this function: the value of some (HTML5) event listener attributes, such as onerror, is executed if you assign HTML containing elements with those attributes to the innerHTML property. So you should not use this function on arbitrary input containing actual HTML, only on HTML that is already escaped. Otherwise, adapt the regular expression accordingly, for example use /(&[^;<>]+;)+/ instead, to prevent &…; from being matched where … contains tags.
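
The difference between the two patterns can be checked without touching the DOM. In this sketch the variable names and the sample strings are made up for illustration; it only compares what each pattern would pass on to innerHTML.

```javascript
var loose = /(&[^;]+;)+/g;    // pattern used in htmlDecode() above
var strict = /(&[^;<>]+;)+/g; // adapted pattern that refuses "<" and ">"

// Illustrative hostile input: a tag hidden between "&" and ";".
var hostile = "x &<img src=x onerror=alert(1)>; y";

// The loose pattern matches the whole span, so the tag would be assigned to innerHTML.
var looseHits = hostile.match(loose); // ["&<img src=x onerror=alert(1)>;"]

// The strict pattern finds nothing, so the string would be returned unchanged.
var strictHits = hostile.match(strict); // null

// Ordinary references are still matched by the strict pattern.
var benignHits = "a &lt; b".match(strict); // ["&lt;"]
```

This only shows that the stricter pattern keeps tags out of the matched spans; it is not a substitute for treating untrusted HTML with care.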

For arbitrary HTML, please see Wladimir Palant's alternative approach, but note that it is not as compatible as this one.
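
If the set of entities in the input is known and small, another option is to avoid innerHTML altogether. The following is a minimal sketch of that idea, not the approach referred to above: `decodeEntitiesNoDom`, `codePointToString` and the `NAMED_ENTITIES` table are hypothetical names, the table covers only a handful of named entities, and the function does not reproduce every browser quirk (for example, how browsers remap some numeric references). It is written in ES5 style so that it does not depend on features missing from old IE versions.

```javascript
// Hypothetical, deliberately small table; extend it for the entities you expect.
var NAMED_ENTITIES = { amp: "&", lt: "<", gt: ">", quot: "\"", apos: "'", nbsp: "\u00A0" };

// Builds a string from a code point, using a surrogate pair above U+FFFF.
function codePointToString(cp) {
  if (cp <= 0xFFFF) {
    return String.fromCharCode(cp);
  }
  cp -= 0x10000;
  return String.fromCharCode(0xD800 + (cp >> 10), 0xDC00 + (cp & 0x3FF));
}

// Decodes numeric references and the named entities listed in the table.
// Everything else, including newlines and unknown entities, is left unchanged.
function decodeEntitiesNoDom(input) {
  var pattern = /&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);/g;
  return input.replace(pattern, function (match, body) {
    if (body.charAt(0) === "#") {
      var isHex = body.charAt(1) === "x" || body.charAt(1) === "X";
      var cp = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave zero and out-of-range references untouched.
      if (!(cp > 0 && cp <= 0x10FFFF)) {
        return match;
      }
      return codePointToString(cp);
    }
    return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, body)
      ? NAMED_ENTITIES[body]
      : match;
  });
}

var decoded = decodeEntitiesNoDom("foo\n&amp;\nbar"); // "foo\n&\nbar"
```

The trade-off is coverage: the DOM-based htmlDecode() understands every entity the browser knows, while this sketch understands only what is in its table plus numeric references.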
