Hyperlink href incorrectly quoted in innerHTML?

Question

Take this very simple example HTML:

<html>
    <body>This is okay &amp; fine, but the encoding of <a href="http://example.com?a=1&b=2">this link</a> seems wrong.</body>
</html>

On examining document.body.innerHTML (e.g. in the browser's JS console, in JS itself, etc.), this is the value I see:

This is okay &amp; fine, but the encoding of <a href="http://example.com?a=1&amp;b=2">this link</a> seems wrong.

This behaviour is the same across browsers but I can't understand it, it seems wrong.

Specifically, the link in the original document is to http://example.com?a=1&b=2, whereas if the value of innerHTML is treated as HTML then it links to http://example.com?a=1&amp;b=2, which is NOT the same (e.g. if I created a new document which actually had innerHTML as its inner HTML, and I clicked on the link, then the browser would be sent to a materially different URL as far as I can see).

(EDIT #3: I'm wrong about the above. Firstly, yes, those two URLs are different; but secondly, the innerHTML which I thought was wrong is right, and it correctly represents the first URL, not the second! See the end of my own answer below.)
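
To make the first point of that edit concrete, here is a minimal sketch (not part of the original question) that parses both strings as literal URLs with the standard URL API. The variable names are only illustrative. It shows that a URL containing the literal characters "&amp;" really is a different URL, because its second query parameter is named "amp;b" rather than "b".

```javascript
// Parse both strings as literal URLs (no HTML decoding involved here).
const intended = new URL("http://example.com?a=1&b=2");
const literalAmp = new URL("http://example.com?a=1&amp;b=2");

// Collect the query parameters of each URL as [name, value] pairs.
const intendedParams = [...intended.searchParams.entries()];
const literalAmpParams = [...literalAmp.searchParams.entries()];

console.log(intendedParams);   // pairs: a=1 and b=2
console.log(literalAmpParams); // pairs: a=1 and "amp;b"=2
```

This only compares the two strings as URLs. It says nothing yet about how an HTML parser reads an href attribute, which is the part the edit above corrects.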

This is different from the issue discussed in question innerHTML gives me & as &amp; !. In my case (which is the opposite to the case in that question) the original HTML is correct and it looks to me as if it is the innerHTML which is wrong (i.e. because it is HTML which does not represent what the original HTML represented).

(EDIT #2: I was wrong about this, too: it's not really different. But I think it is not widely known that &amp; is the correct way to represent & inside an href, not just within body text. Once you realise that, then you can see that these are the same issue really.)

Can anyone explain?

(EDIT #1+4: This only occurred to me a bit late, after writing my original question, but: "is &amp; actually correct within the href text, and & technically incorrect?" As I said when I first wrote those words, that "seems very unlikely! I've certainly never seen HTML written that way." But however 'unlikely', or not, that is the case, and is the main part of what I wasn't understanding!)

Also related and would be useful, can anyone explain how to cleanly get HTML which does correctly represent the target of document links? You definitely can't just un-encode all HTML character references within innerHTML, because (as shown in the example I've used, and also as discussed in innerHTML gives me & as &amp; !) the ones in the main run of text should be encoded, and just un-encoding everything would make these wrong.

I originally thought this was not a duplicate of innerHTML gives me & as &amp; ! (as discussed above; and in a way it still isn't, if it's agreed that it's not as obvious or widely known that the same issues apply inside href as in body text). It's still definitely not a duplicate of A href in innerHTML (which somewhat unclearly asks about how to set innerHTML using JS).

Recommended answer

Most browser tools don't show the actual HTML because it wouldn't be of much help:

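  • HTML is often generated dynamically after page load, with the help of CSS and JavaScript.
  • HTML is often broken, and the browser needs to repair it in order to build the in-memory representation needed for rendering and other tasks.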

So the HTML you see is not the actual source; it is generated on the fly from the current state of the document, which of course includes all the fixes applied (in your case, to the invalid HTML entities).
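
As a rough illustration of "generated on the fly", here is a simplified sketch of how markup could be rebuilt from in-memory values. The helpers escapeAttribute, escapeText and serializeLink are hypothetical names invented for this example, and the escaping rules are deliberately reduced to the few characters that matter here. A real browser's serializer handles more cases.

```javascript
// Hypothetical helper: escape a decoded attribute value for use inside double quotes.
function escapeAttribute(value) {
  return value.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
}

// Hypothetical helper: escape decoded text content.
function escapeText(value) {
  return value.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Rebuild markup for a link from its in-memory (already decoded) state.
function serializeLink(link) {
  return `<a href="${escapeAttribute(link.href)}">${escapeText(link.text)}</a>`;
}

// In memory the href holds a plain "&", whatever the source looked like.
const link = { href: "http://example.com?a=1&b=2", text: "this link" };
const markup = serializeLink(link);
console.log(markup);
// <a href="http://example.com?a=1&amp;b=2">this link</a>
```

The point of the sketch is only that the "&amp;" appears at serialization time. The value stored for the link never contains it.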

The following example hopefully illustrates all the combinations:

// Build two paragraphs dynamically: one with a raw "&" in the href, one with "&amp;".
const section = document.querySelector("section");
const invalid = document.createElement("p");
invalid.innerHTML = '<a href="http://example.com/?a=1&b=2">Invalid HTML (dynamic)</a>';
const valid = document.createElement("p");
valid.innerHTML = '<a href="http://example.com/?a=1&amp;b=2">Valid HTML (dynamic)</a>';
section.appendChild(valid);
section.appendChild(invalid);

// innerHTML is serialized from the DOM, so every link is printed with "&amp;".
const paragraphs = document.querySelectorAll("p");
for (const p of paragraphs) {
  console.log(p.innerHTML);
}

// getAttribute() returns the decoded value, so every link is printed with a plain "&".
const links = document.querySelectorAll("a");
for (const a of links) {
  console.log(a.getAttribute("href"));
}

<section>
  <p><a href="http://example.com/?a=1&b=2">Invalid HTML (static)</a></p>
  <p><a href="http://example.com/?a=1&amp;b=2">Valid HTML (static)</a></p>
</section>

Is &amp; actually correct within the href text, and & technically incorrect? It seems very unlikely! I've certainly never seen HTML written that way.

There's no such thing as "technically correct", let alone today when HTML is pretty well standardised. (Well, yes, there're two competing standards bodies and specs are continuously evolving, but the basics were set up long ago.)

The & symbol starts a character entity and &b is an invalid character entity. Period.

But it works! Doesn't that mean it's technically correct?

It works because browsers are explicitly designed to deal with completely broken markup, what's known as tag soup, because it was thought that it would ease usage:

<p><strong>Hello, World!</u>
<body><br itspartytime="yeah">
  <pink>It works!!!</red>

But HTML entities are just an encoding artefact. That doesn't mean that URLs are not allowed to contain literal ampersands, it just means that —when in HTML context— they need to be represented as &amp;. It's the same as when you type a backslash in a JavaScript string to escape some quotes: the backslash does not become part of your data.
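
The analogy can be sketched in a few lines. The first part uses plain JavaScript string escaping. The second part uses decodeAttributeValue, a hypothetical helper written for this example that undoes only two character references, whereas a real HTML parser handles many more.

```javascript
// JavaScript analogy: the backslash is escape syntax in the source, not data.
const quoted = 'It\'s';
console.log(quoted);        // It's
console.log(quoted.length); // 4, the backslash is not counted

// Hypothetical helper: undo the two escapes relevant here.
// "&quot;" is handled before "&amp;" so that "&amp;quot;" decodes to "&quot;".
function decodeAttributeValue(raw) {
  return raw.replace(/&quot;/g, '"').replace(/&amp;/g, "&");
}

// What is written in the HTML source versus the value the link actually carries.
const asWritten = "http://example.com/?a=1&amp;b=2";
const target = decodeAttributeValue(asWritten);
console.log(target); // http://example.com/?a=1&b=2
```

In both cases the escape character exists only in the written form. The decoded value, which is what the browser navigates to, contains a bare "&".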
