Escaped # in URLs, sitemap and handling by Google crawler


Problem description

We have a large set of URLs, some of which contain a hash character. The hash does not indicate a fragment; it is part of the URL path, so we escape it as %23, e.g.

http://example.com/example%231
http://example.com/another-example%232
…

Our sitemap.xml lists these URLs as follows:

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://example.com/example%231</loc>
  </url>
  <url>
    <loc>http://example.com/another-example%232</loc>
  </url>
  <!-- and so on … -->
</urlset>

Now the Google Search Console reports 404 errors for the following URLs:

http://example.com/example
http://example.com/another-example

Note that the strings after the %23 got stripped away. I would understand this behavior if the sitemap contained e.g. http://example.com/example#1, but we’re intentionally encoding the hash (http://example.com/example%231).

Is there anything I might be misunderstanding, or are there any special rules for escaping within sitemap.xml?

Recommended answer

Google doesn't want you to use fragments in that way. It does, however, still treat them as actual fragment identifiers, e.g. for direct links from a search result to individual subheadings of Wikipedia articles.

So Google probably interprets your hashes as fragment IDs and therefore strips them from your URLs, thereby getting 404s.
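
One way such stripping could arise is if a client percent-decodes the URL before splitting it into components. The following is only a sketch of that hypothesis using Python's standard library; it is not a description of how Google's crawler actually works, and the "decode first" order is an assumption made for illustration.

```python
from urllib.parse import unquote, urlsplit

url = "http://example.com/example%231"

# Parsing the escaped URL as-is keeps %23 inside the path.
correct = urlsplit(url)

# Hypothetical faulty order: decode first, then parse.
# The decoded "#" is now read as the fragment delimiter.
decoded_first = urlsplit(unquote(url))

print(correct.path, repr(correct.fragment))
print(decoded_first.path, repr(decoded_first.fragment))
```

In the second case the path is cut off at the hash, which matches the shortened URLs reported in Search Console.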

XML Sitemaps follow the usual escaping rules set out in RFC 3986. There's some history around Google's deprecated use of #! URLs for Ajax that may be useful background.
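
As a rough sketch of that escaping, a path segment can be percent-encoded first and the resulting URL entity-escaped for XML second. The helper name `sitemap_loc` and its parameters are made up for this example; only `urllib.parse.quote` and `xml.sax.saxutils.escape` are real standard-library calls.

```python
from urllib.parse import quote
from xml.sax.saxutils import escape


def sitemap_loc(base: str, segment: str) -> str:
    """Build a <loc> element for one raw path segment (hypothetical helper)."""
    # safe="" makes quote() escape every reserved character, including "#" and "/".
    encoded = quote(segment, safe="")
    # XML entity escaping is a separate, second step; it matters when the
    # URL still contains characters such as "&".
    return f"<loc>{escape(base + encoded)}</loc>"


print(sitemap_loc("http://example.com/", "example#1"))
```

This reproduces the form already used in the question's sitemap, so the encoding on the publishing side appears to be correct.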
