Can I use the “Host” directive in robots.txt?

Views: 63 | Published: 2021/7/10 19:18:42 | Tags: seo, robots.txt


Problem description

While searching for specific information about robots.txt, I stumbled upon a Yandex help page^‡ on this topic. It suggests that I can use the Host directive to tell crawlers my preferred mirror domain:

User-Agent: *
Disallow: /dir/
Host: www.example.com
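
To see how a parser that does not know Host treats the file above, here is a minimal Python sketch. It assumes only the standard library's urllib.robotparser, which skips lines whose key it does not recognize, and it says nothing about how Google, Yandex, or any other crawler behaves. The helper extract_host and the sample URLs are hypothetical, introduced purely for illustration.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content from the question, inlined so the sketch is self-contained.
ROBOTS_TXT = """\
User-Agent: *
Disallow: /dir/
Host: www.example.com
"""


def extract_host(robots_txt):
    """Hypothetical helper: return the value of the first Host line, or None."""
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        key, sep, value = line.partition(":")     # split on the first colon only
        if sep and key.strip().lower() == "host":
            return value.strip()
    return None


# The standard-library parser ignores the unrecognized Host line
# and still applies the Disallow rule.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

preferred_host = extract_host(ROBOTS_TXT)
dir_allowed = parser.can_fetch("*", "http://www.example.com/dir/page.html")
root_allowed = parser.can_fetch("*", "http://www.example.com/index.html")

print(preferred_host, dir_allowed, root_allowed)
```

With this particular parser the extra line is harmless: the Disallow rule is still enforced, and the Host value is only visible to code that looks for it explicitly.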

Also, the Wikipedia article states that Google understands the Host directive too, but it gives little (in fact, no) further information.

At robotstxt.org, I didn’t find anything on Host (or on Crawl-delay, which Wikipedia also mentions).

Is using the Host directive encouraged?
Does Google provide any resources on this robots.txt specific?
How compatible is it with other crawlers?

^‡ At least since early 2021, the linked page no longer covers the directive in question.
