robots.txt URL format

Views: 53 | Published: 2021/7/10 19:18:30 | Tag: robots.txt


Question

Globbing and regular expressions are not supported in either the User-agent or Disallow lines.

However, I noticed that the Stack Overflow robots.txt includes characters like * and ? in its URLs. Are these supported or not?

Also, does it make any difference whether a URL includes a trailing slash, or are these two equivalent?

Disallow: /privacy
Disallow: /privacy/

Recommended answer

On your second question: the two are not equivalent. Disallow rules are matched as path prefixes, so /privacy blocks anything that starts with /privacy, including something like /privacy_xyzzy. /privacy/, on the other hand, would not block that.
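The prefix rule described above can be sketched in a few lines of Python. This is only an illustration of original-style matching, not any particular crawler's implementation; the function name is_disallowed and the sample paths are made up for this example.

```python
def is_disallowed(path: str, rule: str) -> bool:
    """Original-style robots.txt matching: a non-empty Disallow rule
    blocks every path that starts with the rule text."""
    return rule != "" and path.startswith(rule)


if __name__ == "__main__":
    # Hypothetical sample paths, chosen to show where the two rules differ.
    paths = ["/privacy", "/privacy/", "/privacy/terms", "/privacy_xyzzy"]
    for rule in ("/privacy", "/privacy/"):
        for path in paths:
            print(f"Disallow: {rule:<10} {path:<15} blocked={is_disallowed(path, rule)}")
```

Note that under this sketch the rule /privacy/ also leaves the bare path /privacy (no trailing slash) unblocked, because that path does not start with /privacy/.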

The original robots.txt specification did not support globbing or wildcards. However, many robots do. Google, Microsoft, and Yahoo agreed on a common set of extensions in 2008. See http://googlewebmastercentral.blogspot.com/2008/06/improving-on-robots-exclusion-protocol.html for details.

Most major robots that I know of support that "standard."
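As a rough sketch of how a robot that supports these extensions might match a rule, the Python below assumes the two commonly documented wildcards: * matches any run of characters, and a trailing $ anchors the rule to the end of the URL. The function name rule_matches and the sample rules are hypothetical, and real crawlers may differ in details such as percent-encoding and rule precedence.

```python
import re


def rule_matches(path: str, rule: str) -> bool:
    """Match a URL path against a rule using the wildcard extensions.

    Assumed semantics: '*' matches any sequence of characters, a trailing
    '$' requires the path to end there, and everything else is a literal
    prefix match, as in the original format.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and join them with '.*' in place of each '*'.
    pattern = ".*".join(re.escape(piece) for piece in rule.split("*"))
    if anchored:
        pattern += r"\Z"
    # re.match anchors at the start of the path, which gives prefix behaviour.
    return re.match(pattern, path) is not None


if __name__ == "__main__":
    print(rule_matches("/docs/a.pdf", "/*.pdf$"))
    print(rule_matches("/docs/a.pdf?x=1", "/*.pdf$"))
    print(rule_matches("/posts/1?page=2", "/posts/*?page="))
```

A rule without any wildcard falls back to plain prefix matching, so /privacy still matches /privacy_xyzzy here.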
