Using "Disallow: /*?" in robots.txt file
Problem description
I used
Disallow: /*?
in the robots.txt file to disallow all pages that might contain a "?" in the URL.
Is that syntax correct, or am I blocking other pages as well?
It depends on the bot.
Bots that follow the original robots.txt specification don't give the * any special meaning. These bots would block any URL whose path starts with /*, directly followed by ?, e.g., http://example.com/*?foo.
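As a quick way to see this original-spec, literal-prefix behaviour in practice, the sketch below uses Python's standard-library robots.txt parser, which does plain prefix matching and does not implement Google-style wildcards (the exact URL normalization it performs is a CPython implementation detail, not a statement about any real crawler):

```python
from urllib import robotparser

# Feed the parser the rule from the question.
rp = robotparser.RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /*?",
])

# A normal URL with a query string is NOT blocked, because "/*?" is
# treated as a literal path prefix rather than a wildcard pattern.
print(rp.can_fetch("*", "http://example.com/page?x=1"))  # True (allowed)

# A URL whose path literally starts with "/*" followed by "?" IS blocked.
print(rp.can_fetch("*", "http://example.com/*?foo"))     # False (blocked)
```

This is exactly the difference the answer describes: under the original specification the rule only blocks the contrived literal-prefix URLs, not every URL containing a question mark.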
Some bots, including the Googlebot, give the * character a special meaning. It typically stands for any sequence of characters. These bots would block what you seem to intend: any URL with a ? in it.
Google’s robots.txt documentation includes this very case:
To block access to all URLs that include question marks (?). For example, the sample code blocks URLs that begin with your domain name, followed by any string, followed by a question mark, and ending with any string:

User-agent: Googlebot
Disallow: /*?
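Google's wildcard matching can be sketched as a translation to a regular expression: * becomes "any sequence of characters" and a trailing $ anchors the end of the URL. This is a simplified model of the documented matching rules, not Google's actual implementation (it ignores edge cases such as $ appearing mid-pattern):

```python
import re

def google_style_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt rule matches the path, Google-style.

    "*" matches any sequence of characters; "$" anchors the end of the
    URL (Google's extensions). All other characters match literally.
    Matching is anchored at the start of the path, as robots.txt
    rules always are.
    """
    regex = ""
    for ch in pattern:
        if ch == "*":
            regex += ".*"
        elif ch == "$":
            regex += "$"
        else:
            regex += re.escape(ch)
    return re.match(regex, path) is not None

# "Disallow: /*?" under wildcard matching hits every URL with a "?".
print(google_style_matches("/*?", "/page?x=1"))  # True  (blocked)
print(google_style_matches("/*?", "/page"))      # False (not blocked)
```

Under this interpretation the rule in the question does what the asker intends: it matches any URL containing a question mark, regardless of where in the path it appears.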