robots.txt - is this working?
Question
I just ran into a robots.txt that looks like this:
User-agent: *
Disallow: /foobar
User-agent: badbot
Disallow: *
After disallowing only a few folders for all bots, does the specific badbot rule even apply?
Note: This question is merely for understanding the above ruleset. I know using robots.txt is not a proper security mechanism and I'm neither using nor advocating it.
Recommended answer
Each bot only ever complies with at most a single record (block).
A block starts with one or more User-agent lines, typically followed by Disallow lines (at least one is required). Blocks are separated by blank lines.
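The record structure described above can be sketched in a few lines of Python. This is only an illustration under the strict reading that blank lines separate records. The function name parse_records and the tuple layout are made up for this example, and fields other than User-agent and Disallow are simply ignored.

```python
def parse_records(text):
    """Split robots.txt text into records of (user_agents, disallow_paths).

    Hypothetical helper: assumes records are separated by blank lines.
    """
    records = []
    agents, rules = [], []
    for raw in text.splitlines():
        if not raw.strip():
            # A blank line closes the current record, if one is open.
            if agents:
                records.append((agents, rules))
            agents, rules = [], []
            continue
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue  # comment-only line, does not end the record
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            agents.append(value)
        elif field == "disallow":
            rules.append(value)
    if agents:
        records.append((agents, rules))  # last record has no trailing blank line
    return records
```

Note that under this strict reading, a file with no blank line between the two User-agent groups would be parsed as a single record, so the blank line matters.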
A bot called "badbot" will look for a record with the line User-agent: badbot (or similar, as the bot "should be liberal in interpreting this field"). If no such line is found, it will look for a record with the line User-agent: *. If even this doesn't exist, the bot is allowed to do everything (= default).
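The lookup order above (specific name first, then *, then no restrictions) could look roughly like this. The function select_record and the record layout (a list of (user_agents, disallow_paths) tuples) are assumptions for this sketch. The case-insensitive substring match is just one possible way to be "liberal" and is not prescribed by the answer.

```python
def select_record(records, bot_name):
    """Return the single record a bot should obey, or None if none applies.

    records: list of (user_agents, disallow_paths) tuples (hypothetical layout).
    """
    name = bot_name.lower()
    fallback = None
    for agents, rules in records:
        for agent in agents:
            if agent == "*":
                # Remember the first catch-all record, but keep looking
                # for a record that names this bot specifically.
                if fallback is None:
                    fallback = (agents, rules)
            elif agent.lower() in name:
                return (agents, rules)
    # None means no record applies, so the bot may fetch everything.
    return fallback
```

The key point is that a bot never merges rules from several records: it gets exactly one record or none.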
So in your example, the bot called "badbot" will follow only the second record (you probably mean Disallow: / instead of Disallow: *), while all other bots follow only the first record.
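One way to check this conclusion is Python's standard urllib.robotparser module. The snippet below assumes the corrected rule Disallow: / and a blank line between the two records. The bot name otherbot and the path /index.html are placeholders chosen for the example.

```python
from urllib.robotparser import RobotFileParser

# The example ruleset, with "Disallow: /" in place of "Disallow: *"
# and a blank line separating the two records (both are assumptions).
ROBOTS_TXT = """\
User-agent: *
Disallow: /foobar

User-agent: badbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(bot, path):
    """Return True if the named bot may fetch the given path."""
    return parser.can_fetch(bot, path)


if __name__ == "__main__":
    for bot in ("badbot", "otherbot"):
        for path in ("/foobar", "/index.html"):
            print(bot, path, allowed(bot, path))
```

With this ruleset, badbot is refused everywhere, while any other bot is refused only under /foobar.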