robots.txt - is this working?


Question

I just ran into a robots.txt that looks like this:

User-agent: *
Disallow: /foobar

User-agent: badbot
Disallow: *

After disallowing only a few folders for all bots, does the specific badbot rule even apply?

Note: This question is merely for understanding the above ruleset. I know using robots.txt is not a proper security mechanism and I'm neither using nor advocating it.

Answer

Each bot only ever complies with at most a single record (block).

A block starts with one or more User-agent lines, typically followed by Disallow lines (at least one is required). Blocks are separated by blank lines.

A bot called "badbot" will look for a record with the line User-agent: badbot (or similar, since bots "should be liberal in interpreting this field"). If no such record is found, it will look for a record with the line User-agent: *. If that doesn't exist either, the bot is allowed to crawl everything (the default).

So in your example, the bot called "badbot" will follow only the second record (you probably meant Disallow: / rather than Disallow: *, since the original robots.txt specification defines no wildcards), while all other bots follow only the first record.
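To make the record-selection behavior concrete, here is a minimal sketch using Python's standard-library urllib.robotparser, with the corrected Disallow: / rule in place of Disallow: *. The example.com URLs and the "goodbot" name are placeholders:

```python
from urllib import robotparser

# The robots.txt from the question, with Disallow: / substituted
# for the non-standard Disallow: * in the badbot record.
ROBOTS_TXT = """\
User-agent: *
Disallow: /foobar

User-agent: badbot
Disallow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# "badbot" matches the second record only, so it is barred everywhere.
print(rp.can_fetch("badbot", "http://example.com/anything"))   # False
print(rp.can_fetch("badbot", "http://example.com/foobar"))     # False

# Any other bot falls back to the User-agent: * record: only /foobar
# is off-limits, everything else is allowed.
print(rp.can_fetch("goodbot", "http://example.com/foobar"))    # False
print(rp.can_fetch("goodbot", "http://example.com/anything"))  # True
```

Note that the parser applies exactly one record per user agent, which is the "at most a single record" behavior described above.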
