Should I remove meta-robots (index, follow) when I have a robots.txt?

Question

I'm a bit confused about whether I should remove the robots meta tag if I want search engines to follow my robots.txt rules.

If the robots meta tag (index, follow) exists on a page, will search engines ignore my robots.txt file and index the URLs that are disallowed in my robots.txt anyway?

The reason I'm asking is that search engines (mainly Google) still index disallowed pages from my website.

Recommended answer

If a search engine's bot honors your robots.txt, and you disallow crawling of /foo, then the bot will never crawl pages whose URL paths start with /foo. Hence the bot will never know that those pages contain meta-robots elements.
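To make this concrete, here is a minimal sketch of the check a well-behaved bot could perform before fetching a URL. It uses Python's standard urllib.robotparser module; the robots.txt contents, the example.com URLs, the bot name ExampleBot, and the helper may_crawl are all assumptions made up for illustration, not anything from the original question.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that disallows crawling of everything under /foo.
ROBOTS_TXT = """\
User-agent: *
Disallow: /foo
"""


def may_crawl(url, user_agent="ExampleBot"):
    """Return True if the (assumed) robots.txt lets this bot fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# A bot that honors robots.txt skips the first URL entirely, so any
# meta-robots element in that page's HTML is never downloaded or read.
print(may_crawl("https://example.com/foo/page.html"))
print(may_crawl("https://example.com/bar.html"))
```

Under these assumptions the first URL is refused and the second is allowed, so only the second page's meta-robots element could ever influence the bot.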

Conversely, this means that if you want to disallow indexing of a page (by specifying meta-robots with noindex), you should not disallow crawling of that page in your robots.txt. Otherwise the bot never gets to read the noindex, and all it knows is that crawling is forbidden, not indexing.
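The interaction between the two mechanisms can be sketched as follows. This is only an illustration of the reasoning above, not how any particular search engine is implemented; the class MetaRobotsParser, the function bot_sees_noindex, the bot name, and the sample inputs are hypothetical.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class MetaRobotsParser(HTMLParser):
    """Collects the directives of <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            for directive in content.split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def bot_sees_noindex(robots_txt, url, html, user_agent="ExampleBot"):
    """Return True only if a robots.txt-honoring bot can read a noindex."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch(user_agent, url):
        # Crawling is disallowed: the page is never fetched,
        # so its meta-robots element is never read.
        return False
    parser = MetaRobotsParser()
    parser.feed(html)
    return "noindex" in parser.directives


PAGE = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
URL = "https://example.com/foo/secret.html"

# Blocked in robots.txt: the noindex on the page goes unseen.
print(bot_sees_noindex("User-agent: *\nDisallow: /foo\n", URL, PAGE))
# Crawlable: the bot fetches the page and can honor the noindex.
print(bot_sees_noindex("User-agent: *\nDisallow: /other\n", URL, PAGE))
```

In this sketch the noindex only takes effect in the second case, where robots.txt leaves the page crawlable.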
