Disallow or Noindex on Subdomain with robots.txt
Question
I have dev.example.com and www.example.com hosted on different subdomains. I want crawlers to drop all records of the dev subdomain but keep them on www. I am using git to store the code for both, so ideally I'd like both sites to use the same robots.txt file.
Is it possible to use one robots.txt file and have it exclude crawlers from the dev subdomain?
Answer
Sorry, this is most likely not possible. The general rule is that each subdomain is treated separately, so each would need its own robots.txt file.
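For reference, the two files would differ in exactly one rule: the dev copy disallows everything, while the www copy allows everything. A minimal pair might look like this (the `Disallow: /` form is the standard way to block all compliant crawlers):

```
# robots.txt on dev.example.com — block all crawlers
User-agent: *
Disallow: /

# robots.txt on www.example.com — allow all crawlers
User-agent: *
Disallow:
```

Note that `Disallow:` with an empty value means "nothing is disallowed", i.e. full crawl access.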
Often subdomains are implemented as subfolders, with URL rewriting in place that does the mapping; in that setup you can share a single robots.txt file across subdomains. Here's a good discussion of how to do this: http://www.webmasterworld.com/apache/4253501.htm.
However, in your case you want different behavior for each subdomain, which is going to require separate files.
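That said, "separate files" does not have to mean separate copies outside the shared git repository. One common workaround, sketched here under the assumption of an Apache server with mod_rewrite enabled, is to keep both variants in the repo (the filename `robots-dev.txt` is hypothetical) and serve the dev-specific one whenever robots.txt is requested on the dev host:

```apache
# .htaccess sketch: serve a host-specific robots file.
RewriteEngine On

# When the request arrives on dev.example.com ([NC] = case-insensitive),
# answer /robots.txt with robots-dev.txt instead (internal rewrite, [L] = stop).
RewriteCond %{HTTP_HOST} ^dev\.example\.com$ [NC]
RewriteRule ^robots\.txt$ robots-dev.txt [L]
```

Requests on www.example.com fall through to the regular robots.txt, so both subdomains share one codebase while presenting different crawl rules.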