How to disallow search pages from robots.txt
Problem description
I need to disallow http://example.com/startup?page=2 search pages from being indexed.
I want http://example.com/startup to be indexed but not http://example.com/startup?page=2 and page3 and so on.
Also, startup can be random, e.g., http://example.com/XXXXX?page
Something like this works, as confirmed by Google Webmaster Tools "test robots.txt" function:
User-Agent: *
Disallow: /startup?page=
Disallow: The value of this field specifies a partial URL that is not to be visited. This can be a full path or a partial path; any URL that starts with this value will not be retrieved.
However, if the first part of the URL changes, you must use wildcards (a crawler extension supported by Googlebot and other major crawlers, not part of the original robots.txt standard):
User-Agent: *
Disallow: /startup?page=
Disallow: *page=
Disallow: *?page=
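The prefix rule can be sanity-checked locally with Python's standard-library robots.txt parser. This is a minimal sketch using the example.com URLs from the question; note that `urllib.robotparser` does plain prefix matching and does not understand the wildcard forms above, so only the `Disallow: /startup?page=` rule is verified here (the wildcard variants should be checked with Google's own robots.txt tester instead).

```python
import urllib.robotparser

# Build a parser from an in-memory robots.txt rather than fetching one.
rp = urllib.robotparser.RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /startup?page=",
])

# The bare page stays crawlable; the paginated variants are blocked.
print(rp.can_fetch("*", "http://example.com/startup"))         # True
print(rp.can_fetch("*", "http://example.com/startup?page=2"))  # False
```

Because the match is a simple prefix test, `/startup` itself does not start with `/startup?page=` and remains allowed, while every `?page=N` URL is disallowed.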