How to block search engines from indexing all URLs beginning with origin.domainname.com

Question

I have www.domainname.com and origin.domainname.com pointing to the same codebase. Is there a way to prevent all URLs under the origin.domainname.com hostname from being indexed?

Is there a rule in robots.txt that can do this? Both hostnames point to the same folder. I also tried redirecting origin.domainname.com to www.domainname.com in the .htaccess file, but it doesn't seem to work.

If anyone has had a similar problem and can help, I would be grateful.

Thanks

Recommended answer

You can rewrite requests for robots.txt to another file (let's name it robots_no.txt) containing:

User-Agent: *
Disallow: /

(Source: http://www.robotstxt.org/robotstxt.html)
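To sanity-check what those two lines mean to a crawler, here is a minimal sketch using Python's standard `urllib.robotparser`. The helper name `is_blocked` and the sample URLs are assumptions made for illustration; real search engines use their own parsers, so treat this as an approximation.

```python
from urllib.robotparser import RobotFileParser

# Contents of robots_no.txt as given in the answer.
ROBOTS_NO_TXT = """\
User-Agent: *
Disallow: /
"""


def is_blocked(robots_body: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt body disallows `url` for `user_agent`.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    parser.parse(robots_body.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Sample URLs on the host from the question (paths are made up).
    for url in (
        "http://origin.domainname.com/",
        "http://origin.domainname.com/some/page.html",
    ):
        print(url, "blocked:", is_blocked(ROBOTS_NO_TXT, url))
```

Because the rule is `Disallow: /` for every user agent, any path on the host that serves this file should be reported as blocked.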

The .htaccess file would look like this:

RewriteEngine On
# For any host other than www.example.com, serve robots_no.txt in place of robots.txt
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule ^robots\.txt$ robots_no.txt [L]

Use a customized robots.txt for each (sub)domain:

RewriteEngine On
RewriteCond %{HTTP_HOST} ^www\.example\.com$ [OR]
RewriteCond %{HTTP_HOST} ^sub\.example\.com$ [OR]
RewriteCond %{HTTP_HOST} ^example\.com$ [OR]
RewriteCond %{HTTP_HOST} ^www\.example\.org$ [OR]
RewriteCond %{HTTP_HOST} ^example\.org$
# For the (sub)domains listed above, rewrite robots.txt to robots_<domain>.txt,
# e.g. example.org -> robots_example.org.txt
RewriteRule ^robots\.txt$ robots_%{HTTP_HOST}.txt [L]
# In all other cases, serve the default robots.txt unchanged
RewriteRule ^robots\.txt$ - [L]
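The intended mapping from hostname to robots file can be modelled in a few lines of Python. This is only an illustrative sketch of the behaviour described in the comments above (example.org -> robots_example.org.txt), not a substitute for testing the actual Apache rules; the function name `robots_file_for_host` is hypothetical, and the model assumes an exact, case-sensitive match on the Host header with no port.

```python
# Hosts listed in the RewriteCond lines of the answer.
HOSTS_WITH_OWN_ROBOTS = frozenset({
    "www.example.com",
    "sub.example.com",
    "example.com",
    "www.example.org",
    "example.org",
})


def robots_file_for_host(host: str) -> str:
    """Return the file expected to be served for /robots.txt on `host`.

    Hypothetical model of the rewrite rules: listed hosts get their own
    robots_<host>.txt, every other host falls back to robots.txt.
    """
    if host in HOSTS_WITH_OWN_ROBOTS:
        return f"robots_{host}.txt"
    return "robots.txt"


if __name__ == "__main__":
    for host in ("example.org", "sub.example.com", "origin.example.com"):
        print(host, "->", robots_file_for_host(host))
```

Under this scheme each listed host needs a matching robots_<host>.txt file in the document root; a host without one in the list simply gets the default file.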

Instead of asking search engines to stay away from every page on hosts other than www.example.com, you can also use <link rel="canonical">.

If http://example.com/page.html and http://example.org/~example/page.html both serve the same content as http://www.example.com/page.html, put the following tag in the <head>:

<link rel="canonical" href="http://www.example.com/page.html">
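One way to confirm that a page actually carries the tag is to parse its HTML and pull out the canonical URL. The sketch below uses Python's standard `html.parser`; the class and function names (`CanonicalFinder`, `find_canonical`) are made up for illustration, and the sample markup is an assumption rather than a real page.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag seen."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated tokens, so check membership.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.canonical = attr_map.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in `html`, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


if __name__ == "__main__":
    sample = (
        "<html><head>"
        '<link rel="canonical" href="http://www.example.com/page.html">'
        "</head><body></body></html>"
    )
    print(find_canonical(sample))
```

Running the same check against the page as served from each hostname should yield the same www.example.com URL if the tag is in place.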

See also Google's article on rel="canonical".
