Robots.txt deny, for a #! URL


Problem Description

I am trying to add a deny rule to a robots.txt file, to deny access to a single page.

The website URLs work as follows:

Javascript then swaps out the DIV that is displayed, based on the URL.

How would I request a search engine spider not list the following:

Thanks in advance

Answer

You can actually do this multiple ways, but here are the two simplest.

You have to exclude the URLs that Googlebot is going to fetch. These aren't the AJAX hashbang values, but instead the translated ?_escaped_fragment_=key=value URLs.
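For context, under Google's (now retired) AJAX crawling scheme the crawler rewrites everything after #! into an _escaped_fragment_ query parameter and fetches that URL instead. Here is a minimal sketch of that translation in Python; the hostname example.com and the helper name are placeholders, not part of the original question:

from urllib.parse import quote

def escaped_fragment_url(hashbang_url):
    """Return the URL Googlebot would actually fetch for a #! URL (hypothetical helper)."""
    base, _, fragment = hashbang_url.partition("#!")
    # If the base URL already carries a query string, the parameter is appended with "&".
    separator = "&" if "?" in base else "?"
    return base + separator + "_escaped_fragment_=" + quote(fragment, safe="/")

print(escaped_fragment_url("http://example.com/#!/super-secret"))
# -> http://example.com/?_escaped_fragment_=/super-secret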

In your robots.txt file specify:

# Disallow rules only take effect inside a User-agent group
User-agent: *
Disallow: /?_escaped_fragment_=/super-secret
Disallow: /index.php?_escaped_fragment_=/super-secret
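You can sanity-check rules like these with Python's standard-library robots.txt parser. A small sketch, again using example.com as a placeholder host:

from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /?_escaped_fragment_=/super-secret
Disallow: /index.php?_escaped_fragment_=/super-secret
"""

rp = RobotFileParser()
rp.modified()  # record a fetch time; needed in some Python versions when feeding parse() directly
rp.parse(rules.splitlines())

# The translated URL Googlebot actually requests is blocked...
print(rp.can_fetch("Googlebot", "http://example.com/?_escaped_fragment_=/super-secret"))  # False
# ...while other pages stay crawlable.
print(rp.can_fetch("Googlebot", "http://example.com/some-other-page"))  # True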

When in doubt, you should always use the Google Webmaster Tools » "Fetch as Googlebot" feature.

If the page has already been indexed by Googlebot, using a robots.txt file won't remove it from the index. You'll either have to use the Google Webmaster Tools URL removal tool after you apply the robots.txt, or instead you can add a noindex command to the page via a <meta> tag or X-Robots-Tag in the HTTP Headers.

It would look like this:

<meta name="ROBOTS" content="NOINDEX, NOFOLLOW" />

X-Robots-Tag: noindex
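As a concrete illustration of the header approach, here is a minimal sketch in Python/Flask; the question's site appears to be PHP, so this is a stand-in stack, and the route name is hypothetical:

from flask import Flask, make_response

app = Flask(__name__)

@app.route("/super-secret")
def super_secret():
    resp = make_response("page contents")
    # Equivalent to the <meta> tag above, but delivered as an HTTP response header.
    resp.headers["X-Robots-Tag"] = "noindex"
    return resp

if __name__ == "__main__":
    app.run()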
