Robots.txt deny, for a #! URL
Question
I am trying to add a deny rule to a robots.txt file, to deny access to a single page.
The website URLs work as follows:
Javascript then swaps out the DIV that is displayed, based on the URL.
How would I request a search engine spider not list the following:
Thanks in advance.
Answer
You can actually do this multiple ways, but here are the two simplest.
You have to exclude the URLs that Googlebot is going to fetch, which aren't the AJAX hashbang values but instead the translated ?_escaped_fragment_=key=value URLs.
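For context, the hashbang-to-query-string translation can be sketched as a small Python helper; the example.com URL and the function name are illustrative, and real crawlers may percent-encode the fragment slightly differently:

```python
from urllib.parse import quote

def escaped_fragment_url(url: str) -> str:
    """Rewrite a #! (hashbang) URL the way an AJAX-crawling bot would."""
    base, _, fragment = url.partition("#!")
    # Append as a new query string, or extend an existing one.
    sep = "&" if "?" in base else "?"
    return f"{base}{sep}_escaped_fragment_={quote(fragment, safe='/')}"

print(escaped_fragment_url("http://example.com/#!/super-secret"))
# http://example.com/?_escaped_fragment_=/super-secret
```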
In your robots.txt file specify:
Disallow: /?_escaped_fragment_=/super-secret
Disallow: /index.php?_escaped_fragment_=/super-secret
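One way to sanity-check these rules locally is Python's built-in urllib.robotparser; note that its matching is only an approximation of Googlebot's real behavior, and example.com is a placeholder host:

```python
import urllib.robotparser

rp = urllib.robotparser.RobotFileParser()
# Feed the rules directly instead of fetching a live robots.txt.
rp.parse([
    "User-agent: *",
    "Disallow: /?_escaped_fragment_=/super-secret",
    "Disallow: /index.php?_escaped_fragment_=/super-secret",
])

# The translated URL Googlebot would fetch is blocked...
print(rp.can_fetch("Googlebot", "http://example.com/?_escaped_fragment_=/super-secret"))
# ...while other pages remain crawlable.
print(rp.can_fetch("Googlebot", "http://example.com/index.php"))
```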
When in doubt, you should always use the Google Webmaster Tool » "Fetch As Googlebot".
If the page has already been indexed by Googlebot, using a robots.txt file won't remove it from the index. You'll either have to use the Google Webmaster Tools URL removal tool after you apply the robots.txt, or instead you can add a noindex command to the page, via a <meta> tag or an X-Robots-Tag in the HTTP headers.
It would look like:
<meta name="ROBOTS" content="NOINDEX, NOFOLLOW" />
Or
X-Robots-Tag: noindex
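If you control the server, the header route could look like this minimal WSGI sketch in Python; the app name and page body are hypothetical:

```python
# Minimal WSGI sketch: attach X-Robots-Tag to the response so compliant
# crawlers drop the page from their index.
def super_secret_app(environ, start_response):
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("X-Robots-Tag", "noindex"),  # header equivalent of the <meta> tag
    ]
    start_response("200 OK", headers)
    return [b"<html><body>super-secret content</body></html>"]
```

Any WSGI server (e.g. the standard library's wsgiref.simple_server) can serve it for testing.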