Robots.txt restriction of category URLs


Problem description

I was unable to find information about my case. I want to prevent the following types of URLs from being indexed:

website.com/video-title/video-title/

(my website produces such duplicate URL copies of my video articles)

Each video article's URL begins with the word "video".

So what I want to do is restrict all URLs that match website.com/any-url/video-any-url.

That way I will remove all the duplicate copies. Could somebody help me?

Answer

This is not possible in the original robots.txt specification.

But some parsers may support wildcards in Disallow anyway. For example, Google:

Googlebot (but not all search engines) respects some pattern matching.

So for Google's bots, you could use the following line:

Disallow: /*/video

This should block any URL whose path starts with anything and contains "video", for example:

  • /foo/video
  • /foo/videos
  • /foo/video.html
  • /foo/video/bar
  • /foo/bar/videos
  • /foo/bar/foo/bar/videos
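
As a minimal sketch (assuming you only want Googlebot to apply the wildcard rule and that other crawlers should be left unrestricted), the full robots.txt could be grouped like this; the fallback group is illustrative, since crawlers that don't understand the wildcard might otherwise read the rule literally:

# Group for Googlebot, which supports the "*" wildcard in Disallow
User-agent: Googlebot
Disallow: /*/video

# Illustrative fallback for all other crawlers: an empty Disallow
# means nothing is blocked for them
User-agent: *
Disallow: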

Other parsers that don't support this would interpret it literally, i.e., they would block the following URLs:

  • /*/video
  • /*/videos
  • /*/video/foo
