Disallow dynamic URL in robots.txt


Question

Our URL is:

http://example.com/kitchen-knife/collection/maitre-universal-cutting-boards-rana-parsley-chopper-cheese-slicer-vegetables-knife-sharpening-stone-ham-stand-ham-stand-riviera-niza-knives-block-benin.html

I want to disallow URLs from being crawled after collection, but the categories before collection are generated dynamically.

How would I disallow URLs in robots.txt after /collection?

Answer

This is not possible in the original robots.txt specification.

But some (!) parsers extend the specification and define a wildcard character (typically *).

For those parsers, you could use:

Disallow: /*/collection

Parsers that understand * as wildcard will stop crawling any URL whose path starts with anything (which may be nothing), followed by /collection/, followed by anything, e.g.,

http://example.com/foo/collection/
http://example.com/foo/collection/bar
http://example.com/collection/

Parsers that don’t understand * as a wildcard (i.e., they follow the original specification) will stop crawling any URL whose path starts with the literal string /*/collection/, e.g.,

http://example.com/*/collection/
http://example.com/*/collection/bar
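To make the difference between the two parser behaviors concrete, here is a small Python sketch (not part of the original answer; `path_matches` is a hypothetical helper) that contrasts wildcard-aware matching with the original literal-prefix matching:

```python
import re

def path_matches(rule: str, path: str, wildcard: bool) -> bool:
    """Check whether a Disallow rule matches a URL path.

    wildcard=True emulates parsers that treat '*' as "any sequence of
    characters" (the common extension); wildcard=False follows the
    original specification, where the rule is a literal path prefix.
    """
    if wildcard:
        # Translate the rule into a regex anchored at the path start:
        # '*' becomes '.*'; every other character is matched literally.
        pattern = "".join(".*" if ch == "*" else re.escape(ch) for ch in rule)
        return re.match(pattern, path) is not None
    # Original spec: plain literal prefix match, '*' has no special meaning.
    return path.startswith(rule)

rule = "/*/collection"
# Wildcard-aware parsers block dynamic category paths:
assert path_matches(rule, "/foo/collection/", wildcard=True)
assert path_matches(rule, "/foo/collection/bar", wildcard=True)
# Spec-only parsers block only paths literally beginning with "/*/collection":
assert not path_matches(rule, "/foo/collection/", wildcard=False)
assert path_matches(rule, "/*/collection/bar", wildcard=False)
```

This is why the rule above is safe but not universal: crawlers without the wildcard extension will simply treat `/*/collection` as an (almost certainly nonexistent) literal path and crawl the dynamic category URLs anyway.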
