Common rule in robots.txt

Views: 57 Published: 2021/7/10 19:19:48 robots.txt


Question

How can I disallow URLs like 1.html, 2.html, ..., [0-9]+.html (in terms of regexp) with robots.txt?

Recommended answer

The original robots.txt specification doesn't support regex/wildcards. However, you could block URLs like these:
example.com/1.html
example.com/2367123.html
example.com/3
example.com/4/foo
example.com/5/1
example.com/6/
example.com/7.txt
example.com/883
example.com/9to5
…
with:
User-agent: *
Disallow: /0
Disallow: /1
Disallow: /2
Disallow: /3
Disallow: /4
Disallow: /5
Disallow: /6
Disallow: /7
Disallow: /8
Disallow: /9
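
The prefix behaviour of these rules can be checked locally. The following is a minimal sketch using Python's standard `urllib.robotparser`; the `http://` scheme, the helper name `is_blocked`, and the two non-matching URLs (`about.html`, `a1.html`) are assumptions added for illustration and do not come from the answer.

```python
from urllib.robotparser import RobotFileParser

# The ten prefix rules from the answer, one per leading digit.
RULES = ["User-agent: *"] + [f"Disallow: /{digit}" for digit in range(10)]


def is_blocked(url: str) -> bool:
    """Return True if the rules above disallow `url` for any crawler."""
    parser = RobotFileParser()
    parser.parse(RULES)
    return not parser.can_fetch("*", url)


if __name__ == "__main__":
    urls = [
        "http://example.com/1.html",
        "http://example.com/2367123.html",
        "http://example.com/4/foo",
        "http://example.com/9to5",
        "http://example.com/about.html",  # hypothetical: path starts with a letter
        "http://example.com/a1.html",     # hypothetical: digit is not the first character
    ]
    for url in urls:
        print(url, "blocked" if is_blocked(url) else "allowed")
```

Because each rule is a plain path prefix, anything whose path begins with a digit is caught, including paths such as `/9to5` that are not numbered HTML pages.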

If you want to block only URLs starting with a single numeral followed by .html, just append .html, like:
User-agent: *
Disallow: /0.html
Disallow: /1.html
…

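However, this wouldn't block, for example, example.com/12.html, because each Disallow value is matched as a prefix of the URL path and /12.html does not begin with /0.html, /1.html, and so on.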
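
To see this limitation concretely, here is a small sketch with Python's standard `urllib.robotparser`. It assumes the elided rules continue the same pattern through `/9.html`; the `http://example.com` base and the helper name `blocked_by_html_rules` are likewise illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser

# Assumption: the "…" in the rule list continues through /9.html.
HTML_RULES = ["User-agent: *"] + [f"Disallow: /{digit}.html" for digit in range(10)]


def blocked_by_html_rules(path: str) -> bool:
    """Return True if `path` on example.com is disallowed by HTML_RULES."""
    parser = RobotFileParser()
    parser.parse(HTML_RULES)
    return not parser.can_fetch("*", "http://example.com" + path)


if __name__ == "__main__":
    for path in ["/1.html", "/12.html", "/1", "/9to5"]:
        state = "blocked" if blocked_by_html_rules(path) else "allowed"
        print(path, state)
```

Single-digit pages are blocked, while `/12.html` stays crawlable, so covering every `[0-9]+.html` URL with the original specification alone means falling back to the broader per-digit prefixes.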
