robots.txt in subdirectory



Question

I have a project that lives in a folder below the main domain, and I don't have access to the root of the domain itself.

http://mydomain.com/myproject/

I want to disallow indexing of the subfolder "forbidden":

http://mydomain.com/myproject/forbidden/

Can I simply put a robots.txt in the myproject folder? Will it get read even if there is no robots.txt in the root?

What is the correct syntax for disallowing the forbidden folder?

User-agent: *
Disallow: /forbidden/

or

User-agent: *
Disallow: forbidden/

Solution

From robotstxt.org:

Where to put it

The short answer: in the top-level directory of your web server.

The longer answer:

When a robot looks for the "/robots.txt" file for a URL, it strips the path component from the URL (everything from the first single slash), and puts "/robots.txt" in its place.

For example, for "http://www.example.com/shop/index.html", it will remove the "/shop/index.html", replace it with "/robots.txt", and end up with "http://www.example.com/robots.txt".
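The URL rewriting described in the quote can be sketched in a few lines of Python. This is only an illustration of the rule as quoted, not code from robotstxt.org or from any particular crawler; the function name `robots_txt_url` is made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL a crawler would request for page_url.

    Hypothetical helper: keeps the scheme and host, drops the path,
    query and fragment, and puts "/robots.txt" in their place.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    print(robots_txt_url("http://www.example.com/shop/index.html"))
    print(robots_txt_url("http://mydomain.com/myproject/forbidden/"))
```

For the URLs in the question, the result is always the file at the root of the host, which is why a robots.txt placed inside the myproject folder is never requested.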

So, as a web site owner you need to put it in the right place on your web server for that resulting URL to work. Usually that is the same place where you put your web site's main "index.html" welcome page. Where exactly that is, and how to put the file there, depends on your web server software.

Remember to use all lower case for the filename: "robots.txt", not "Robots.TXT".

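So I'm afraid the answer is that you have to put it in the root folder :-(

Regarding your second question, I believe the correct syntax is the one starting with a forward slash (e.g. /forbidden/). Keep in mind that Disallow paths are matched against the URL path from the root of the host, so in a root-level robots.txt the rule for this project would need to be Disallow: /myproject/forbidden/.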
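One way to sanity-check a rule before deploying it is Python's standard-library `urllib.robotparser`. The sketch below assumes the rules live in a root-level robots.txt on mydomain.com. The helper name `is_allowed` and the page URLs are invented for illustration, and real crawlers may interpret rules somewhat differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Rule written relative to the root of the host (assumed to be what a
# root-level robots.txt would need for the project in the question).
ROOT_RELATIVE_RULES = """User-agent: *
Disallow: /myproject/forbidden/
"""

# Rule exactly as written in the question, for comparison.
SHORT_RULES = """User-agent: *
Disallow: /forbidden/
"""


def is_allowed(rules: str, url: str, agent: str = "*") -> bool:
    """Hypothetical helper: parse robots.txt text and test one URL."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    blocked_page = "http://mydomain.com/myproject/forbidden/page.html"
    open_page = "http://mydomain.com/myproject/index.html"
    print(is_allowed(ROOT_RELATIVE_RULES, blocked_page))
    print(is_allowed(ROOT_RELATIVE_RULES, open_page))
    print(is_allowed(SHORT_RULES, blocked_page))
```

With this parser, the root-relative rule blocks the page under /myproject/forbidden/ while leaving the rest of the project crawlable. The shorter /forbidden/ rule does not match that page, because matching is a prefix test on the full URL path.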
