Different robots.txt for staging server on Heroku


Question




I have staging and production apps on Heroku.

I set up a robots.txt file for crawlers.

After that, I got this message from Google:

Dear Webmaster, The host name of your site, https://www.myapp.com/, does not match any of the "Subject Names" in your SSL certificate, which were:
*.herokuapp.com
herokuapp.com

Googlebot read the robots.txt on my staging app and sent this message, because I had not set up anything to keep crawlers from reading the file.

So I'm thinking of changing the .gitignore file between staging and production, but I can't figure out how to do this.

What are the best practices for implementing this?

EDIT

I googled this and found this article: http://goo.gl/2ZHal

The article says to set up basic Rack authentication, after which you no longer need to care about robots.txt.

I didn't know that basic auth can keep Googlebot out. This solution seems better than manipulating the .gitignore file.
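
To illustrate the basic-auth idea, here is a minimal sketch of a Rack middleware that rejects any request without the expected credentials, so a crawler only ever sees a 401 response. The class name StagingBasicAuth and the username/password arguments are hypothetical. It is hand-rolled only to stay dependency-free; in a real app the rack gem's Rack::Auth::Basic covers this case.

```ruby
# Hypothetical middleware: protects a staging app with HTTP basic auth.
class StagingBasicAuth
  def initialize(app, username:, password:)
    @app = app
    # pack("m0") is strict Base64 without line breaks (Ruby core, no require).
    @expected = "Basic " + ["#{username}:#{password}"].pack("m0")
  end

  def call(env)
    # Note: a plain == comparison is not constant-time; acceptable for a
    # sketch, but a production setup should use a timing-safe comparison.
    if env["HTTP_AUTHORIZATION"] == @expected
      @app.call(env)
    else
      headers = {
        "www-authenticate" => 'Basic realm="Staging"',
        "content-type" => "text/plain"
      }
      [401, headers, ["Unauthorized\n"]]
    end
  end
end
```

The credentials would typically come from config vars set only on the staging app, so production is left open to crawlers.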

Answer

What about serving /robots.txt dynamically from a controller action instead of keeping a static file? Depending on the environment, you allow or disallow search engines to index your application.
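
A minimal sketch of that idea, written as a plain Rack-compatible app so it runs without any gems. The helper robots_txt_for, the constant RobotsApp, and the APP_ENV variable are all assumptions made for illustration; in a Rails app the same branch would live in a controller action routed at /robots.txt.

```ruby
# Hypothetical helper: returns the robots.txt body for an environment name.
def robots_txt_for(environment)
  if environment == "production"
    # An empty Disallow lets crawlers fetch everything.
    "User-agent: *\nDisallow:\n"
  else
    # Block all compliant crawlers on staging and any other environment.
    "User-agent: *\nDisallow: /\n"
  end
end

# Rack-compatible app: responds to call(env) and returns
# [status, headers, body]. APP_ENV is an assumed variable name;
# when it is unset, the app falls back to the blocking rules.
RobotsApp = lambda do |env|
  body = robots_txt_for(ENV.fetch("APP_ENV", "staging"))
  [200, { "content-type" => "text/plain" }, [body]]
end
```

Defaulting to the blocking rules means a newly created app stays out of search indexes until it is explicitly marked as production. Any static robots.txt in the repository would have to be removed, or it may be served instead of the dynamic response.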
