Heroku and Web scraping

Views: 144 Published: 2018/6/7 11:09:09 ruby web-services api heroku sinatra


Problem description


I have a Nokogiri web scraper that writes to a database, and I'm trying to deploy it to Heroku. I also have a Sinatra front end that I want to read from that database. I'm new to Heroku and web development, and don't know the best way to handle something like this.

Do I have to put the scraper script that writes to the database under a Sinatra route (like mywebsite.com/scraper) and just make the route so obscure that no one visits it? In the end, I'd like the Sinatra part to be a REST API that reads from the database.

Thanks for any input.

Solution

There are two approaches you can take.

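The first is to use one-off dynos: run the scraper from the console with heroku run YOURCMD. Just make sure the scraper doesn't write to disk but uses the database instead, because a dyno's filesystem is ephemeral and anything written there is lost when the dyno stops.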

More information: https://devcenter.heroku.com/articles/one-off-dynos
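
As a rough illustration of the first approach, here is a minimal sketch of how a one-off scraper script might be laid out, assuming Heroku Postgres (which exposes its connection string in the DATABASE_URL config var). The names database_settings, run_scraper and MemoryStore are hypothetical. The regex extractor and the in-memory store are stand-ins for Nokogiri and a real database client, used only so the example runs with the standard library alone.

```ruby
require "uri"

# Turn a DATABASE_URL-style connection string into settings for a DB client.
def database_settings(url)
  uri = URI.parse(url)
  {
    host: uri.host,
    port: uri.port || 5432, # assumption: fall back to the Postgres default port
    user: uri.user,
    password: uri.password,
    database: uri.path.delete_prefix("/")
  }
end

# Run one scrape. Every collaborator is passed in (all names are hypothetical):
#   fetch   - callable returning the HTML for a URL
#   extract - callable turning HTML into an array of row hashes
#             (the Nokogiri part in a real scraper)
#   store   - object responding to #insert(row), backed by the database
# Returns the number of rows handed to the store. Nothing is written to disk.
def run_scraper(url, fetch:, extract:, store:)
  rows = extract.call(fetch.call(url))
  rows.each { |row| store.insert(row) }
  rows.size
end

# In-memory stand-in for a database table, only so the sketch runs as-is.
class MemoryStore
  attr_reader :rows

  def initialize
    @rows = []
  end

  def insert(row)
    @rows << row
  end
end

store = MemoryStore.new
count = run_scraper(
  "https://example.com/articles",
  fetch: ->(_url) { "<h2>First</h2><h2>Second</h2>" }, # canned HTML, no network
  extract: ->(html) { html.scan(%r{<h2>(.*?)</h2>}).flatten.map { |t| { title: t } } },
  store: store
)
puts "stored #{count} rows"
```

Saved as, say, scraper.rb, a script along these lines could be started with heroku run ruby scraper.rb. Because the store is injected, swapping the in-memory stand-in for a real database client should not require changes to run_scraper.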

The second is to separate the scraper from the web process: the web process handles normal UI interaction, and a separate scraper process does the scraping, which the web process can spawn or talk to. If you take this route, it's up to you how to protect it from the rest of the world (auth, URL obfuscation, etc.).

More information: https://devcenter.heroku.com/articles/background-jobs-queueing
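
For the second approach, one way the "auth" option could look is sketched below. Everything specific here is an assumption for illustration: the X-Scraper-Token header, the /scraper path, and the names scraper_trigger, secure_equal? and enqueue are hypothetical. The array is only a stand-in for a real job queue, since on Heroku the web and scraper processes run in separate dynos and would need a queue they can both reach.

```ruby
# Compare two strings byte by byte without stopping at the first mismatch.
# Strings of different length are rejected up front.
def secure_equal?(a, b)
  return false unless a.bytesize == b.bytesize

  diff = 0
  a.bytes.zip(b.bytes) { |x, y| diff |= x ^ y }
  diff.zero?
end

# Build a Rack-style endpoint for the web process (Sinatra is built on Rack).
#   token   - shared secret the caller must send
#   enqueue - callable that hands a job to the scraper process
def scraper_trigger(token:, enqueue:)
  text = { "content-type" => "text/plain" }
  lambda do |env|
    if env["REQUEST_METHOD"] != "POST" || env["PATH_INFO"] != "/scraper"
      [404, text, ["not found\n"]]
    elsif !secure_equal?(env["HTTP_X_SCRAPER_TOKEN"].to_s, token)
      [401, text, ["unauthorized\n"]]
    else
      enqueue.call("scrape") # queue the work; do not scrape inside the request
      [202, text, ["queued\n"]]
    end
  end
end

jobs = [] # stand-in for a real job queue shared by the two processes
app = scraper_trigger(token: "s3cret", enqueue: ->(job) { jobs << job })

request = { "REQUEST_METHOD" => "POST", "PATH_INFO" => "/scraper" }
status, = app.call(request.merge("HTTP_X_SCRAPER_TOKEN" => "s3cret"))
puts status
status, = app.call(request.merge("HTTP_X_SCRAPER_TOKEN" => "guess"))
puts status
```

In a Sinatra app the same token check could live in a before filter or in the route itself, calling halt 401 on a mismatch. Checking a secret is generally a stronger safeguard than relying on an obscure URL alone.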
