How to deploy a Scrapy spider on Heroku cloud

Views: 232 | Published: 2018/6/7 10:26:35 | Tags: python, python-2.7, heroku, scrapy


Question

I developed a few spiders in Scrapy and I want to test them on Heroku cloud. Does anybody know how to deploy a Scrapy spider on Heroku cloud?

Answer

Yes, it's fairly simple to deploy and run your Scrapy spider on Heroku.

Here are the steps, using a real Scrapy project as an example:

Clone the project (note that it must have a requirements.txt file for Heroku to recognize it as a Python project):

git clone https://github.com/scrapinghub/testspiders.git

Add cffi to the requirements.txt file (e.g. cffi==1.1.0).

Create the Heroku application (this will add a new heroku git remote):

heroku create

Deploy the project (this will take a while the first time, when the slug is built):

git push heroku master

Run your spider:

heroku run scrapy crawl followall

Some notes:

Heroku disk is ephemeral. If you want to store the scraped data in a persistent place, you can use an S3 feed export (by appending -o s3://mybucket/items.jl to the crawl command) or use an add-on (like MongoHQ or Redis To Go) and write a pipeline to store your items there.

It would be cool to run a Scrapyd server on Heroku, but it's not currently possible because the sqlite3 module (which Scrapyd requires) doesn't work on Heroku.

If you want a more sophisticated solution for deploying your Scrapy spiders, consider setting up your own Scrapyd server or using a hosted service like Scrapy Cloud.
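The note about writing a pipeline to store items in an add-on can be sketched roughly as follows. This is a minimal illustration, not the answer author's code: the class name RedisItemPipeline, the REDIS_URL setting, and the scrapy:items list key are all hypothetical, and it assumes the redis-py package is listed in requirements.txt and that the add-on's connection URL has been copied into the REDIS_URL setting or environment variable.

```python
import json
import os


class RedisItemPipeline:
    """Hypothetical pipeline: pushes each scraped item, as JSON, onto a Redis list."""

    def __init__(self, redis_url, list_key="scrapy:items", client=None):
        self.redis_url = redis_url
        self.list_key = list_key  # assumed key name, chosen for illustration
        self.client = client      # an existing client can be injected (e.g. in tests)

    @classmethod
    def from_crawler(cls, crawler):
        # REDIS_URL is an assumed setting name; fall back to an env var of the same name.
        url = crawler.settings.get("REDIS_URL") or os.environ.get("REDIS_URL")
        return cls(redis_url=url)

    def open_spider(self, spider):
        if self.client is None:
            # Imported lazily so the module loads even where redis-py is absent.
            import redis
            self.client = redis.from_url(self.redis_url)

    def process_item(self, item, spider):
        # dict(item) works for plain dicts and for scrapy.Item instances.
        self.client.rpush(self.list_key, json.dumps(dict(item)))
        return item
```

To try it, the class would be registered in the project's ITEM_PIPELINES setting (for example under a module path such as testspiders.pipelines, which is an assumption about where the file is placed). Because the data lives in Redis rather than on the dyno's disk, it survives after the heroku run process exits.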


                    