What is the easiest way to run python scripts in a cloud server?


Problem description


I have a web crawling python script that takes hours to complete, and is infeasible to run in its entirety on my local machine. Is there a convenient way to deploy this to a simple web server? The script basically downloads webpages into text files. How would this be best accomplished? Thanks!
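Whichever server it ends up on, the core of such a script can stay very small. A minimal, stdlib-only sketch of "download webpages into text files" (the URL list and the `url_to_filename` naming scheme are illustrative assumptions, not details from the poster's actual script):

```python
# Sketch: fetch a URL and save its body to a flat .txt file.
# url_to_filename() is an assumed naming scheme for illustration.
import re
import urllib.request


def url_to_filename(url: str) -> str:
    """Map a URL to a safe flat file name, e.g.
    'https://example.com/a/b' -> 'example.com_a_b.txt'."""
    stripped = re.sub(r"^https?://", "", url).rstrip("/")
    safe = re.sub(r"[^A-Za-z0-9._-]+", "_", stripped)
    return safe + ".txt"


def download(url: str) -> None:
    """Fetch one page and write its raw bytes to disk."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        body = resp.read()
    with open(url_to_filename(url), "wb") as f:
        f.write(body)
```

Looping `download()` over a URL list is all the "deployment" a plain VPS needs (e.g. run it under `nohup` or `tmux` so it survives the SSH session), though the answer below suggests a more scalable route.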

Recommended answer


Since you said that performance is a problem and you are doing web scraping, the first thing to try is the Scrapy framework - a very fast and easy-to-use web-scraping framework. The scrapyd tool lets you distribute the crawling - you can run multiple scrapyd services on different servers and split the load between them. See:

  • Distributed crawls
  • Running Scrapy on Amazon EC2


There is also a Scrapy Cloud service out there:



Scrapy Cloud bridges the highly efficient Scrapy development environment with a robust, fully-featured production environment to deploy and run your crawls. It's like a Heroku for Scrapy, although other technologies will be supported in the near future. It runs on top of the Scrapinghub platform, which means your project can scale on demand, as needed.

