How to search the internet with Python?


Problem description

I want to write a program that searches through a fairly large website and extracts certain things. I've taken a couple of online Python courses, but neither said anything about how to access the internet with Python. I have no idea where I ought to start with this.

Recommended answer

First, read about urllib2, the standard Python library for opening URLs. (In Python 3 its functionality moved to urllib.request and urllib.error.)
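As a rough illustration of that starting point, here is a minimal sketch that uses only the standard library, written for Python 3 (so urllib.request stands in for urllib2). The names `fetch`, `extract_links` and `LinkExtractor`, and the User-Agent string, are made up for this example and are not part of the original answer.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag seen while parsing."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return the href values found in an HTML string, in document order."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links


def fetch(url, timeout=10):
    """Download a page and return its body as text."""
    # Hypothetical User-Agent value; some sites reject the default one.
    request = Request(url, headers={"User-Agent": "example-fetcher/0.1"})
    with urlopen(request, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling `extract_links(fetch("https://example.com/"))` would then return the links on that page. The "certain things" you want to extract will differ, but the pattern of downloading a page and then parsing it stays the same.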

Once you are comfortable with the basic ideas behind that library, try requests, which makes it much easier to interact with the web, especially with APIs. I suggest using httpie alongside it to test queries quickly from the command line.
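To give a feel for how little code requests needs, here is a small sketch that queries a JSON API. The helper name `search_json`, the `q` query parameter and the optional `session` argument are assumptions made for this example; a real API will document its own parameters.

```python
def search_json(url, query, session=None, timeout=10):
    """GET `url` with ?q=<query> and return the decoded JSON body."""
    if session is None:
        # Imported here so the helper can also be driven by a stand-in session.
        import requests  # third-party package: pip install requests

        session = requests.Session()
    # requests builds and URL-encodes the query string from `params`.
    response = session.get(url, params={"q": query}, timeout=timeout)
    response.raise_for_status()  # turn 4xx/5xx replies into exceptions
    return response.json()
```

Passing the session in is optional. It mainly makes the function easy to try out without touching the network, and it lets several calls reuse one connection.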

If you go a little further and build a library or an engine to crawl the web, you will need some form of asynchronous programming. I recommend starting with Gevent.
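The reason concurrency matters is that a crawler spends most of its time waiting on the network, so overlapping many downloads saves a lot of time. The sketch below is not Gevent: to stay dependency-free it uses a thread pool from the standard library's concurrent.futures, which illustrates the same idea in a simpler form. The function name `fetch_all` and its `fetch` parameter (any callable that downloads one URL) are assumptions for this example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_all(urls, fetch, max_workers=8):
    """Call fetch(url) for every URL concurrently.

    Returns a dict mapping each URL to either the fetched value or the
    exception raised while fetching it.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Remember which future belongs to which URL.
        futures = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:  # keep going when one page fails
                results[url] = exc
    return results
```

Gevent reaches a similar result with lightweight greenlets instead of operating-system threads, which tends to scale better once the number of simultaneous downloads grows large.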

Finally, if you want to create a crawler/bot, take a look at Scrapy. Start with the basic libraries before diving into it, though, as it can get quite complex.
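Before reaching for Scrapy, it may help to see the core loop that any crawler runs. The following is a plain standard-library sketch, not Scrapy code: a breadth-first walk that stays on one host. The function name `crawl` and its `fetch`, `extract_links` and `max_pages` parameters are assumptions for this example; the two callables stand for whatever download and link-parsing code you already have.

```python
from collections import deque
from urllib.parse import urldefrag, urljoin, urlparse


def crawl(start_url, fetch, extract_links, max_pages=50):
    """Breadth-first crawl limited to the host of start_url.

    fetch(url) returns the page text; extract_links(html) returns an
    iterable of href strings. Returns the URLs visited, in visiting order.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}  # every URL ever queued, to avoid loops
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to download
        visited.append(url)
        for href in extract_links(html):
            # Resolve relative links and drop "#fragment" parts.
            link, _ = urldefrag(urljoin(url, href))
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

Scrapy handles this bookkeeping for you and adds things such as request scheduling, throttling and export pipelines, which is where most of its complexity comes from.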
