Fetching multiple urls with aiohttp in Python 3.5

Views: 116  Published: 2020/6/2 21:15:03  Tags: python, python-3.x, web-scraping, python-asyncio, aiohttp

Question

Since Python 3.5 introduced async with, the syntax recommended in the docs for aiohttp has changed. To get a single URL, they now suggest:

import aiohttp
import asyncio

async def fetch(session, url):
    with aiohttp.Timeout(10):
        async with session.get(url) as response:
            return await response.text()

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    with aiohttp.ClientSession(loop=loop) as session:
        html = loop.run_until_complete(
            fetch(session, 'http://python.org'))
        print(html)

How can I modify this to fetch a collection of URLs instead of just one URL?

In the old asyncio examples you would set up a list of tasks such as:

tasks = [
    fetch(session, 'http://cnn.com'),
    fetch(session, 'http://google.com'),
    fetch(session, 'http://twitter.com')
]

I tried to combine a list like this with the approach above but failed.

Recommended answer

For parallel execution you need an asyncio.Task (https://docs.python.org/3/library/asyncio-task.html#task).
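
To illustrate why a Task matters, here is a minimal sketch that needs no network access. It assumes a hypothetical fake_fetch coroutine (not part of aiohttp) that only sleeps, and it records the order of events in a list. Awaiting the coroutines one by one should run them back to back, while wrapping them in Tasks should let both start before either finishes.

```python
import asyncio

async def fake_fetch(name, delay, log):
    # Hypothetical stand-in for a network request: it only sleeps for `delay` seconds.
    log.append('start ' + name)
    await asyncio.sleep(delay)
    log.append('end ' + name)
    return name

async def sequential():
    # Awaiting coroutines one by one: 'b' starts only after 'a' has ended.
    log = []
    await fake_fetch('a', 0.05, log)
    await fake_fetch('b', 0.01, log)
    return log

async def concurrent():
    # Wrapping each coroutine in a Task schedules both before either one ends.
    log = []
    tasks = [asyncio.create_task(fake_fetch('a', 0.05, log)),
             asyncio.create_task(fake_fetch('b', 0.01, log))]
    await asyncio.gather(*tasks)
    return log

if __name__ == '__main__':
    print(asyncio.run(sequential()))
    print(asyncio.run(concurrent()))
```

In the sequential log, 'end a' appears before 'start b'. In the concurrent log, both 'start' entries come first, which is the behaviour wanted for fetching several URLs at once.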

I've converted your example to fetch data concurrently from several sources:

import aiohttp
import asyncio

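async def fetch(session, url):
    async with session.get(url) as response:
        # raise_for_status() raises ClientResponseError for 4xx/5xx statuses
        if response.status != 200:
            response.raise_for_status()
        return await response.text()

async def fetch_all(session, urls):
    tasks = []
    for url in urls:
        # Wrap each coroutine in a Task so the requests run concurrently
        # (asyncio.create_task() requires Python 3.7+)
        task = asyncio.create_task(fetch(session, url))
        tasks.append(task)
    # gather() waits for all tasks and returns results in the order of urls
    results = await asyncio.gather(*tasks)
    return results

async def main():
    urls = ['http://cnn.com',
            'http://google.com',
            'http://twitter.com']
    # One session is shared by all requests
    async with aiohttp.ClientSession() as session:
        htmls = await fetch_all(session, urls)
        print(htmls)

if __name__ == '__main__':
    # asyncio.run() requires Python 3.7+
    asyncio.run(main())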
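
Note that asyncio.create_task() and asyncio.run(), used in the answer's code, were both added in Python 3.7. On Python 3.5 or 3.6, as in the question title, roughly the same structure can be sketched with asyncio.ensure_future() and an explicitly managed event loop. To keep the sketch runnable without network access, fake_fetch is a hypothetical stand-in for fetch(session, url), and run is a hypothetical helper name; neither is part of aiohttp or asyncio.

```python
import asyncio

async def fake_fetch(url):
    # Hypothetical stand-in for fetch(session, url): no real request is made.
    await asyncio.sleep(0.01)
    return 'body of ' + url

async def fetch_all(urls):
    # asyncio.ensure_future() is the pre-3.7 way to wrap a coroutine in a Task.
    tasks = [asyncio.ensure_future(fake_fetch(url)) for url in urls]
    # gather() returns the results in the same order as the input urls.
    return await asyncio.gather(*tasks)

def run(coro):
    # Pre-3.7 substitute for asyncio.run(): create, use and close a loop explicitly.
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        return loop.run_until_complete(coro)
    finally:
        asyncio.set_event_loop(None)
        loop.close()

if __name__ == '__main__':
    urls = ['http://cnn.com', 'http://google.com', 'http://twitter.com']
    print(run(fetch_all(urls)))
```

With real requests, fake_fetch(url) would be replaced by fetch(session, url), and the aiohttp.ClientSession would be opened inside a coroutine passed to run.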

