Running URL Requests in Parallel with Flask


Problem Description

asyncio is still relatively new to me.

I am starting with the basics - a simple HTTP hello world - just making approximately 40 parallel GET requests and fetching the first 400 characters of each HTTP response, using Flask (the "parallel" function is invoked per request).

It is running on Python 3.7.

The traceback shows an error I don't understand. Which "Constructor parameter should be str" is it referring to? How should I proceed?

This is the entire code of the app:

import aiohttp
import asyncio
import json

async def fetch(session, url):
    async with session.get(url) as response:
        return await response.text()

async def main():
    global urls
    tasks = []
    async with aiohttp.ClientSession() as session:
        for url in urls:
            tasks.append(fetch(session, url))
        htmls = await asyncio.gather(*tasks)
        returnstring = ""
        for html in htmls:
            returnstring += html + ","
            print(html[:400])
        return returnstring


def parallel(request):
    global urls
    urls = []
    request_json = request.get_json()
    if request_json and 'urls' in request_json:
        urls = request_json['urls']
        print(urls)

    loop = asyncio.get_event_loop()
    return loop.run_until_complete(main())

The traceback shows the error:

Traceback (most recent call last):
  File "/env/local/lib/python3.7/site-packages/google/cloud/functions/worker.py", line 346, in run_http_function
    result = _function_handler.invoke_user_function(flask.request)
  File "/env/local/lib/python3.7/site-packages/google/cloud/functions/worker.py", line 217, in invoke_user_function
    return call_user_function(request_or_event)
  File "/env/local/lib/python3.7/site-packages/google/cloud/functions/worker.py", line 210, in call_user_function
    return self._user_function(request_or_event)
  File "/user_code/main.py", line 57, in parallel
    return loop.run_until_complete(main())
  File "/opt/python3.7/lib/python3.7/asyncio/base_events.py", line 573, in run_until_complete
    return future.result()
  File "/user_code/main.py", line 15, in main
    htmls = await asyncio.gather(*tasks)
  File "/user_code/main.py", line 6, in fetch
    async with session.get(url) as response:
  File "/env/local/lib/python3.7/site-packages/aiohttp/client.py", line 1012, in __aenter__
    self._resp = await self._coro
  File "/env/local/lib/python3.7/site-packages/aiohttp/client.py", line 380, in _request
    url = URL(str_or_url)
  File "/env/local/lib/python3.7/site-packages/yarl/__init__.py", line 149, in __new__
    raise TypeError("Constructor parameter should be str")
TypeError: Constructor parameter should be str


Answer

I tested it: if I use something other than a string (i.e. a tuple or list) in

session.get( (url, something) ) 

then I get your error. So you have wrong data in urls.

The code I used to test:

import aiohttp
import asyncio

async def fetch(session, url):
    async with session.get(url) as response:
        return await response.text()

async def main(urls):
    tasks = []
    results = []
    async with aiohttp.ClientSession() as session:
        for url in urls:
            tasks.append(fetch(session, url))
        results = await asyncio.gather(*tasks)
    return results

def parallel(urls):
    loop = asyncio.get_event_loop()
    results = loop.run_until_complete(main(urls))
    return results

# --- main ---

urls = [
    #('https://stackoverflow.com/', 1), # TypeError: Constructor parameter should be str
    'https://stackoverflow.com/',
    'https://httpbin.org/',
    'http://toscrape.com/',
]

result = parallel(urls)

for item in result:
    print(item[:300])
    print('-----')

I don't know what you get in request_json['urls'], but you should end up with only URL strings:

 urls = request_json['urls']
 urls = [ ??? for x in urls]  # in place of `???`, use code which gets only the url from `x`
