How to run requests.get asynchronously in Python 3 using asyncio?
I'm trying to create a simple web monitoring script that sends GET requests to the URLs in a list periodically and asynchronously. Here is my request function:
def request(url,timeout=10):
    try:
        response = requests.get(url,timeout=timeout)
        response_time = response.elapsed.total_seconds()
        if response.status_code in (404,500):
            response.raise_for_status()
        html_response = response.text
        soup = BeautifulSoup(html_response,'lxml')
        # process page here
        logger.info("OK {}. Response time: {} seconds".format(url,response_time))
    except requests.exceptions.ConnectionError:
        logger.error('Connection error. {} is down. Response time: {} seconds'.format(url,response_time))
    except requests.exceptions.Timeout:
        logger.error('Timeout. {} not responding. Response time: {} seconds'.format(url,response_time))
    except requests.exceptions.HTTPError:
        logger.error('HTTP Error. {} returned status code {}. Response time: {} seconds'.format(url,response.status_code, response_time))
    except requests.exceptions.TooManyRedirects:
        logger.error('Too many redirects for {}. Response time: {} seconds'.format(url,response_time))
    except:
        logger.error('Content requirement not found for {}. Response time: {} seconds'.format(url,response_time))
And here is where I call this function for all URLs:
def async_requests(delay,urls):
    for url in urls:
        async_task = make_async(request,delay,url,10)
        loop.call_soon(delay,async_task)
    try:
        loop.run_forever()
    finally:
        loop.close()
The delay argument is the loop interval, which describes how often the function needs to be executed. To run request in a loop, I created something like this:
def make_async(func,delay,*args,**kwargs):
    def wrapper(*args, **kwargs):
        func(*args, **kwargs)
        loop.call_soon(delay, wrapper)
    return wrapper
Every time I execute async_requests, I get this error for each URL:
Exception in callback 1.0(<function mak...x7f1d48dd1730>)
handle: <Handle 1.0(<function mak...x7f1d48dd1730>)>
Traceback (most recent call last):
  File "/usr/lib/python3.5/asyncio/events.py", line 125, in _run
    self._callback(*self._args)
TypeError: 'float' object is not callable
Also, the request functions for each URL are not executed periodically as intended. My print call, which comes after async_requests, is not executed either:
async_requests(args.delay,urls)
print("Starting...")
I understand that I'm doing something wrong in my code, but I can't figure out how to solve this problem. I'm a beginner in Python and not very experienced with asyncio. To summarize what I want to achieve:
- Run request asynchronously and periodically for a particular URL without blocking the main thread.
- Run async_requests asynchronously, so that I could, for example, launch a simple HTTP server in the same thread.
except:
This also catches service exceptions like KeyboardInterrupt or SystemExit. Never do this. Instead, write:
except Exception:
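To illustrate the difference, here is a minimal sketch. The helper names (swallow_everything, swallow_errors_only, press_ctrl_c, fail) are hypothetical and exist only for this demonstration. KeyboardInterrupt derives from BaseException rather than Exception, so only the bare clause traps it.

```python
def swallow_everything(action):
    """Anti-pattern: a bare except also traps KeyboardInterrupt and SystemExit."""
    try:
        action()
    except:  # shown only to demonstrate the problem
        return "swallowed"
    return "ok"


def swallow_errors_only(action):
    """Preferred: ordinary errors are handled, Ctrl+C still propagates."""
    try:
        action()
    except Exception:
        return "handled"
    return "ok"


def press_ctrl_c():
    # Hypothetical stand-in for the user interrupting the program.
    raise KeyboardInterrupt


def fail():
    # Hypothetical stand-in for an ordinary runtime failure.
    raise ValueError("ordinary error")
```

With the bare clause, a long-running monitoring loop may become impossible to stop with Ctrl+C, because the interrupt is logged as if it were a failed request.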
How to run requests.get asynchronously in Python 3 using asyncio?
requests.get is blocking by nature.
You should either find an async alternative to requests, such as the aiohttp module:
import aiohttp

async def get(url):
    # The session and the response are both closed when their blocks exit.
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as resp:
            return await resp.text()
or run requests.get in a separate thread and await its result using loop.run_in_executor:
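import asyncio
from concurrent.futures import ThreadPoolExecutor

import requests

executor = ThreadPoolExecutor(2)

async def get(url):
    loop = asyncio.get_event_loop()
    # requests.get runs in a worker thread; the event loop stays responsive.
    response = await loop.run_in_executor(executor, requests.get, url)
    return response.text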
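As for the errors in the question: loop.call_soon(callback, *args) expects the callback as its first argument, so passing delay first is what produces "'float' object is not callable"; the call that takes a delay is loop.call_later(delay, callback, *args). In addition, loop.run_forever() blocks until the loop is stopped, which is why the print after async_requests never runs. A periodic monitor can be sketched with run_in_executor and asyncio.sleep instead. This is a rough sketch, not a definitive implementation: blocking_check is a hypothetical stand-in for the blocking request function, the rounds parameter exists only so the example terminates, and asyncio.run and asyncio.get_running_loop assume Python 3.7 or later.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_check(url):
    """Hypothetical stand-in for the blocking request(url) function."""
    time.sleep(0.05)  # simulates network latency
    return "OK {}".format(url)


async def monitor(url, delay, executor, results, rounds):
    """Check one URL `rounds` times, pausing `delay` seconds after each check."""
    loop = asyncio.get_running_loop()
    for _ in range(rounds):
        # The blocking call runs in a worker thread, so the loop stays free.
        result = await loop.run_in_executor(executor, blocking_check, url)
        results.append(result)
        await asyncio.sleep(delay)  # non-blocking pause between checks


async def main(urls, delay=0.1, rounds=2):
    results = []
    with ThreadPoolExecutor(max_workers=4) as executor:
        # One monitor coroutine per URL, all running concurrently.
        await asyncio.gather(
            *(monitor(url, delay, executor, results, rounds) for url in urls)
        )
    return results


if __name__ == "__main__":
    print(asyncio.run(main(["http://a.example", "http://b.example"])))
```

In a real monitor the for loop would typically become while True, and other coroutines (an HTTP server, for instance) could be passed to the same asyncio.gather call so that everything shares one thread and one event loop.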