Python Requests Stream Data from API


Problem Description

Use Case: I am trying to connect to a streaming API, ingest its events, filter them, and save the relevant ones.

Issue: My code works well until about the 1100th response. After that point the code doesn't crash, but it seems to stop pulling data from the stream. I am guessing it is some sort of buffer issue, but honestly streaming is new to me and I have no idea what is causing it.

Code

import requests
def stream():
    s = requests.Session()
    r = s.get(url, headers=headers, stream=True)
    for line in r.iter_lines():
        if line:
            print(line)

I have also tried this without a session object and I get the same result.

Is there a parameter I am overlooking or a concept I am not aware of? I have scoured the docs and the web, and nothing is jumping out at me.

Any help is much appreciated.

EDIT: Everything looks correct on my end. I think the stream just generates a ton of events upon initial connection and then slows way down. The issue now, however, is that after just a few minutes connected I get this error:

Traceback (most recent call last):
  File "C:\Users\joe\PycharmProjects\proj\venv\lib\site-packages\urllib3\response.py", line 572, in _update_chunk_length
    self.chunk_left = int(line, 16)
ValueError: invalid literal for int() with base 16: b''

Recommended Answer

Follow the guidelines in the "Body Content Workflow" section of the requests library documentation for streaming data.

Example approach:

import requests

def get_stream(url):
    s = requests.Session()

    with s.get(url, headers=None, stream=True) as resp:
        for line in resp.iter_lines():
            if line:
                print(line)

url = 'https://jsonplaceholder.typicode.com/posts/1'
get_stream(url)

Output:

b'{'
b'  "userId": 1,'
b'  "id": 1,'
b'  "title": "sunt aut facere repellat provident occaecati excepturi optio reprehenderit",'
b'  "body": "quia et suscipit\\nsuscipit recusandae consequuntur expedita et cum\\nreprehenderit molestiae ut ut quas totam\\nnostrum rerum est autem sunt rem eveniet architecto"'
b'}'
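
Regarding the ValueError in the traceback from the question: as far as I know, requests surfaces this kind of broken chunked response from iter_lines() as requests.exceptions.ChunkedEncodingError, and it usually means the server or a proxy dropped the connection mid-stream. One common way to cope, which is not something the original code did, is to reconnect with a backoff and to set a read timeout so a silent stall does not block forever. The following is only a minimal sketch: stream_with_reconnect, requests_lines, and every parameter value are hypothetical names and numbers chosen for illustration, and whether the API lets you resume without losing events is an open assumption.

```python
import time


def stream_with_reconnect(open_lines, handle, retry_on,
                          max_retries=5, backoff=1.0, sleep=time.sleep):
    """Call handle(line) for each non-empty line from open_lines().

    open_lines is a zero-argument callable returning an iterable of lines.
    If an exception listed in retry_on is raised, the stream is reopened
    after an exponentially growing delay, up to max_retries consecutive
    failures. (All names here are hypothetical, not part of requests.)
    """
    failures = 0
    while True:
        try:
            for line in open_lines():
                if line:  # skip keep-alive blank lines
                    handle(line)
                    failures = 0  # progress was made, so reset the counter
            return  # the stream ended cleanly
        except retry_on:
            failures += 1
            if failures > max_retries:
                raise  # give up and surface the last error
            sleep(backoff * 2 ** (failures - 1))


def requests_lines(url, headers=None, timeout=(5, 60)):
    """Yield raw lines from a streaming HTTP response via requests.

    timeout=(connect, read) in seconds is an assumed value; pick one that
    is longer than the quietest period you expect from the API.
    """
    import requests  # third-party; imported here so the helper above stays dependency-free

    with requests.get(url, headers=headers, stream=True, timeout=timeout) as resp:
        resp.raise_for_status()
        yield from resp.iter_lines()


# Possible usage (url and headers are whatever your API needs):
#
#   import requests
#   stream_with_reconnect(
#       lambda: requests_lines(url, headers),
#       print,
#       retry_on=(requests.exceptions.ChunkedEncodingError,
#                 requests.exceptions.ConnectionError),
#   )
```

Passing the opener and the exception types in as arguments keeps the retry loop independent of requests, so it can be exercised with a fake stream before pointing it at the real API.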
