How to read a file that is continuously being updated by appending lines?


Question

In my terminal I am running:

curl --user dhelm:12345 https://stream.twitter.com/1.1/statuses/sample.json > raw-data.txt

curl's output is live streaming Twitter data, which is being written to the file raw-data.txt.

In Python:

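import json

posts = []

# Read the file line by line; each line should hold one JSON document.
with open("/Users/me/raw-data.txt") as f:
    for line in f:
        try:
            posts.append(json.loads(line))
        except ValueError:
            # Skip blank or incomplete lines that are not valid JSON.
            pass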

I am reading the file in Python, decoding each line with the JSON decoder, and appending the results to posts.

Now, the issue is that I don't want my program to end when the Python script reaches the end of the file. Instead, I want it to continue reading when the curl running in my terminal appends more posts to raw-data.txt.

Recommended answer

I think this is an XY problem. Because you couldn't think of a way to stream an HTTP request line by line from within Python, you decided to use curl to do a streaming download to a file, and then read that file from within Python. Because you did that, you have to deal with the possibility of running into EOF while the request is still going, just because you've caught up to curl. So you're making things harder on yourself for no reason.
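To make that EOF problem concrete, here is a rough sketch of what following the growing file yourself might involve. It polls for new data, waits when it has caught up to the writer, and buffers a partially written last line until its newline arrives. The names follow and iter_posts, and the poll_interval and max_idle_polls parameters, are hypothetical helpers for this illustration, not part of any library.

```python
import json
import time


def follow(path, poll_interval=0.5, max_idle_polls=None):
    """Yield complete lines from a file that another process keeps appending to.

    poll_interval and max_idle_polls are hypothetical knobs for this sketch:
    with max_idle_polls=None the generator never stops on its own.
    """
    idle = 0
    buffer = ""
    with open(path, "r", encoding="utf-8") as f:
        while True:
            chunk = f.readline()
            if chunk:
                idle = 0
                buffer += chunk
                # Only hand out a line once the writer has finished it.
                if buffer.endswith("\n"):
                    yield buffer
                    buffer = ""
                continue
            # We have caught up with the writer: wait and try again.
            idle += 1
            if max_idle_polls is not None and idle >= max_idle_polls:
                return
            time.sleep(poll_interval)


def iter_posts(path, **follow_kwargs):
    """Decode each followed line as JSON, skipping lines that do not parse."""
    for line in follow(path, **follow_kwargs):
        try:
            yield json.loads(line)
        except ValueError:
            continue
```

Something like `for post in iter_posts("/Users/me/raw-data.txt"): ...` would then keep running for as long as curl keeps writing. Polling, partial lines, and deciding when to give up are exactly the extra work that reading from the HTTP response directly avoids.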

While streaming downloads can be done with the stdlib, it's a bit painful; the requests library makes it a lot easier. So, let's use that:

import json
import requests

posts = []
url = 'https://stream.twitter.com/1.1/statuses/sample.json'
# stream=True keeps the connection open instead of downloading the whole body first.
r = requests.get(url, auth=('dhelm', '12345'), stream=True)
for line in r.iter_lines():
    try:
        posts.append(json.loads(line))
    except ValueError:
        # Skip empty or otherwise non-JSON lines.
        pass

That's the whole program.
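For comparison, the stdlib route mentioned earlier might look roughly like the sketch below. It assumes the endpoint accepts HTTP Basic authentication, as the curl command implies, and that blank lines in the stream can be ignored. The function names basic_auth_header, parse_json_lines, and stream_posts are hypothetical, chosen only for this illustration.

```python
import base64
import json
import urllib.request


def basic_auth_header(user, password):
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def parse_json_lines(lines):
    """Yield one decoded object per non-empty line that is valid JSON."""
    for line in lines:
        if not line.strip():
            continue  # assumption: blank lines carry no data
        try:
            yield json.loads(line)
        except ValueError:
            continue


def stream_posts(url, user, password):
    """Open a GET request and yield decoded posts as lines arrive."""
    request = urllib.request.Request(
        url, headers={"Authorization": basic_auth_header(user, password)}
    )
    with urllib.request.urlopen(request) as response:
        # The response object is file-like and can be iterated line by line.
        yield from parse_json_lines(response)
```

Usage would be along the lines of `for post in stream_posts(url, 'dhelm', '12345'): posts.append(post)`. It works, but building the auth header by hand is the kind of detail requests takes care of for you.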
