Processing large JSON files in PHP
I am trying to process fairly large JSON files (possibly up to 200 MB). Each file is essentially an array of objects.
So something along the lines of:
[
{"property":"value", "property2":"value2"},
{"prop":"val"},
...
{"foo":"bar"}
]
Each object has arbitrary properties and does not necessarily share them with the other objects in the array.
I want to process each object in the array, and because the file is potentially huge, I cannot slurp the whole file into memory, decode the JSON, and iterate over the resulting PHP array.
Ideally, I would like to read the file incrementally, fetch just enough data for one object, and process it. A SAX-type approach would be fine if a comparable library exists for JSON.
Any suggestions on how best to deal with this problem?
I decided to work on an event-based parser. It is not quite done yet; I will edit the question with a link to my work once I have a satisfactory version.
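To illustrate the kind of incremental processing described above, here is a minimal sketch. It is not the JSONParser code and not a full event-based (SAX-style) parser: it is a simpler chunked scanner that cuts the stream into top-level objects and hands each one to `json_decode`. The function name `streamJsonObjects` and the `$chunkSize` parameter are made up for this example, and the sketch assumes the top-level array contains only objects and that PHP 7.3 or later is available (for `JSON_THROW_ON_ERROR`).

```php
<?php
/**
 * Yield each top-level object of a JSON array of objects, one at a time.
 * Hypothetical helper for illustration; assumes the outer array holds only objects.
 */
function streamJsonObjects(string $path, int $chunkSize = 8192): Generator
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $buffer = '';      // raw text of the object currently being collected
    $depth = 0;        // nesting depth of braces inside the current object
    $inString = false; // true while inside a JSON string literal
    $escaped = false;  // true right after a backslash inside a string

    try {
        while (!feof($handle)) {
            $chunk = fread($handle, $chunkSize);
            if ($chunk === false) {
                break; // read error: stop instead of looping forever
            }
            $length = strlen($chunk);
            for ($i = 0; $i < $length; $i++) {
                $char = $chunk[$i];
                if ($depth === 0 && $char !== '{') {
                    continue; // skip "[", "]", commas and whitespace between objects
                }
                $buffer .= $char;
                if ($inString) {
                    // Braces inside strings must not affect the depth counter.
                    if ($escaped) {
                        $escaped = false;
                    } elseif ($char === '\\') {
                        $escaped = true;
                    } elseif ($char === '"') {
                        $inString = false;
                    }
                    continue;
                }
                if ($char === '"') {
                    $inString = true;
                } elseif ($char === '{') {
                    $depth++;
                } elseif ($char === '}') {
                    $depth--;
                    if ($depth === 0) {
                        // One complete object collected: decode only this piece.
                        yield json_decode($buffer, true, 512, JSON_THROW_ON_ERROR);
                        $buffer = '';
                    }
                }
            }
        }
    } finally {
        fclose($handle);
    }
}
```

Looping with `foreach (streamJsonObjects($path) as $object)` would then hand over one associative array per iteration, so only the current object and one read chunk are held in memory. Appending byte by byte is slow in PHP, so this is meant to show the idea and is not tuned for 200 MB inputs.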
EDIT:
I finally worked out a version of the parser that I am satisfied with. It's available on GitHub:
https://github.com/kuma-giyomu/JSONParser
There is probably room for improvement, and feedback is welcome.