Merge multiple JSON files into one file by using Python (stream twitter)
I've pulled data from Twitter. Currently, the data is in multiple files and I could not merge it into one single file.
Note: all files are in JSON format.
The code I have used is here and here.
It has been suggested that I use glob to combine the JSON files.
I wrote this code based on some tutorials about merging JSON files with Python:
from glob import glob
import json
import pandas as pd

with open('Desktop/json/finalmerge.json', 'w') as f:
    for fname in glob('Desktop/json/*.json'):  # Reads all JSON files in Desktop/json
        with open(fname) as j:
            f.write(str(j.read()))
            f.write('\n')
I successfully merged all the files, and the result is finalmerge.json.
I then used this, as suggested in several threads:
df_lines = pd.read_json('finalmerge.json', lines=True)
df_lines
1000000 rows × 23 columns
How can I put each feature in a separate column?
I'm not sure what is wrong with the JSON files. I checked the merged file and found that it is not valid JSON. What should I do to load it as a DataFrame?
The reason I am asking is that I have very basic Python knowledge, and all the answers to similar questions that I have found are far more complicated than I can understand. Please help this new Python user convert multiple JSON files into one JSON file.
Thank you
I think the problem is that your files are not really JSON (or rather, they are structured as JSONL, with one JSON object per line). You have two ways of proceeding:
- you could read every file as a text file and merge them line by line
- you could convert them to JSON (wrap the contents of the file in square brackets and put a comma between consecutive JSON elements, with no comma after the last one).
Try following this question and let me know if it solves your problem: Loading JSONL file as JSON objects
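As a rough illustration of the two options above, here is a minimal sketch that uses only the standard library. It assumes each non-empty line of the input holds exactly one JSON object; the function names read_jsonl and jsonl_to_json_array are hypothetical and do not come from the linked question.

```python
import json


def read_jsonl(path):
    """Return a list with one parsed object per non-empty line of a JSONL file."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines between records
                records.append(json.loads(line))
    return records


def jsonl_to_json_array(src_path, dst_path):
    """Rewrite a JSONL file as one valid JSON array and return the record count."""
    records = read_jsonl(src_path)
    with open(dst_path, "w", encoding="utf-8") as f:
        json.dump(records, f)  # json.dump adds the brackets and commas
    return len(records)
```

Letting json.dump write the array avoids placing the brackets and commas by hand, which is easy to get wrong.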
You can also try to edit your code this way:
from glob import glob

with open('finalmerge.json', 'w') as f:
    for fname in glob('Desktop/json/*.json'):
        with open(fname) as j:
            f.write(j.read())
            f.write('\n')
Every line will be a different json element.
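If the output file sits in the same folder as the inputs, the glob pattern also matches the output file itself, and blank lines in the inputs end up in the merged file. The following is a hedged sketch of a slightly more defensive merge, assuming every input is JSONL; merge_jsonl_files is a hypothetical helper name, not something from the original answer.

```python
import glob
import json
import os


def merge_jsonl_files(pattern, out_path):
    """Merge all files matching pattern into out_path, one JSON object per line.

    Returns the number of records written.
    """
    out_abs = os.path.abspath(out_path)
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for fname in sorted(glob.glob(pattern)):
            if os.path.abspath(fname) == out_abs:
                continue  # never read the file we are currently writing
            with open(fname, encoding="utf-8") as src:
                for line in src:
                    line = line.strip()
                    if not line:
                        continue  # drop blank lines
                    json.loads(line)  # raises ValueError if the line is not valid JSON
                    out.write(line + "\n")
                    count += 1
    return count
```

A file merged this way should load with pd.read_json(out_path, lines=True), as in the question. Nested fields in the resulting columns can then usually be flattened with pandas.json_normalize, although which fields need this depends on the data.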