Writing to JSON file, then reading this same file and getting "JSONDecodeError: Extra data"


Problem description

I have a very large JSON file (9 GB). I read one object from it at a time and delete every key-value pair in that object whose key is not in the list fields.

Each object is basically someone's user profile on a job-search website, but it comes with many key-value pairs that are not relevant to my analysis. There are about 3 million of these profiles.

I'd like to write each cleaned profile/object to a JSON file, cleaned.json. Essentially this should be a copy of the original JSON file, except that any key-value pair whose key is not listed in fields has been removed from all 3 million profiles.

To do this, I wrote the following code:

import json

# fields to keep
fields = ["skills", "industry", "summary", "education", "experience"]

with open('cleaned.json', 'w', encoding='UTF8') as f:
    for profile in open(path_to_file, encoding='UTF8'):
        profile = json.loads(profile)

        # remove unwanted fields from profile
        for key in list(profile.keys()):
            if key not in fields:
                del profile[key]

        # write profile to new json file
        json.dump(profile, f)

To test whether it worked, I tried reading the JSON file back in, like so:

for foo in open('cleaned.json', encoding='UTF8'):
    foo = json.loads(foo)
    print(json.dumps(foo, indent=4))

But I'm getting this error on the foo = json.loads(foo) line: JSONDecodeError: Extra data.

I've tested this by modifying only 1 profile from the original JSON and writing this modified profile to cleaned.json, and cleaned.json looks like this (except it's all on one line; I've just pretty-printed it for this post):

{
    "skills": [
        "Key Account Development",
        "Strategic Planning",
        "Market Planning",
        "Team Leadership",
        "Negotiation",
        "Forecasting",
        "Key Account Management",
        "Sales Management",
        "New Business Development",
        "Business Planning",
        "Cross-functional Team Leadership",
        "Budgeting",
        "Strategy Development",
        "Business Strategy",
        "Consultative Selling",
        "Medical Devices",
        "Customer Relations",
        "Contract Negotiation",
        "Mentoring",
        "Coaching",
        "Healthcare",
        "Territory",
        "Sales Process",
        "Direct Sales",
        "Sales Operations",
        "Pharmaceutical Sales"
    ],
    "industry": "Medical Devices",
    "summary": "SALES MANAGEMENT / BUSINESS DEVELOPMENT / PROJECT MANAGEMENTDOMESTIC & INTERNATIONAL KEY ACCOUNT MANAGEMENTBusiness and Sales Executive with 20 years of accomplished career track, reflecting extensive experience and dynamic record-breaking performance in the Medical Industry markets. Exceptional communicator, strong team player, flexible self-starter with consultative sales style, strong negotiations skills, exceptional problem solving abilities, and accurate customer assessment aptitude. Manage and lead teams to success, drive new business through key accounts management, establish partnerships, manage solid distributor relationship for increased profitability and sales volumes. Very well organized, accurate and on-time administrative work, with a track record that demonstrates self-motivation, creativity, sales team leadership, initiative to achieve corporate, team and personal goals. Experience in the following markets: Medical Devices, Medical Disposables, Capital Equipment, Pharmaceuticals."
}{
    "education": [
        {
            "start": "2008",
            "major": "Economics",
            "end": "2008",
            "name": "Columbia University - Columbia Business School",
            "desc": "Coursework \"Principals of Economics\" ECON1105\tSpring 2008"
        },
        {
            "start": "2007",
            "end": "2007",
            "name": "Columbia University - Columbia Business School"
        },
        {
            "major": "Cancer genomics",
            "end": "2001",
            "name": "G\u00f6teborgs universitet",
            "degree": "Ph.D.",
            "start": "1996",
            "desc": "Thesis: \"The role of p53 in tumor progression and prognosis in patients with primary colorectal cancer\""
        },
        {
            "start": "1994",
            "major": "Biology, Medicine;German Language",
            "end": "1995",
            "name": "Universit\u00e4t Regensburg",
            "degree": "Cancer Research, Coursework"
        },
        {
            "major": "Biology",
            "end": "1994",
            "name": "G\u00f6teborgs universitet",
            "degree": "Master",
            "start": "1989",
            "desc": ""
        },
        {
            "start": "1992",
            "major": "50% Biology and Medicine, 50% mixed music, sports, computer science, art etc",
            "end": "1993",
            "name": "The University of Georgia",
            "desc": "Scholarship for one full year of Graduate Studies."
        }
    ],
    "skills": [
        "Molecular Biology",
        "Biomarkers"
    ],
    "industry": "Pharmaceuticals",
    "experience": [
        {
            "org": "Johnson and Johnson",
            "title": "Senior Scientist, Oncology Biomarkers",
            "end": "Present",
            "start": "November 2009",
            "desc": "Biomarker Leader for compounds in clinical development.*Developing and implementing predictive and pharmacodynamic biomarkers for the use in Phase 0 - III oncology clinical trials.."
        },
        {
            "org": "Albert Einstein Medical Center",
            "title": "Associate at Dept of Molecular Genetics",
            "start": "September 2008",
            "desc": "Single Cell Gene expression."
        },
        {
            "org": "Columbia University",
            "title": "Associate Research Scientist",
            "start": "August 2006",
            "desc": "Work on peptide to restore wt p53 function in cancer."
        },
        {
            "org": "Memorial Sloan Kettering Cancer Center",
            "title": "Post Doctoral Research Fellow",
            "start": "January 2003",
            "desc": "Molecular profiling of colorectal cancer."
        },
        {
            "org": "Sahlgrenska University Hospital",
            "title": "Research Scientist",
            "start": "November 2001",
            "desc": "Cancer Research at Dept of Surgery.Molecular profiling of Colorectal Cancer with focus on p53."
        }
    ],
    "summary": "Ph.D. scientist with background in cancer research, translational medicine and early drug development with special focus on biomarkers and personalized medicine."
}

So when I read this in, I get the error. What am I doing wrong? I guess there is something wrong with the way I'm writing the profiles to cleaned.json?

Sample input for testing

Sample input has 3 profiles.

{"_id": "in-00000001", "name": {"family_name": "Mazalu MBA", "given_name": "Dr Catalin"}, "locality": "United States", "skills": ["Key Account Development", "Strategic Planning", "Market Planning", "Team Leadership", "Negotiation", "Forecasting", "Key Account Management", "Sales Management", "New Business Development", "Business Planning", "Cross-functional Team Leadership", "Budgeting", "Strategy Development", "Business Strategy", "Consultative Selling", "Medical Devices", "Customer Relations", "Contract Negotiation", "Mentoring", "Coaching", "Healthcare", "Territory", "Sales Process", "Direct Sales", "Sales Operations", "Pharmaceutical Sales"], "industry": "Medical Devices", "summary": "SALES MANAGEMENT / BUSINESS DEVELOPMENT / PROJECT MANAGEMENTDOMESTIC & INTERNATIONAL KEY ACCOUNT MANAGEMENTBusiness and Sales Executive with 20 years of accomplished career track, reflecting extensive experience and dynamic record-breaking performance in the Medical Industry markets. Exceptional communicator, strong team player, flexible self-starter with consultative sales style, strong negotiations skills, exceptional problem solving abilities, and accurate customer assessment aptitude. Manage and lead teams to success, drive new business through key accounts management, establish partnerships, manage solid distributor relationship for increased profitability and sales volumes. Very well organized, accurate and on-time administrative work, with a track record that demonstrates self-motivation, creativity, sales team leadership, initiative to achieve corporate, team and personal goals. 
Experience in the following markets: Medical Devices, Medical Disposables, Capital Equipment, Pharmaceuticals.", "url": "http://www.linkedin.com/in/00000001", "also_view": [{"url": "http://www.linkedin.com/pub/krisa-drost/45/909/513", "id": "pub-krisa-drost-45-909-513"}, {"url": "http://ro.linkedin.com/pub/florin-ut/18/b33/77b", "id": "pub-florin-ut-18-b33-77b"}, {"url": "http://ro.linkedin.com/pub/cristian-radu/21/225/149", "id": "pub-cristian-radu-21-225-149"}, {"url": "http://ro.linkedin.com/pub/traian-rusu/16/652/279", "id": "pub-traian-rusu-16-652-279"}, {"url": "http://ro.linkedin.com/pub/dumitrescu-catalin/3/283/92", "id": "pub-dumitrescu-catalin-3-283-92"}, {"url": "http://www.linkedin.com/pub/jody-brelsford/9/21a/354", "id": "pub-jody-brelsford-9-21a-354"}, {"url": "http://www.linkedin.com/pub/mary-anne-dilloway/2/55a/18", "id": "pub-mary-anne-dilloway-2-55a-18"}, {"url": "http://ro.linkedin.com/pub/carmen-baleanu/2b/252/203", "id": "pub-carmen-baleanu-2b-252-203"}, {"url": "http://il.linkedin.com/in/shimonlobel", "id": "in-shimonlobel"}, {"url": "http://ro.linkedin.com/pub/monica-danilescu/19/36a/121", "id": "pub-monica-danilescu-19-36a-121"}]}
{"_id": "in-00001", "education": [{"start": "2008", "major": "Economics", "end": "2008", "name": "Columbia University - Columbia Business School", "desc": "Coursework \"Principals of Economics\" ECON1105\tSpring 2008"}, {"start": "2007", "end": "2007", "name": "Columbia University - Columbia Business School"}, {"major": "Cancer genomics", "end": "2001", "name": "G\u00f6teborgs universitet", "degree": "Ph.D.", "start": "1996", "desc": "Thesis: \"The role of p53 in tumor progression and prognosis in patients with primary colorectal cancer\""}, {"start": "1994", "major": "Biology, Medicine;German Language", "end": "1995", "name": "Universit\u00e4t Regensburg", "degree": "Cancer Research, Coursework"}, {"major": "Biology", "end": "1994", "name": "G\u00f6teborgs universitet", "degree": "Master", "start": "1989", "desc": ""}, {"start": "1992", "major": "50% Biology and Medicine, 50% mixed music, sports, computer science, art etc", "end": "1993", "name": "The University of Georgia", "desc": "Scholarship for one full year of Graduate Studies."}], "group": {"affilition": ["ASMALLWORLD.net", "Biomarker Research & Executive Network", "Biomarker Society", "Biomarkers", "Biomarkers in Discovery, Development and the Clinic Network", "Biotechnology/Pharmaceuticals", "Circulating Tumor Cell (CTC) and Cancer Stem Cell Group", "Clinical Development Job Opportunities - Europe", "Epigenetics", "Molecular Diagnostics Professional Network", "Molecular Diagnostics for Cancer Drug Development Forum", "NYC Women in Biotech", "Oncology Drug Development (Premier Group For Cancer Drug Development)", "Oncology Pharma\u2122", "Personalized Medicine", "Personalized Oncology Medicine - Global Group", "Professionals in the Pharmaceutical and Biotech Industry", "Svenskar i New York", "Translational Medicine Alliance"]}, "name": {"family_name": "Forslund", "given_name": "Ann"}, "overview_html": "<dl id=\"overview\"><dt id=\"overview-summary-current-title\" class=\"summary-current\" 
style=\"display:block\">\nCurrent\n</dt>\n<dd class=\"summary-current\" style=\"display:block\">\n<ul class=\"current\"><li>\nSenior Scientist, Oncology Biomarkers\n<span class=\"at\">at </span>\n<a class=\"company-profile-public\" href=\"/company/johnson-&amp;-johnson?trk=ppro_cprof\"><span class=\"org summary\">Johnson and Johnson</span></a>\n</li>\n</ul></dd>\n<dt id=\"overview-summary-past-title\" class=\"summary-past\" style=\"display:block\">\nPast\n</dt>\n<dd class=\"summary-past\" style=\"display:block\">\n<ul class=\"past\"><li>\nAssociate at Dept of Molecular Genetics\n<span class=\"at\">at </span>\n<a class=\"company-profile-public\" href=\"/company/einstein-medical-center-philadelphia?trk=ppro_cprof\"><span class=\"org summary\">Albert Einstein Medical Center</span></a>\n</li>\n<li>\nAssociate Research Scientist\n<span class=\"at\">at </span>\n<a class=\"company-profile-public\" href=\"/company/columbia-university?trk=ppro_cprof\"><span class=\"org summary\">Columbia University</span></a>\n</li>\n<li>\nPost Doctoral Research Fellow\n<span class=\"at\">at </span>\nMemorial Sloan Kettering Cancer Center\n</li>\n</ul><div class=\"showhide-block\" id=\"morepast\">\n<ul class=\"past\"><li>\nResearch Scientist\n<span class=\"at\">at </span>\n<a class=\"company-profile-public\" href=\"/company/sahlgrenska-university-hospital?trk=ppro_cprof\"><span class=\"org summary\">Sahlgrenska University Hospital</span></a>\n</li>\n</ul><p class=\"seeall showhide-link\"><a href=\"#\" id=\"morepast-hide\">see less</a></p>\n</div>\n<p class=\"seeall showhide-link\"><a href=\"#\" id=\"morepast-show\">see all</a></p>\n</dd>\n<dt id=\"overview-summary-education-title\" class=\"summary-education\" style=\"display:block\">\nEducation\n</dt>\n<dd class=\"summary-education\" style=\"display:block\">\n<ul><li>\nColumbia University - Columbia Business School\n</li>\n<li>\nColumbia University - Columbia Business School\n</li>\n<li>\nG\u00f6teborgs universitet\n</li>\n</ul><div 
class=\"showhide-block\" id=\"moreedu\">\n<ul><li>\n<div name=\"education\">\nUniversit\u00e4t Regensburg\n</div>\n</li>\n<li>\n<div name=\"education\">\nG\u00f6teborgs universitet\n</div>\n</li>\n<li>\n<div name=\"education\">\nThe University of Georgia\n</div>\n</li>\n</ul><p class=\"seeall showhide-link\"><a href=\"#\" id=\"moreedu-hide\">see less</a></p>\n</div>\n<p class=\"seeall showhide-link\"><a href=\"#\" id=\"moreedu-show\">see all</a></p>\n</dd>\n<dt>\nConnections\n</dt>\n<dd class=\"overview-connections\">\n<p>\n<strong>244</strong> connections\n</p>\n</dd>\n</dl>", "locality": "Antwerp Area, Belgium", "skills": ["Molecular Biology", "Biomarkers"], "industry": "Pharmaceuticals", "interval": 20, "experience": [{"org": "Johnson and Johnson", "title": "Senior Scientist, Oncology Biomarkers", "end": "Present", "start": "November 2009", "desc": "Biomarker Leader for compounds in clinical development.*Developing and implementing predictive and pharmacodynamic biomarkers for the use in Phase 0 - III oncology clinical trials.."}, {"org": "Albert Einstein Medical Center", "title": "Associate at Dept of Molecular Genetics", "start": "September 2008", "desc": "Single Cell Gene expression."}, {"org": "Columbia University", "title": "Associate Research Scientist", "start": "August 2006", "desc": "Work on peptide to restore wt p53 function in cancer."}, {"org": "Memorial Sloan Kettering Cancer Center", "title": "Post Doctoral Research Fellow", "start": "January 2003", "desc": "Molecular profiling of colorectal cancer."}, {"org": "Sahlgrenska University Hospital", "title": "Research Scientist", "start": "November 2001", "desc": "Cancer Research at Dept of Surgery.Molecular profiling of Colorectal Cancer with focus on p53."}], "summary": "Ph.D. 
scientist with background in cancer research, translational medicine and early drug development with special focus on biomarkers and personalized medicine.", "url": "http://be.linkedin.com/in/00001", "also_view": [{"url": "http://www.linkedin.com/pub/peter-king/4/993/a16", "id": "pub-peter-king-4-993-a16"}, {"url": "http://www.linkedin.com/pub/hans-winkler/1/1ab/78a", "id": "pub-hans-winkler-1-1ab-78a"}, {"url": "http://de.linkedin.com/pub/michael-koslowski/26/964/99b", "id": "pub-michael-koslowski-26-964-99b"}, {"url": "http://de.linkedin.com/pub/werner-seiz/b/14/436", "id": "pub-werner-seiz-b-14-436"}, {"url": "http://de.linkedin.com/pub/miro-venturi/7/725/217", "id": "pub-miro-venturi-7-725-217"}, {"url": "http://ch.linkedin.com/pub/lisa-d-amato/3/808/267", "id": "pub-lisa-d-amato-3-808-267"}, {"url": "http://www.linkedin.com/pub/june-kaplow-ph-d/2/382/924", "id": "pub-june-kaplow-ph-d-2-382-924"}, {"url": "http://fr.linkedin.com/pub/fabien-schmidlin/b/b73/4b2", "id": "pub-fabien-schmidlin-b-b73-4b2"}, {"url": "http://be.linkedin.com/pub/tine-casneuf/2/563/884", "id": "pub-tine-casneuf-2-563-884"}, {"url": "http://be.linkedin.com/pub/jeroen-aerssens/0/b9a/6ba", "id": "pub-jeroen-aerssens-0-b9a-6ba"}], "specilities": "Biomarkers in Oncology, Cancer Genomics, Molecular Profiling of Cancer, Translational Cancer Research, Early Development Drug Discovery", "events": [{"from": "Sahlgrenska University Hospital", "to": "Memorial Sloan Kettering Cancer Center", "title1": "Research Scientist", "start": 24022, "title2": "Post Doctoral Research Fellow", "end": 24036}, {"from": "Memorial Sloan Kettering Cancer Center", "to": "Columbia University", "title1": "Post Doctoral Research Fellow", "start": 24036, "title2": "Associate Research Scientist", "end": 24079}, {"from": "Columbia University", "to": "Albert Einstein Medical Center", "title1": "Associate Research Scientist", "start": 24079, "title2": "Associate at Dept of Molecular Genetics", "end": 24104}, {"from": "Albert 
Einstein Medical Center", "to": "Johnson and Johnson", "title1": "Associate at Dept of Molecular Genetics", "start": 24104, "title2": "Senior Scientist, Oncology Biomarkers", "end": 24118}]}
{"_id": "in-00006", "interests": "personal genomics, nanotechnology", "education": [{"major": "Biophysics", "end": "2009", "name": "Harvard University", "degree": "Ph.D", "start": "2004", "desc": ""}, {"major": "Computer Science", "end": "2003", "name": "Yale University", "degree": "B.S.", "start": "1999", "desc": ""}], "name": {"family_name": "Douglas", "given_name": "Shawn"}, "overview_html": "<dl id=\"overview\"><dt id=\"overview-summary-current-title\" class=\"summary-current\" style=\"display:block\">\nCurrent\n</dt>\n<dd class=\"summary-current\" style=\"display:block\">\n<ul class=\"current\"><li>\nAssistant Professor\n<span class=\"at\">at </span>\nUCSF\n</li>\n</ul></dd>\n<dt id=\"overview-summary-past-title\" class=\"summary-past\" style=\"display:block\">\nPast\n</dt>\n<dd class=\"summary-past\" style=\"display:block\">\n<ul class=\"past\"><li>\nTechnology Development Fellow\n<span class=\"at\">at </span>\n<a class=\"company-profile-public\" href=\"/company/wyss-institute-for-biologically-inspired-engineering?trk=ppro_cprof\"><span class=\"org summary\">Wyss Institute for Biologically Inspired Engineering</span></a>\n</li>\n</ul></dd>\n<dt id=\"overview-summary-education-title\" class=\"summary-education\" style=\"display:block\">\nEducation\n</dt>\n<dd class=\"summary-education\" style=\"display:block\">\n<ul><li>\nHarvard University\n</li>\n<li>\nYale University\n</li>\n</ul></dd>\n<dt>\nConnections\n</dt>\n<dd class=\"overview-connections\">\n<p>\n<strong>164</strong> connections\n</p>\n</dd>\n<dt class=\"websites\">Websites</dt>\n<dd class=\"websites\">\n<ul><li>\n<a href=\"/redir/redirect?url=http%3A%2F%2Fbionano%2Eucsf%2Eedu%2F&amp;urlhash=JefI\" target=\"_blank\" title=\"New window will open\" name=\"overviewsite\">\nCompany Website\n</a>\n</li>\n<li>\n<a href=\"/redir/redirect?url=http%3A%2F%2Fwww%2Eshawndouglas%2Ecom%2F&amp;urlhash=Loa8\" target=\"_blank\" title=\"New window will open\" name=\"overviewsite\">\nPersonal 
Website\n</a>\n</li>\n<li>\n<a href=\"/redir/redirect?url=http%3A%2F%2Fbiomod%2Enet%2F&amp;urlhash=vQXo\" target=\"_blank\" title=\"New window will open\" name=\"overviewsite\">\nBIOMOD\n</a>\n</li>\n</ul></dd>\n</dl>", "locality": "San Francisco, California", "skills": ["DNA", "Nanotechnology", "Molecular Biology", "Software Development"], "industry": "Research", "interval": 0, "experience": [{"org": "UCSF", "title": "Assistant Professor", "end": "Present", "start": "September 2012"}, {"org": "Wyss Institute for Biologically Inspired Engineering", "title": "Technology Development Fellow", "start": "May 2009"}], "summary": "I am interested in inventing new methods to construct and manipulate biological molecules at the nanometer scale, toward developing new scientific tools and therapeutic devices.", "url": "http://www.linkedin.com/in/00006", "also_view": [{"url": "http://www.linkedin.com/pub/george-church/1/630/2b8", "id": "pub-george-church-1-630-2b8"}, {"url": "http://www.linkedin.com/pub/andrew-hessel/4/4b0/290", "id": "pub-andrew-hessel-4-4b0-290"}, {"url": "http://www.linkedin.com/pub/ayis-antoniou/0/216/630", "id": "pub-ayis-antoniou-0-216-630"}, {"url": "http://uk.linkedin.com/pub/matthew-bellis/35/973/888", "id": "pub-matthew-bellis-35-973-888"}, {"url": "http://www.linkedin.com/pub/john-mulligan-ph-d/7/5a3/5aa", "id": "pub-john-mulligan-ph-d-7-5a3-5aa"}, {"url": "http://www.linkedin.com/pub/yang-mao/38/621/a83", "id": "pub-yang-mao-38-621-a83"}, {"url": "http://www.linkedin.com/pub/sidney-wang/25/3b8/b84", "id": "pub-sidney-wang-25-3b8-b84"}, {"url": "http://www.linkedin.com/pub/yang-mao/9/815/369", "id": "pub-yang-mao-9-815-369"}, {"url": "http://www.linkedin.com/pub/j-markson/32/572/10", "id": "pub-j-markson-32-572-10"}], "homepage": {"BIOMOD": ["http://biomod.net/"], "Company Website": ["http://bionano.ucsf.edu/"], "Personal Website": ["http://www.shawndouglas.com/"]}, "events": [{"from": "Wyss Institute for Biologically Inspired Engineering", "to": 
"UCSF", "title1": "Technology Development Fellow", "start": 24112, "title2": "Assistant Professor", "end": 24152}]}

Recommended answer

Here's code that seems to work with your sample input. As I said in a comment, the file you are dealing with is in something called JSON Lines format rather than JSON format. In JSON Lines, each line is a separate JSON document, so every object has to be followed by a newline when it is written; json.dump() does not add one.
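To see why the original write loop leads to "Extra data", here is a minimal sketch that reproduces the problem in memory. The two tiny profiles and the io.StringIO buffer are stand-ins chosen for illustration, not part of the original data or code.

```python
import io
import json

# Two tiny stand-in profiles (illustrative only).
profiles = [{"industry": "Medical Devices"}, {"industry": "Pharmaceuticals"}]

# json.dump() writes no trailing newline, so consecutive objects run together.
buf = io.StringIO()
for p in profiles:
    json.dump(p, buf)
concatenated = buf.getvalue()

# json.loads() expects exactly one JSON document, so the second object is "extra".
try:
    json.loads(concatenated)
    error_message = None
except json.JSONDecodeError as exc:
    error_message = exc.msg

# Writing a newline after each object yields one JSON document per line.
buf = io.StringIO()
for p in profiles:
    buf.write(json.dumps(p) + "\n")
round_tripped = [json.loads(line) for line in buf.getvalue().splitlines()]

print(error_message)
print(round_tripped == profiles)
```

Without the newline the whole output is a single line holding several objects back to back, which is what the `}{` in the pretty-printed cleaned.json above shows.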

Since you appear to want the cleaned version in that same format (in other words, not converted to standard JSON format, as I thought at one point), here's how to do that:

import json

path_to_file = "sample_input.json"
cleaned_file = "cleaned.json"

# Fields to keep.
fields = ["skills", "industry", "summary", "education", "experience"]

# Clean profiles in JSON Lines format file.
with open(path_to_file, encoding='UTF8') as inf, \
     open(cleaned_file, 'w', encoding='UTF8') as outf:

    for line in inf:
        profile = json.loads(line)  # Read a profile object.
        for key in list(profile.keys()):  # Remove unwanted fields from it.
            if key not in fields:
                del profile[key]
        outf.write(json.dumps(profile) + '\n') # Write cleaned profile to new file

# Test whether it worked.
with open(cleaned_file, encoding='UTF8') as cleaned:
    for line in cleaned:
        profile = json.loads(line)
        print(json.dumps(profile, indent=4))
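If a cleaned.json has already been written without newlines, the objects in it can still be recovered with json.JSONDecoder.raw_decode(), which parses one document and reports where it stopped. The following is only a sketch: the helper name iter_concatenated_json is made up for illustration, and it takes the whole text as a string, so it suits small test files rather than output derived from the full 9 GB input.

```python
import json


def iter_concatenated_json(text):
    """Yield each JSON value from a string of back-to-back JSON documents.

    Hypothetical helper for illustration; it needs the whole text in memory.
    """
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode() does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # Returns the parsed value and the index just past it.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


sample = '{"a": 1}{"b": [2, 3]}\n{"c": "x"}'
recovered = list(iter_concatenated_json(sample))
print(recovered)
```

For the full data set, rerunning the cleaning loop above with the added '\n' is likely the simpler route, since the result can then be read back line by line.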
