Unable to parse JSON file, keep getting ValueError: Extra data

Views: 203 | Posted: 2019/11/26 21:56:53 | Tags: python, json, python-2.7

Problem description

So, leading on from my prior issue [found here][1], I'm attempting to parse a JSON file that I've managed to download with @SiHa's help. The JSON is structured like so:

{"properties": [{"property": "name", "value": "A random company name"}, {"property": "companyId", "value": 123456789}]}{"properties": [{"property": "name", "value": "Another random company name"}, {"property": "companyId", "value": 31415999}]}{"properties": [{"property": "name", "value": "Yet another random company"}, {"property": "companyId", "value": 10101010}]}

I've been able to get this by slightly modifying @SiHa's code:

def get_companies():
            create_get_recent_companies_call = "https://api.hubapi.com/companies/v2/companies/?hapikey={hapikey}".format(hapikey=wta_hubspot_api_key)
            headers = {'content-type': 'application/json'}
            create_get_recent_companies_response = requests.get(create_get_recent_companies_call, headers=headers)
            if create_get_recent_companies_response.status_code == 200:
                while True:
                    for i in create_get_recent_companies_response.json()[u'companies']:

                        all_the_companies = { "properties": [
                                                    { "property": "name", "value": i[u'properties'][u'name'][u'value'] },
                                                    { "property": "companyId", "value": i[u'companyId'] }
                                                ]
                                            }

                        with open("all_the_companies.json", "a") as myfile:
                            myfile.write(json.dumps(all_the_companies))
                        #print(companyProperties)
                    offset = create_get_recent_companies_response.json()[u'offset']
                    hasMore = create_get_recent_companies_response.json()[u'has-more']
                    if not hasMore:
                        break
                    else:
                        create_get_recent_companies_call = "https://api.hubapi.com/companies/v2/companies/?hapikey={hapikey}&offset={offset}".format(hapikey=wta_hubspot_api_key, offset=offset)
                        create_get_recent_companies_response = requests.get(create_get_recent_companies_call, headers=headers)


            else:
                print("Something went wrong, check the supplied field values.\n")
                print(json.dumps(create_get_recent_companies_response.json(), sort_keys=True, indent=4))
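
The loop above appends each json.dumps(...) result with no separator, which is why the objects in the file run together instead of forming a single JSON document. A minimal sketch of one alternative, assuming it is acceptable to store one object per line; the helper names append_json_lines and read_json_lines are hypothetical and not part of the original script:

```python
import json


def append_json_lines(path, records):
    """Append each record as its own line of JSON (hypothetical helper)."""
    with open(path, "a") as handle:
        for record in records:
            # The trailing newline is the separator the original loop lacks.
            handle.write(json.dumps(record) + "\n")


def read_json_lines(path):
    """Parse a file written by append_json_lines, skipping blank lines."""
    with open(path) as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

Each line is then a complete JSON document, so it can be parsed with json.loads one line at a time rather than with a single json.load over the whole file.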

So that was part one. Now I'm trying to use the code below to extract two things: 1) the name and 2) the companyId.

#!/usr/bin/python
# -*- coding: utf-8 -*-

import sys
import os.path
import requests
import json
import csv
import glob2
import shutil
import time
import time as howLong
from time import sleep
from time import gmtime, strftime

# Local Testing Version
findCSV = glob2.glob('*contact*.csv')

theDate = time=strftime("%Y-%m-%d", gmtime())
theTime = time=strftime("%H:%M:%S", gmtime())

# Exception handling
try:
    testData = findCSV[0]
except IndexError:
    print ("\nSyncronisation attempted on {date} at {time}: There are no \"contact\" CSVs, please upload one and try again.\n").format(date=theDate, time=theTime)
    print("====================================================================================================================\n")
    sys.exit()

for theCSV in findCSV:

    def process_companies():
        with open('all_the_companies.json') as data_file:
            data = json.load(data_file)
            for i in data:
                company_name = data[i][u'name']
                #print(company_name)
                if row[0].lower() == company_name.lower():
                    contact_company_id = data[i][u'companyId']
                    #print(contact_company_id)
                    return contact_company_id

                else:
                    print("Something went wrong, check the \"get_companies()\" function.\n")
                    print(json.dumps(create_get_recent_companies_response.json(), sort_keys=True, indent=4))

    if __name__ == "__main__":
        start_time = howLong.time()
        process_companies()
        print("This operation took %s seconds.\n" % (howLong.time() - start_time))
        sys.exit()

Unfortunately, it's not working. I'm getting the following traceback:

Traceback (most recent call last):
  File "wta_parse_json.py", line 62, in <module>
    process_companies()
  File "wta_parse_json.py", line 47, in process_companies
    data = json.load(data_file)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 290, in load
    **kw)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 369, in decode
    raise ValueError(errmsg("Extra data", s, end, len(s)))
ValueError: Extra data: line 1 column 130 - line 1 column 1455831 (char 129 - 1455830)

I've made sure that I'm using json.dumps, not json.dump, when writing the file, but it's still not working. :(
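
For context on the error itself: json.load expects the file to hold exactly one JSON document, while the sample shown earlier holds several objects written back to back, so the decoder reports "Extra data" as soon as it finds content after the first object. A minimal sketch of reading such a file as it stands, using json.JSONDecoder.raw_decode from the standard library; the function names iter_concatenated_json and find_company_id are hypothetical, and the record layout is assumed to match the sample above:

```python
import json


def iter_concatenated_json(text):
    """Yield each JSON value from a string of back-to-back JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # raw_decode returns the parsed value and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def find_company_id(text, company_name):
    """Return the companyId whose name matches case-insensitively, else None."""
    wanted = company_name.lower()
    for record in iter_concatenated_json(text):
        # Flatten the list of {"property": ..., "value": ...} pairs into a dict.
        props = {p["property"]: p["value"] for p in record["properties"]}
        if str(props.get("name", "")).lower() == wanted:
            return props.get("companyId")
    return None
```

With the file contents read into a string, find_company_id(contents, row[0]) would stand in for the lookup that process_companies() attempts. Note that the records are a list of property/value pairs, so indexing them directly with u'name' or u'companyId' would not work even once the file parses.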

I've now given up on JSON and am trying to export a simple CSV with the code below:

    def get_companies():
            create_get_recent_companies_call = "https://api.hubapi.com/companies/v2/companies/?hapikey={hapikey}".format(hapikey=wta_hubspot_api_key)
            headers = {'content-type': 'application/json'}
            create_get_recent_companies_response = requests.get(create_get_recent_companies_call, headers=headers)
            if create_get_recent_companies_response.status_code == 200:
                while True:
                    for i in create_get_recent_companies_response.json()[u'companies']:

                        all_the_companies = "{name},{id}\n".format(name=i[u'properties'][u'name'][u'value'], id=i[u'companyId'])
                        all_the_companies.encode('utf-8')

                        with open("all_the_companies.csv", "a") as myfile:
                            myfile.write(all_the_companies)
                        #print(companyProperties)
                    offset = create_get_recent_companies_response.json()[u'offset']
                    hasMore = create_get_recent_companies_response.json()[u'has-more']
                    if not hasMore:
                        break
                    else:
                        create_get_recent_companies_call = "https://api.hubapi.com/companies/v2/companies/?hapikey={hapikey}&offset={offset}".format(hapikey=wta_hubspot_api_key, offset=offset)
                        create_get_recent_companies_response = requests.get(create_get_recent_companies_call, headers=headers)

[1]: http://stackoverflow.com/questions/36148346/unable-to-loop-through-paged-api-responses-with-python

But it looks like this isn't right either. Even though I've read up on the formatting issues and have added the .encode('utf-8') call, I still end up getting the following traceback:

Traceback (most recent call last):
  File "wta_get_companies.py", line 78, in <module>
    get_companies()
  File "wta_get_companies.py", line 57, in get_companies
    all_the_companies = "{name},{id}\n".format(name=i[u'properties'][u'name'][u'value'], id=i[u'companyId'])
UnicodeEncodeError: 'ascii' codec can't encode character u'\ufffd' in position 3: ordinal not in range(128)
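
Two details of the CSV attempt are worth noting. In Python 2.7, calling .format() on a byte string with a unicode argument implicitly encodes that argument as ASCII, which is where the UnicodeEncodeError comes from. Separately, encode returns a new string rather than changing the existing one, so the bare all_the_companies.encode('utf-8') line has no effect. A minimal sketch of one way around both, assuming the rows are available as (name, company_id) pairs; append_company_rows is a hypothetical helper, and the code is written for Python 3 but should also run on 2.7:

```python
import io


def append_company_rows(path, companies):
    """Append 'name,id' lines as UTF-8 text (hypothetical helper).

    `companies` is an iterable of (name, company_id) pairs.
    """
    # io.open with an explicit encoding handles the UTF-8 conversion on write.
    with io.open(path, "a", encoding="utf-8") as handle:
        for name, company_id in companies:
            # A unicode format string keeps the whole operation in text,
            # so no implicit ASCII encoding takes place.
            handle.write(u"{name},{id}\n".format(name=name, id=company_id))
```

This keeps the plain name,id layout from the original code, so a company name that itself contains a comma would still need quoting.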
