Load large JSON file in Python - Error = JSONDecodeError: Extra data


Question

I am trying to load business.json in Python. The file comes from the Yelp academic dataset available for their academic challenge (https://www.yelp.com/dataset/documentation/json). My goal is to extract all restaurants and their IDs, then find the one restaurant I am interested in. Once I have that restaurant's ID, I want to load review.json and extract all reviews for it. Sadly, I am stuck at the initial stage of loading the .json file.

this is what business.json looks like:

{
    // string, 22 character unique string business id
    "business_id": "tnhfDv5Il8EaGSXZGiuQGg",

    // string, the business's name
    "name": "Garaje",

    // string, the neighborhood's name
    "neighborhood": "SoMa",

    // string, the full address of the business
    "address": "475 3rd St",

    // string, the city
    "city": "San Francisco",

    // string, 2 character state code, if applicable
    "state": "CA",

    // string, the postal code
    "postal code": "94107",

    // float, latitude
    "latitude": 37.7817529521,

    // float, longitude
    "longitude": -122.39612197,

    // float, star rating, rounded to half-stars
    "stars": 4.5,

    // integer, number of reviews
    "review_count": 1198,

    // integer, 0 or 1 for closed or open, respectively
    "is_open": 1,

    // object, business attributes to values. note: some attribute values might be objects
    "attributes": {
        "RestaurantsTakeOut": true,
        "BusinessParking": {
            "garage": false,
            "street": true,
            "validated": false,
            "lot": false,
            "valet": false
        },
    },

    // an array of strings of business categories
    "categories": [
        "Mexican",
        "Burgers",
        "Gastropubs"
    ],

    // an object of key day to value hours, hours are using a 24hr clock
    "hours": {
        "Monday": "10:00-21:00",
        "Tuesday": "10:00-21:00",
        "Friday": "10:00-21:00",
        "Wednesday": "10:00-21:00",
        "Thursday": "10:00-21:00",
        "Sunday": "11:00-18:00",
        "Saturday": "10:00-21:00"
    }
}

When I try to import business.json with the following code:

import json

jsonBus = json.loads(open('business.json').read())
for item in jsonBus:
    name = item.get("Name")
    businessID = item.get("business_id")

I get the following error:

runfile('/Users/Nico/Google Drive/Python/yelp/yelp_academic.py', wdir='/Users/Nico/Google Drive/Python/yelp')
Traceback (most recent call last):

  File "<ipython-input-46-68ba9d6458bc>", line 1, in <module>
    runfile('/Users/Nico/Google Drive/Python/yelp/yelp_academic.py', wdir='/Users/Nico/Google Drive/Python/yelp')

  File "/anaconda3/lib/python3.6/site-packages/spyder/utils/site/sitecustomize.py", line 710, in runfile
    execfile(filename, namespace)

  File "/anaconda3/lib/python3.6/site-packages/spyder/utils/site/sitecustomize.py", line 101, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "/Users/Nico/Google Drive/Python/yelp/yelp_academic.py", line 3, in <module>
    jsonBus = json.loads(open('business.json').read())

  File "/anaconda3/lib/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)

  File "/anaconda3/lib/python3.6/json/decoder.py", line 342, in decode
    raise JSONDecodeError("Extra data", s, end)

JSONDecodeError: Extra data

Does anyone know why this error appears?

I am also open to any smarter way to proceed!

Best,

Nico

Recommended answer

If your JSON file looks exactly like what you posted, the comments are a problem (for example, // string, 22 character unique string business id). A JSON file should not contain them, because comments are not part of the JSON standard.
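To illustrate the point about comments, here is a minimal sketch using only Python's standard json module. The sample string is made up for the demonstration and simply mirrors one commented field from the excerpt in the question; it is not taken from the real file.

```python
import json

# Made-up sample: one field from the question's excerpt, with its // comment.
commented = '''{
    // string, the business's name
    "name": "Garaje"
}'''

try:
    json.loads(commented)
    rejected = False
except json.JSONDecodeError as err:
    # The standard parser does not accept // comments.
    rejected = True
    print("json.loads rejected the input:", err.msg)

# The same object without the comment parses normally.
clean = json.loads('{"name": "Garaje"}')
print(clean["name"])
```

If the real file does contain such comments, they would have to be stripped before parsing, or the file regenerated without them.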

Please see a related post here: Can comments be used in JSON?
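A note on the specific message: Python's json module reports "Extra data" when a complete JSON value is followed by more text. That would be consistent with business.json storing one JSON object per line instead of one large array, although the question does not confirm this layout. Under that assumption, a sketch that parses each line separately might look like the following. The helper names `iter_json_lines` and `find_businesses` are hypothetical, and so is the "Restaurants" category label used in the usage note.

```python
import json


def iter_json_lines(path):
    """Yield one parsed object per non-empty line of the file at `path`.

    Assumes the file holds one complete JSON object per line.
    """
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def find_businesses(path, category):
    """Return {business_id: name} for records whose categories mention `category`."""
    matches = {}
    for record in iter_json_lines(path):
        # `categories` may be missing or null; `in` works for a list of
        # strings and also for a single comma-separated string.
        categories = record.get("categories") or []
        if category in categories:
            matches[record.get("business_id")] = record.get("name")
    return matches
```

Calling `find_businesses('business.json', 'Restaurants')` would then return the IDs and names to search through. Because the file is read one line at a time, it never has to fit in memory as a whole, and the same iterator could be pointed at review.json to collect the reviews whose business_id matches the chosen restaurant.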
