Pandas read Json - Trailing Data


Problem description

I am trying to read a large JSON file with pandas' pd.read_json, but it raises an error: ValueError: Trailing data

My research so far has not turned up a solution, so I would like to ask for your help. I ran the file through a JSON validator; its output is below.

How can I solve this? Thanks.

Recommended answer

The error message you presented contains the precise location of the source of the problem:

(At line #191), (At position #1)

Look at the indicated place in your JSON file.

A weird detail in your file is that the comma sits at the beginning of line 191 rather than after "}" in line 190, but I'm not sure whether this is actually a problem.

Attempt a "partial read" that leaves out the object starting at line 191.
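
One way to try such a partial read without editing the file by hand is to let the standard library's json.JSONDecoder.raw_decode parse only the first top-level value and report where it stopped. This is a minimal sketch rather than part of the original answer; the helper name first_object and the sample text are hypothetical, and the sample only imitates the layout described in the question.

```python
import json


def first_object(text):
    """Parse only the first top-level JSON value in `text`.

    Returns the parsed value and the unparsed remainder, i.e. the
    "trailing data" that a strict parser complains about.
    """
    decoder = json.JSONDecoder()
    stripped = text.lstrip()  # raw_decode does not skip leading whitespace
    value, end = decoder.raw_decode(stripped)
    return value, stripped[end:]


# Hypothetical sample: the comma sits at the start of the line
# that follows the closing brace, as in the question.
sample = '{"aa": "aa1", "bb": "bb1"}\n,{"aa": "aa2", "bb": "bb2"}'
value, rest = first_object(sample)
print(value)       # the first object, parsed on its own
print(repr(rest))  # everything that follows it
```

If the first value parses cleanly and the remainder is not empty, the file holds more than one top-level value.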

One more detail to check: if the "}" in line 190 terminates all of the content above it, then:

  • your input file contains multiple JSON objects at the main level,
  • you should probably enclose the whole file in "[" and "]", so that the whole file becomes a list of objects.
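
If editing a large file by hand is impractical, the wrapping can also be done in memory. The sketch below assumes the objects are already separated by commas; load_wrapped and the sample text are hypothetical names used only for illustration.

```python
import json


def load_wrapped(text):
    """Wrap comma-separated top-level objects in [ ] and parse them as one list."""
    body = text.strip().rstrip(",")  # tolerate a stray comma after the last object
    return json.loads("[" + body + "]")


sample = '''
{
  "aa" : "aa1",
  "bb" : "bb1"
},
{
  "aa" : "aa2",
  "bb" : "bb2"
}
'''
records = load_wrapped(sample)
print(records)
```

The resulting list of dictionaries can then be passed to pandas.DataFrame(records) to build the table.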

I ran the following experiment:

The input file contains:

{
  "aa" : "aa1",
  "bb" : "bb1"
},
{
  "aa" : "aa2",
  "bb" : "bb2"
}

(2 JSON objects at the main level).

Then pd.read_json('Input.json') raises ValueError: Trailing data.

But when I changed the input file to:

[
  {
    "aa" : "aa1",
    "bb" : "bb1"
  },
  {
    "aa" : "aa2",
    "bb" : "bb2"
  }
]

(a list of 2 JSON objects), I got a proper result:

    aa   bb
0  aa1  bb1
1  aa2  bb2
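
The experiment can be repeated without creating files by feeding the text to pd.read_json through io.StringIO. This is only a sketch of the comparison above; the variable names are illustrative, and the exact wording of the exception may differ between pandas versions.

```python
from io import StringIO

import pandas as pd

two_objects = '{"aa": "aa1", "bb": "bb1"},\n{"aa": "aa2", "bb": "bb2"}'
as_list = "[" + two_objects + "]"

# Two objects at the main level: pandas rejects the text.
try:
    pd.read_json(StringIO(two_objects))
    error = None
except ValueError as exc:
    error = str(exc)
print(error)

# The same objects wrapped in a list: one row per object.
df = pd.read_json(StringIO(as_list))
print(df)
```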

Look at your input file; maybe in your case the problem is just like the one I showed.

Another experiment

The input file contains:

{ "aa" : "aa1", "bb" : "bb1" }
{ "aa" : "aa2", "bb" : "bb2" }
{ "aa" : "aa3", "bb" : "bb3" }

i.e. separate objects, one per line, without surrounding "[" and "]" and, as in the sample above, without a comma after each object.

You can read it by calling pd.read_json('Input.json', lines=True).
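
A minimal sketch of that call, using io.StringIO in place of Input.json so that it runs on its own (the variable names are illustrative):

```python
from io import StringIO

import pandas as pd

json_lines = (
    '{ "aa" : "aa1", "bb" : "bb1" }\n'
    '{ "aa" : "aa2", "bb" : "bb2" }\n'
    '{ "aa" : "aa3", "bb" : "bb3" }\n'
)

# lines=True tells pandas to treat every line as a separate JSON object.
df_lines = pd.read_json(StringIO(json_lines), lines=True)
print(df_lines)
```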

But the limitation here is that each line must contain a complete JSON object, so in your case it is rather useless.
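
When the objects span several lines, one possible workaround (not part of the original answer) is to decode them one at a time with json.JSONDecoder.raw_decode from the standard library. The sketch below assumes the top-level values are separated only by whitespace and optional commas; iter_objects and the sample text are hypothetical.

```python
import json


def iter_objects(text):
    """Yield each top-level JSON value found in `text`, in order.

    Values may span several lines and may be separated by whitespace
    and, optionally, a comma.
    """
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        # Skip whitespace and separating commas before the next value.
        while pos < len(text) and text[pos] in " \t\r\n,":
            pos += 1
        if pos >= len(text):
            return
        value, pos = decoder.raw_decode(text, pos)
        yield value


sample = '''
{
  "aa" : "aa1",
  "bb" : "bb1"
}
,{
  "aa" : "aa2",
  "bb" : "bb2"
}
'''
records = list(iter_objects(sample))
print(records)
```

As before, the list of dictionaries can be handed to pandas.DataFrame(records). Note that this reads the whole text into memory, which may matter for a very large file.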
