How to parse only specific objects without deserializing the whole JSON file?

Question

I have a huge JSON file (tens of thousands of objects, >100 MB) that I'm trying to parse in order to extract specific objects. Because the file is so big, I'd like to deserialize only the specific part I need (if that's possible) without having to deserialize the whole file.

The object should be found based on the value of a specific property, "arena_id":xxxxx, which every object contains. The objects are formatted like this (stripped-down version):

{"object":"card","id":"61a908e8-6952-46c0-94ec-3962b7a4caef","oracle_id":"e70f5520-1b9c-4351-8484-30f0dc692e01","multiverse_ids":[460007],"mtgo_id":71000,"arena_id":69421}

In order to deserialize the whole file I wrote the following code:

public static RootObject GetCardFromBulkScryfall()
{
    string s = null;

    using (StreamReader streamReader = new StreamReader(Path.Combine(GetAppDataPath(), @"scryfall-default-cards.json")))
    {
        s = streamReader.ReadToEnd();
    }

    RootObject card = JsonConvert.DeserializeObject<RootObject>(s);

    return card;
}

I'm not even sure whether what I'm trying to do is possible. If it isn't, what is the best approach to handling a file this big without having to deserialize all of it?

Recommended answer

Use JsonTextReader together with JsonTextWriter to enumerate the objects one at a time, then deserialize only those whose property has the required value.

On my PC, this code uses about 16 MB of memory while working through a 112 MB JSON file.

Let me know if you have questions or need fixes.

using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading;

namespace ConsoleApp1
{
    class Program
    {
        static void Main(string[] args)
        {
            try
            {
                string jsonFilePath = "1.json";

                string propName = "arena_id";

                RootObject[] objects = SearchObjectsWithProperty<RootObject, int>(jsonFilePath, propName, 69421, CancellationToken.None).ToArray();

                System.Diagnostics.Debugger.Break();
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
                System.Diagnostics.Debugger.Break();
            }
        }

        static IEnumerable<T> SearchObjectsWithProperty<T, V>(string jsonFilePath, string propName, V propValue, CancellationToken cancellationToken) where V : IEquatable<V>
        {
            using (TextReader tr = File.OpenText(jsonFilePath))
            {
                using (JsonTextReader jr = new JsonTextReader(tr))
                {
                    StringBuilder currentObjectJson = new StringBuilder();

                    while (jr.Read())
                    {
                        cancellationToken.ThrowIfCancellationRequested();                        

                        if (jr.TokenType == JsonToken.StartObject)
                        {
                            currentObjectJson.Clear();

                            using (TextWriter tw = new StringWriter(currentObjectJson))
                            {
                                using (JsonTextWriter jw = new JsonTextWriter(tw))
                                {
                                    jw.WriteToken(jr);

                                    string currObjJson = currentObjectJson.ToString();

                                    JObject obj = JObject.Parse(currObjJson);

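                                    // Skip objects that lack the property or hold a JSON null there,
                                    // which would otherwise throw when converting to V.
                                    JToken value = obj[propName];

                                    if (value != null && value.Type != JTokenType.Null && value.ToObject<V>().Equals(propValue))
                                        yield return obj.ToObject<T>();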
                                }
                            }
                        }
                    }
                }
            }
        }
    }

    public class RootObject
    {
        public string @object { get; set; }
        public string id { get; set; }
        public string oracle_id { get; set; }
        public int[] multiverse_ids { get; set; }
        public int mtgo_id { get; set; }
        public int arena_id { get; set; }
    }
}
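
As a possible variation on the code above (not part of the original answer), here is a minimal sketch that skips the intermediate StringBuilder and JsonTextWriter by letting JObject.Load read one object at a time straight from the JsonTextReader. The class name StreamingSearch and the method name FindByProperty are hypothetical. The sketch assumes the file's root is a JSON array of objects, as in the question; with an object at the root, the first JObject.Load call would read the whole document.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

// Hypothetical helper class, named here for illustration only.
public static class StreamingSearch
{
    // Lazily yields every object whose property `propName` equals `propValue`.
    // Assumes the root of the JSON document is an array of objects.
    public static IEnumerable<T> FindByProperty<T, V>(TextReader source, string propName, V propValue)
        where V : IEquatable<V>
    {
        // JsonTextReader closes the underlying TextReader when disposed (its default behavior).
        using (var reader = new JsonTextReader(source))
        {
            while (reader.Read())
            {
                if (reader.TokenType != JsonToken.StartObject)
                    continue;

                // Loads only the current object, including any nested objects;
                // the reader is left on that object's closing token.
                JObject obj = JObject.Load(reader);

                // Skip objects without the property or with a JSON null value.
                JToken value = obj[propName];
                if (value == null || value.Type == JTokenType.Null)
                    continue;

                if (value.ToObject<V>().Equals(propValue))
                    yield return obj.ToObject<T>();
            }
        }
    }
}
```

With the RootObject class from the answer, a call might look like StreamingSearch.FindByProperty<RootObject, int>(File.OpenText("1.json"), "arena_id", 69421). Only one object is held in memory at a time, so memory use should stay in the same range as the original approach, though that has not been measured here.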
