How to obtain a master structure for a JSON file?

Problem description

I have a JSON file as follows:

[
    {
        "dog": "lmn",
        "tiger": [
            {
                "bengoltiger": {
                    "height": {
                        "x": 4
                    }
                },
                "indiantiger": {
                    "paw": "a",
                    "foor": "b"
                }
            },
            {
                "bengoltiger": {
                    "width": {
                        "a": 8
                    }
                },
                "indiantiger": {
                    "b": 3
                }
            }
        ]
    },
    {
        "dog": "pqr",
        "tiger": [
            {
                "bengoltiger": {
                    "width": {
                        "m": 3
                    }
                },
                "indiantiger": {
                    "paw": "a",
                    "foor": "b"
                }
            },
            {
                "bengoltiger": {
                    "height": {
                        "n": 8
                    }
                },
                "indiantiger": {
                    "b": 3
                }
            }
        ],
        "lion": 90
    }
]

I want to transform this to obtain all possible properties of any object at any nesting level. For arrays, the first object should contain all the properties. The values themselves do not matter, but the solution below keeps the first value encountered for each property (for example, "lmn" is preserved for the "dog" property). Expected output:

[
    {
        "dog": "lmn",
        "tiger": [
            {
                "bengoltiger": {
                    "height": {
                        "x": 4,
                        "n": 8
                    },
                    "width": {
                        "a": 8,
                        "m": 3
                    }
                },
                "indiantiger": {
                    "paw": "a",
                    "foor": "b",
                    "b": 3
                }
            }
        ],
        "lion": 90
    }
]

Here's a recursive function I tried before this nesting problem struck me

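// Merges the keys of every array element into the first element, drops the
// rest, then recurses. The merge is shallow: a key that already exists on the
// first element is skipped, so differing nested objects are never combined.
function consolidateArray(json) {
    if (Array.isArray(json)) {
      const reference = json[0];
      json.forEach(function(element) {
        for (var key in element) {
          if (!reference.hasOwnProperty(key)) {
            reference[key] = element[key];
          }
        }
      });
      json.splice(1);
      // Call the function directly; `this` is undefined in strict mode and modules.
      consolidateArray(json[0]);
    } else if (typeof json === 'object') {
      for (var key in json) {
        if (json.hasOwnProperty(key)) {
          consolidateArray(json[key]);
        }
      }
    }
  }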
  
var json = [
    {
        "dog": "lmn",
        "tiger": [
            {
                "bengoltiger": {
                    "height": {
                        "x": 4
                    }
                },
                "indiantiger": {
                    "paw": "a",
                    "foor": "b"
                }
            },
            {
                "bengoltiger": {
                    "width": {
                        "a": 8
                    }
                },
                "indiantiger": {
                    "b": 3
                }
            }
        ]
    },
    {
        "dog": "pqr",
        "tiger": [
            {
                "bengoltiger": {
                    "width": {
                        "m": 3
                    }
                },
                "indiantiger": {
                    "paw": "a",
                    "foor": "b"
                }
            },
            {
                "bengoltiger": {
                    "height": {
                        "n": 8
                    }
                },
                "indiantiger": {
                    "b": 3
                }
            }
        ],
        "lion": 90
    }
];
consolidateArray(json);
alert(JSON.stringify(json, null, 2));

Recommended answer

This was an interesting problem. Here's what I came up with:

// Utility functions

const isInt = Number.isInteger

const path = (ps = [], obj = {}) =>
  ps .reduce ((o, p) => (o || {}) [p], obj)

const assoc = (prop, val, obj) => 
  isInt (prop) && Array .isArray (obj)
    ? [... obj .slice (0, prop), val, ...obj .slice (prop + 1)]
    : {...obj, [prop]: val}

const assocPath = ([p = undefined, ...ps], val, obj) => 
  p == undefined
    ? obj
    : ps.length == 0
      ? assoc(p, val, obj)
      // Fall back to a fresh container for a missing node without mutating obj
      : assoc(p, assocPath(ps, val, obj[p] || (isInt(ps[0]) ? [] : {})), obj)


// Helper functions

function * getPaths(o, p = []) {
  if (Object(o) !== o || Object .keys (o) .length == 0) yield p 
  if (Object(o) === o)
    for (let k of Object .keys (o))
      yield * getPaths (o[k], [...p, isInt (Number (k)) ? Number (k) : k])
}

const canonicalPath = (path) =>
  path.map (n => isInt (Number (n)) ? 0 : n)

const splitPaths = (xs) => 
  Object .values ( xs.reduce ( 
    (a, p, _, __, cp = canonicalPath (p), key = cp .join ('\u0000')) => 
      ({...a, [key]: a [key] || {canonical: cp, path: p} })
    , {}
  ))


// Main function

const canonicalRep = (data) => splitPaths ([...getPaths (data)]) 
  .reduce (
    (a, {path:p, canonical}) => assocPath(canonical, path(p, data), a),
    Array.isArray(data) ? [] : {}
  ) 


// Test

const data = [{"dog": "lmn", "tiger": [{"bengoltiger": {"height": {"x": 4}}, "indiantiger": {"foor": "b", "paw": "a"}}, {"bengoltiger": {"width": {"a": 8}}, "indiantiger": {"b": 3}}]}, {"dog": "pqr", "lion": 90, "tiger": [{"bengoltiger": {"width": {"m": 3}}, "indiantiger": {"foor": "b", "paw": "a"}}, {"bengoltiger": {"height": {"n": 8}}, "indiantiger": {"b": 3}}]}]

console .log (
  canonicalRep (data)
)

The first few functions are plain utility functions that I would keep in a system library. They have plenty of uses outside this code:

  • isInt is simply an alias for Number.isInteger.

  • path finds the nested property of an object along a given path:

path(['b', 1, 'c'], {a: 10, b: [{c: 20, d: 30}, {c: 40}], e: 50}) //=> 40 
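
As a small illustration of how path behaves when the path does not exist, here is a sketch that reuses the definition and the sample object from the example above. The extra lookups are made up for illustration; the point is that a missing node gives undefined instead of throwing.

```javascript
// Same path helper as in the answer.
const path = (ps = [], obj = {}) =>
  ps.reduce((o, p) => (o || {})[p], obj)

// Sample object from the example above.
const sample = {a: 10, b: [{c: 20, d: 30}, {c: 40}], e: 50}

console.log(path(['b', 1, 'c'], sample))    // 40
console.log(path(['b', 5, 'c'], sample))    // undefined: index 5 does not exist
console.log(path(['x', 'y', 'z'], sample))  // undefined: no error is thrown
console.log(path([], sample) === sample)    // true: the empty path is the object itself
```

The `(o || {})` guard is what makes the missing-node case safe: once a step yields undefined, every later step reads from an empty object.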

  • assoc returns a new object cloning your original, but with the value of a certain property set to or replaced with the supplied one.

    assoc('c', 42, {a: 1, b: 2, c: 3, d: 4}) //=> {a: 1, b: 2, c: 42, d: 4}
    

    Note that internal objects are shared by reference where possible.
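
To make "shared by reference" concrete, here is a minimal sketch using the assoc definition from the answer. The object named original is hypothetical sample data, not taken from the question.

```javascript
const isInt = Number.isInteger

// Same assoc as in the answer: shallow-clone obj, replacing one property.
const assoc = (prop, val, obj) =>
  isInt(prop) && Array.isArray(obj)
    ? [...obj.slice(0, prop), val, ...obj.slice(prop + 1)]
    : {...obj, [prop]: val}

// Hypothetical sample data.
const original = {a: {deep: [1, 2, 3]}, b: 2}
const updated = assoc('b', 42, original)

console.log(updated === original)      // false: the outer object is new
console.log(updated.a === original.a)  // true: the untouched branch is shared
console.log(original.b)                // 2: the original is unchanged
```

Only the top level is copied, so the cost of an update does not grow with the size of the untouched branches.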

  • assocPath does the same thing, but along a deeper path, building nodes as needed.

    assocPath(['a', 'b', 1, 'c', 'd'], 42, {a: {b: [{x: 1}, {x: 2}], e: 3}})
        //=> {a: {b: [{x: 1}, {c: {d: 42}, x: 2}], e: 3}} 
    

    Except for isInt, these borrow their APIs from Ramda. (Disclaimer: I'm a Ramda author.) But these are unique implementations.

    The next function, getPaths, is an adaptation of one from another SO answer. It lists all the paths in your object in the format used by path and assocPath: each path is an array whose entries are integers where the relevant nested object is an array and strings otherwise. Unlike the function from which it was borrowed, it returns only paths to leaf values.

    For your original object, it returns an iterator for this data:

    [
      [0, "dog"], 
      [0, "tiger", 0, "bengoltiger", "height", "x"], 
      [0, "tiger", 0, "indiantiger", "foor"], 
      [0, "tiger", 0, "indiantiger", "paw"], 
      [0, "tiger", 1, "bengoltiger", "width", "a"], 
      [0, "tiger", 1, "indiantiger", "b"], 
      [1, "dog"], 
      [1, "lion"], 
      [1, "tiger", 0, "bengoltiger", "width", "m"], 
      [1, "tiger", 0, "indiantiger", "foor"], 
      [1, "tiger", 0, "indiantiger", "paw"], 
      [1, "tiger", 1, "bengoltiger", "height", "n"], 
      [1, "tiger", 1, "indiantiger", "b"]
    ]
    

    If I wanted to spend more time on this, I would replace that version of getPaths with a non-generator version, just to keep this code consistent. It shouldn't be hard, but I'm not interested in spending more time on it.
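
For readers who do want the non-generator form, one possible sketch follows. It is not the author's code; it assumes the same leaf rule as the generator (primitives and empty objects or arrays end a path) and uses Array.prototype.flatMap to collect the results.

```javascript
const isInt = Number.isInteger

// Non-generator variant: returns an array of leaf paths instead of an iterator.
const getPaths = (o, p = []) =>
  Object(o) !== o || Object.keys(o).length === 0
    ? [p]  // primitive, null, or empty object/array: the path ends here
    : Object.keys(o).flatMap(k =>
        getPaths(o[k], [...p, isInt(Number(k)) ? Number(k) : k]))

// Hypothetical sample input.
console.log(getPaths({a: [1, {b: 2}], c: {}}))
// [['a', 0], ['a', 1, 'b'], ['c']]
```

Because canonicalRep consumes the result with [...getPaths (data)], an array-returning version like this should slot in without other changes.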

    We can't use those results directly to build your output, since they refer to array elements beyond the first one. That's where splitPaths and its helper canonicalPath come in. We create the canonical paths by replacing all integers with 0, giving us a data structure like this:

    [{
      canonical: [0, "dog"],
      path:      [0, "dog"]
    }, {
      canonical: [0, "tiger", 0, "bengoltiger", "height", "x"],
      path:      [0, "tiger", 0, "bengoltiger", "height", "x"]
    }, {
      canonical: [0, "tiger", 0, "indiantiger", "foor"], 
      path:      [0, "tiger", 0, "indiantiger", "foor"]
    }, {
      canonical: [0, "tiger", 0, "indiantiger", "paw"],
      path:      [0, "tiger", 0, "indiantiger", "paw"]
    }, {
      canonical: [0, "tiger", 0, "bengoltiger", "width", "a"], 
      path:      [0, "tiger", 1, "bengoltiger", "width", "a"]
    }, {
      canonical: [0, "tiger", 0, "indiantiger", "b"], 
      path:      [0, "tiger", 1, "indiantiger", "b"]
    }, {
      canonical: [0, "lion"], 
      path:      [1, "lion"]
    }, {
      canonical: [0, "tiger", 0, "bengoltiger", "width", "m"], 
      path:      [1, "tiger", 0, "bengoltiger", "width", "m"]
    }, {
      canonical: [0, "tiger", 0, "bengoltiger", "height", "n"], 
      path:      [1, "tiger", 1, "bengoltiger", "height", "n"]
    }]
    

    Note that this function also removes duplicate canonical paths. We originally had both [0, "tiger", 0, "indiantiger", "foor"] and [1, "tiger", 0, "indiantiger", "foor"], but the output only contains the first one.
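
The canonicalization and de-duplication steps can be seen in isolation on a very small input. The sketch below restates canonicalPath and splitPaths from the answer (splitPaths written with a block body rather than default parameters, which should be equivalent); the three input paths are a hypothetical subset of the full list.

```javascript
const isInt = Number.isInteger

// Replace every array index with 0 so all elements map onto the first one.
const canonicalPath = (path) =>
  path.map(n => isInt(Number(n)) ? 0 : n)

// Keep the first original path seen for each canonical path.
const splitPaths = (xs) =>
  Object.values(xs.reduce((a, p) => {
    const cp = canonicalPath(p)
    const key = cp.join('\u0000')
    return {...a, [key]: a[key] || {canonical: cp, path: p}}
  }, {}))

const paths = [[0, 'dog'], [1, 'dog'], [1, 'lion']]
console.log(splitPaths(paths))
// [ {canonical: [0, 'dog'],  path: [0, 'dog']},
//   {canonical: [0, 'lion'], path: [1, 'lion']} ]
```

One caveat worth knowing: isInt (Number (n)) is also true for the empty string and for numeric-looking string keys such as "7", so object keys of that form would be treated as array indices.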

    It does this by storing them in an object under a key created by joining the path together with the non-printable character \u0000. This was the easiest way to accomplish the task, but it has an extremely unlikely failure mode [1], so if we really wanted to, we could do more sophisticated duplicate checking. I wouldn't bother.

    Finally, the main function, canonicalRep, builds a representation of your object by calling splitPaths and folding over the result, using canonical to say where to put the new data and applying the path function to the path property and the original object to fetch the value.

    Our final output, as requested, looks like this:

    [
        {
            dog: "lmn",
            lion: 90,
            tiger: [
                {
                    bengoltiger: {
                        height: {
                            n: 8,
                            x: 4
                        },
                        width: {
                            a: 8,
                            m: 3
                        }
                    },
                    indiantiger: {
                        b: 3,
                        foor: "b",
                        paw: "a"
                    }
                }
            ]
        }
    ]
    

    What's fascinating for me is that I saw this as an interesting programming challenge, although I couldn't really imagine any practical uses for it. But now that I've coded it, I realize it will solve a problem in my current project that I'd put aside a few weeks ago. I will probably implement this on Monday!

    Some comments discuss a problem where a subsequent empty value overrides a prior filled value, causing a loss of data.

    This version attempts to alleviate this with the following main function:

    const canonicalRep = (data) => splitPaths ([...getPaths (data)]) 
      .reduce (
        (a, {path: p, canonical}, _, __, val = path(p, data)) => 
          isEmpty(val) && !isEmpty(path(canonical, a))
            ? a
            : assocPath(canonical, val, a),
        Array.isArray(data) ? [] : {}
      ) 
    

    using a simple isEmpty helper function:

    const isEmpty = (x) => 
      x == null || (typeof x == 'object' && Object.keys(x).length == 0) 
    

    You might want to update or expand this helper in various ways.
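
As one example of such an extension, the sketch below adds a hypothetical isEmptyOrBlank variant that also treats blank strings as empty. Whether blank strings should count as empty depends on your data, so treat this as an assumption rather than a recommendation.

```javascript
// isEmpty as defined in the answer.
const isEmpty = (x) =>
  x == null || (typeof x == 'object' && Object.keys(x).length == 0)

// Hypothetical stricter variant: blank strings also count as empty.
const isEmptyOrBlank = (x) =>
  isEmpty(x) || (typeof x == 'string' && x.trim() === '')

console.log(isEmpty([]), isEmpty({}), isEmpty(null))  // true true true
console.log(isEmpty(0), isEmpty(''), isEmpty(false))  // false false false
console.log(isEmptyOrBlank('   '))                    // true
console.log(isEmpty(new Date()))                      // true: a Date has no own enumerable keys
```

The last line shows a case to watch for: any object without own enumerable keys, such as a Date, is reported as empty by the original helper.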

    My first pass worked fine with the alternate data supplied, but not when I switched the two entries in the outer array. I fixed that, and also made sure that an empty value is kept if it's not overridden with actual data (that's the z property in my test object.)

    I believe this snippet solves the original problem and the new one:

    // Utility functions
    const isInt = Number.isInteger
    
    const path = (ps = [], obj = {}) =>
      ps .reduce ((o, p) => (o || {}) [p], obj)
    
    const assoc = (prop, val, obj) => 
      isInt (prop) && Array .isArray (obj)
        ? [... obj .slice (0, prop), val, ...obj .slice (prop + 1)]
        : {...obj, [prop]: val}
    
    const assocPath = ([p = undefined, ...ps], val, obj) => 
      p == undefined
        ? obj
        : ps.length == 0
          ? assoc(p, val, obj)
          // Fall back to a fresh container for a missing node without mutating obj
          : assoc(p, assocPath(ps, val, obj[p] || (isInt(ps[0]) ? [] : {})), obj)
    
    const isEmpty = (x) => 
      x == null || (typeof x == 'object' && Object.keys(x).length == 0) 
    
    function * getPaths(o, p = []) {
      if (Object(o) !== o || Object .keys (o) .length == 0) yield p 
      if (Object(o) === o)
        for (let k of Object .keys (o))
          yield * getPaths (o[k], [...p, isInt (Number (k)) ? Number (k) : k])
    }
    
    
    // Helper functions
    const canonicalPath = (path) =>
      path.map (n => isInt (Number (n)) ? 0 : n)
    
    const splitPaths = (xs) => 
      Object .values ( xs.reduce ( 
        (a, p, _, __, cp = canonicalPath (p), key = cp .join ('\u0000')) => 
          ({...a, [key]: a [key] || {canonical: cp, path: p} })
        , {}
      ))
    
    // Main function
    const canonicalRep = (data) => splitPaths ([...getPaths (data)]) 
      .reduce (
        (a, {path: p, canonical}, _, __, val = path(p, data)) => 
          isEmpty(val) && !isEmpty(path(canonical, a))
            ? a
            : assocPath(canonical, val, a),
        Array.isArray(data) ? [] : {}
      ) 
    
    
    // Test data
    const data1 = [{"dog": "lmn", "tiger": [{"bengoltiger": {"height": {"x": 4}}, "indiantiger": {"foor": "b", "paw": "a"}}, {"bengoltiger": {"width": {"a": 8}}, "indiantiger": {"b": 3}}]}, {"dog": "pqr", "lion": 90, "tiger": [{"bengoltiger": {"width": {"m": 3}}, "indiantiger": {"foor": "b", "paw": "a"}}, {"bengoltiger": {"height": {"n": 8}}, "indiantiger": {"b": 3}}]}]
    const data2 = [{"d": "Foreign Trade: Export/Import: Header Data", "a": "false", "f": [{"g": "TRANSPORT_MODE", "i": "2"}, {"k": "System.String", "h": "6"}], "l": "true"}, {"a": "false", "f": [], "l": "false", "z": []}]
    const data3 = [data2[1], data2[0]]
    
    // Demo
    console .log (canonicalRep (data1))
    console .log (canonicalRep (data2))
    console .log (canonicalRep (data3))

    This update grew out of discussion after I rejected an edit attempt to do the same sort of empty-checking inside assoc. I rejected that as too far removed from the original attempt. When I learned what it was supposed to do, I knew that what had to be changed was canonicalRep or one of its immediate helper functions.

    The rationale is simple. assoc is a general-purpose utility function designed to do a shallow clone of an object, changing the named property to the new value. This should not have complex logic regarding whether the value is empty. It should remain simple.

    By introducing the isEmpty helper function, we can do all this with only a minor tweak to canonicalRep.

    [1] That failure mode could happen if you had certain nodes containing that separator, \u0000. For instance, if you had paths [...nodes, "abc\u0000", "def", ...nodes] and [...nodes, "abc", "\u0000def", ...nodes], they would both map to "...abc\u0000\u0000def...". If this is a real concern, we could certainly use other forms of deduplication.
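
The collision described in this note, and one possible alternative, can be sketched as follows. The name splitPathsSafe is hypothetical, and keying on JSON.stringify is just one option among several; it is not part of the original answer.

```javascript
const isInt = Number.isInteger
const canonicalPath = (path) => path.map(n => isInt(Number(n)) ? 0 : n)

// Two different paths that collide when joined with \u0000.
const a = ['abc\u0000', 'def']
const b = ['abc', '\u0000def']
console.log(a.join('\u0000') === b.join('\u0000'))    // true: same key
console.log(JSON.stringify(a) === JSON.stringify(b))  // false: distinct keys

// Hypothetical variant of splitPaths keyed on the JSON form of the path.
const splitPathsSafe = (xs) => {
  const seen = new Map()
  for (const p of xs) {
    const cp = canonicalPath(p)
    const key = JSON.stringify(cp)
    if (!seen.has(key)) seen.set(key, {canonical: cp, path: p})  // first one wins
  }
  return [...seen.values()]
}

console.log(splitPathsSafe([a, b]).length)  // 2: both paths are kept
```

JSON.stringify escapes the separator and quotes each segment, so segment boundaries stay unambiguous, at the cost of slightly longer keys.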
