JSON路径父对象或等效的MongoDB查询 [英] JSON path parent object, or equivalent MongoDB query

查看:266
本文介绍了JSON路径父对象或等效的MongoDB查询的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在选择JSON输入中的节点,但找不到一种方法来为我查询的每个数组条目包括父对象详细信息.我正在使用pentaho数据集成,以通过mongodb输入的JSON输入来查询数据.

我还试图创建一个mongodb查询来实现相同的目的,但似乎也无法做到这一点.

以下是显示数据的两个字段/路径:

$.size_break_costs [*].size $ .size_break_costs [*].quantity

这是json源格式:

{
"_id" : ObjectId("4f1f74ecde074f383a00000f"),
"colour" : "RAVEN-SMOKE",
"name" : "Authority",
"size_break_costs" : [
    {
        "quantity" : NumberLong("80"),
        "_id" : ObjectId("518ffc0697eee36ff3000002"),
        "size" : "S"
    },
    {
        "quantity" : NumberLong("14"),
        "_id" : ObjectId("518ffc0697eee36ff3000003"),
        "size" : "M"
    },
    {
        "quantity" : NumberLong("55"),
        "_id" : ObjectId("518ffc0697eee36ff3000004"),
        "size" : "L"
    }
],
"sku" : "SK3579"
}

我目前得到以下结果:

    S,80 
    M,14 
    L,55

我想获取SKU和名称,以及我的货源将有多个产品(SKU/描述):

    SK3579,Authority,S,80
    SK3579,Authority,M,14
    SK3579,Authority,L,55

当我尝试包括使用$ .sku时,我出现了处理错误.

我得到的最终结果是所有产品及其各种尺寸的可用数量的报告.可能有一个替代的mongodb查询可以提供此功能.

问题似乎可能是由于并非所有行都具有相同的结构.例如,上面包含3种尺寸-S,M,L.有些产品有两种尺寸-PACK.其他则有多种尺寸-28、30、32、33、34、36、38等.

产生的错误是:

*资源内部的数据结构不同!我们找到了json路径[$ .sku]的1个值,这与路径[$ .size_break_costs [].quantity]所遍历的数字不同(7个值).我们必须为所有路径使用相同数量的值.

我已经分别尝试了以下mongodb查询,该查询给出了正确结果,但是该方法的相应导出无效.没有返回大小和数量"的值.

查询:

db.product_details.find( {}, {sku: true, "size_break_costs.size": true, "size_break_costs.quantity": true}).pretty();

导出:

mongoexport --db brandscope_production --collection product_details --csv --out Test01.csv --fields sku,"size_break_costs.size","size_break_costs.quantity" --query '{}';

解决方案

添加赏金后不久,我便找到了解决方案.我的问题具有相同的基本结构,即父标识符,还有一些N个用于评级(质量,值等)的子键/值对.

首先,您需要一个JSON Input步骤,该步骤将SKU,Name和size_break_costs数组全部获取为字符串.重要的部分是size_break_costs是一个字符串,并且基本上只是一个字符串化的JSON数组.确保在"JSON输入"的内容"选项卡下,选中忽略缺少的路径",以防您得到一个空数组或字段由于某种原因而丢失的情况.

对于您的字段,请使用:

Name           | Path               | Type
ProductSKU     | $.sku              | String
ProductName    | $.name             | String
SizeBreakCosts | $.size_break_costs | String

在此步骤之后,我添加了一个过滤器行"块,条件为"SizeBreakCosts IS NOT NULL",然后将其传递给第二个JSON Input块.在第二个JSON块中,您需要检查源是在字段中定义的?",并将从字段中获取源"的值设置为"SizeBreakCosts",或在第一个JSON Input块中将其命名为任何值. /p>

同样,请确保已选中忽略缺少的路径"以及忽略空文件".从该块,我们将要获得两个字段.我们已经有了传入的每一行的ProductSKU和ProductName,第二个JSON Input步骤将进一步将其拆分为SizeBreakCosts输入JSON中的许多行.对于字段,请使用:

Name     | Path           | Type
Quantity | $.[*].quantity | Integer
Size     | $.[*].size     | String

如您所见,这些路径使用"$.[*].FieldName",因为我们传入的JSON字符串具有一个数组作为根项,因此我们获取该数组中的每个项并进行解析它的数量和大小.

现在,每一行都应具有父对象的SKU和名称,以及每个子对象的数量和大小.将这个示例转储到文本文件中,我得到了:

ProductSKU;ProductName;Size;Quantity
SK3579;Authority;S; 80
SK3579;Authority;M; 14
SK3579;Authority;L; 55

I am selecting nodes in a JSON input but can't find a way to include parent object detail for each array entry that I am querying. I am using pentaho data integration to query the data using JSON input form a mongodb input.

I have also tried to create a mongodb query to achieve the same but cannot seem to do this either.

Here are the two fields/paths that display the data:

$.size_break_costs[*].size $.size_break_costs[*].quantity

Here is the json source format:

{
"_id" : ObjectId("4f1f74ecde074f383a00000f"),
"colour" : "RAVEN-SMOKE",
"name" : "Authority",
"size_break_costs" : [
    {
        "quantity" : NumberLong("80"),
        "_id" : ObjectId("518ffc0697eee36ff3000002"),
        "size" : "S"
    },
    {
        "quantity" : NumberLong("14"),
        "_id" : ObjectId("518ffc0697eee36ff3000003"),
        "size" : "M"
    },
    {
        "quantity" : NumberLong("55"),
        "_id" : ObjectId("518ffc0697eee36ff3000004"),
        "size" : "L"
    }
],
"sku" : "SK3579"
}

I currently get the following results:

    S,80 
    M,14 
    L,55

I would like to get the SKU and Name as well as my source will have multiple products (SKU/Description):

    SK3579,Authority,S,80
    SK3579,Authority,M,14
    SK3579,Authority,L,55

When I try To include using $.sku, I the process errors.

The end result i'm after is a report of all products and the available quantities of their various sizes. Possibly there's an alternative mongodb query that provides this.

EDIT:

It seems the issue may be due to the fact that not all lines have the same structure. For example the above contains 3 sizes - S,M,L. Some products come in one size - PACK. Other come in multiple sizes - 28,30,32,33,34,36,38 etc.

The error produced is:

*The data structure is not the same inside the resource! We found 1 values for json path [$.sku], which is different that the number retourned for path [$.size_break_costs[].quantity] (7 values). We MUST have the same number of values for all paths.

I have tried the following mongodb query separately which gives the correct results, but the corresponding export of this doesn't work. No values are returned for the Size and Quantity.

Query:

db.product_details.find( {}, {sku: true, "size_break_costs.size": true, "size_break_costs.quantity": true}).pretty();

Export:

mongoexport --db brandscope_production --collection product_details --csv --out Test01.csv --fields sku,"size_break_costs.size","size_break_costs.quantity" --query '{}';

解决方案

Shortly after I added my own bounty, I figured out the solution. My problem has the same basic structure, which is a parent identifier, and some number N child key/value pairs for ratings (quality, value, etc...).

First, you'll need a JSON Input step that gets the SKU, Name, and size_break_costs array, all as Strings. The important part is that size_break_costs is a String, and is basically just a stringified JSON array. Make sure that under the Content tab of the JSON Input, that "Ignore missing path" is checked, in case you get one with an empty array or the field is missing for some reason.

For your fields, use:

Name           | Path               | Type
ProductSKU     | $.sku              | String
ProductName    | $.name             | String
SizeBreakCosts | $.size_break_costs | String

I added a "Filter rows" block after this step, with the condition "SizeBreakCosts IS NOT NULL", which is then passed to a second JSON Input block. This second JSON block, you'll need to check "Source is defined in a field?", and set the value of "Get source from field" to "SizeBreakCosts", or whatever you named it in the first JSON Input block.

Again, make sure "Ignore missing path" is checked, as well as "Ignore empty file". From this block, we'll want to get two fields. We'll already have ProductSKU and ProductName with each row that's passed in, and this second JSON Input step will further split it into however many rows are in the SizeBreakCosts input JSON. For fields, use:

Name     | Path           | Type
Quantity | $.[*].quantity | Integer
Size     | $.[*].size     | String

As you can see, these paths use "$.[*].FieldName", because the JSON string we passed in has an array as the root item, so we're getting every item in that array, and parsing out its quantity and size.

Now every row should have the SKU and name from the parent object, and the quantity and size from each child object. Dumping this example to a text file, I got:

ProductSKU;ProductName;Size;Quantity
SK3579;Authority;S; 80
SK3579;Authority;M; 14
SK3579;Authority;L; 55

这篇关于JSON路径父对象或等效的MongoDB查询的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆