Get looked up array count for a document

Views: 43 Published: 2021/6/3 20:43:13 Tags: php mongodb


Problem description

I have two collections: words and phrases. Each word document has an array of phrase ids, and each phrase can be active or inactive.

For example:

words:
{"word" => "hello", "phrases" => [1, 2]}
{"word" => "table", "phrases" => [2]}

phrases:
{"id" => 1, "phrase" => "hello world!", "active" => 1}
{"id" => 2, "phrase" => "hello, i have already bought new table", "active" => 0}

I need to get the count of active phrases for each word.
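
To make the expected result concrete, here is a minimal in-memory sketch using the sample data from the example. It does not talk to MongoDB at all; the helper name countActivePhrases is hypothetical and only illustrates what "count of active phrases for each word" should produce.

```php
<?php
// Sample data copied from the example in the question.
$words = [
    ['word' => 'hello', 'phrases' => [1, 2]],
    ['word' => 'table', 'phrases' => [2]],
];
$phrases = [
    ['id' => 1, 'phrase' => 'hello world!', 'active' => 1],
    ['id' => 2, 'phrase' => 'hello, i have already bought new table', 'active' => 0],
];

// Hypothetical helper: count the active phrases of each word in memory.
function countActivePhrases(array $words, array $phrases): array
{
    // Index the "active" flag by phrase id so each lookup is a hash access.
    $activeById = [];
    foreach ($phrases as $phrase) {
        $activeById[$phrase['id']] = ($phrase['active'] === 1);
    }

    $counts = [];
    foreach ($words as $word) {
        $count = 0;
        foreach ($word['phrases'] as $phraseId) {
            // Unknown ids and inactive phrases are both skipped.
            if (!empty($activeById[$phraseId])) {
                $count++;
            }
        }
        $counts[$word['word']] = $count;
    }
    return $counts;
}

$counts = countActivePhrases($words, $phrases);
print_r($counts); // hello => 1, table => 0
```

With the sample data, "hello" references one active phrase (id 1) and "table" references none.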

In PHP I do it like this:
1. Get all words.
2. For each word, get the count of active phrases with the condition ['active' => 1].

Question: How can I get the words with their active phrase counts in one request? I tried to use MapReduce, but I still need to make a request for each word to get its count of active phrases.

UPD: My test collection contains 92 000 phrases and 23 000 words.

I have already tested both variants: a PHP loop that gets the phrase count for each word, and an aggregation pipeline in MongoDB.

However, I changed the aggregation pipeline from the comments below because of phrases_data. It is an array, so I can't use $match on it directly. Instead, I use $unwind after $lookup.

$pipeline = [
    // Field '5' holds the array of phrase ids on each word document
    ['$unwind' => '$5'],
    [
        '$lookup' => [
            'from' => 'phrases_926ee3bc9fa72b029e028ec90e282072ea0721d1',
            'localField' => '5',
            'foreignField' => '0',
            'as' => 'phrases_data',
        ],
    ],
    ['$unwind' => '$phrases_data'],
    // phrases_data.3 => 77 is equivalent to phrases_data.active => 1
    ['$match' => ['phrases_data.3' => 77]],
    // Count the remaining (active) phrases per word
    [
        '$group' => [
            '_id' => ['word' => '$1', 'id' => '$0'],
            'active_count' => ['$sum' => 1],
        ],
    ],
    ['$match' => ['active_count' => ['$gt' => 0]]],
    ['$sort' => ['active_count' => -1]],
];

The problem is that the $group stage takes 80% of the processing time, and it is much slower than the PHP loop. Here are my results for the test collection:

1. PHP loop (get words, then get the phrase count for each word): 10 seconds
2. Aggregation pipeline: 20 seconds
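
One direction that might be worth trying, since $group dominates the run time, is to drop both $unwind stages and $group altogether: $lookup can match an array localField element by element, and the joined array can then be counted per document with $filter and $size. The sketch below only builds such a pipeline as a PHP array. It assumes the simplified field names from the example (phrases, id, active) rather than the numeric keys of the real collections, the function name buildActiveCountPipeline is hypothetical, and it has not been timed against the test collection.

```php
<?php
// Assumption: field names follow the simplified example
// (words.phrases, phrases.id, phrases.active); the real collections
// in the question use numeric keys such as '5', '0' and '3'.
function buildActiveCountPipeline(string $phrasesCollection): array
{
    return [
        // An array localField is matched element by element,
        // so the words do not have to be unwound first.
        ['$lookup' => [
            'from' => $phrasesCollection,
            'localField' => 'phrases',
            'foreignField' => 'id',
            'as' => 'phrases_data',
        ]],
        // Keep the word and count the joined phrases whose active flag is 1.
        ['$project' => [
            'word' => 1,
            'active_count' => ['$size' => ['$filter' => [
                'input' => '$phrases_data',
                'as' => 'p',
                'cond' => ['$eq' => ['$$p.active', 1]],
            ]]],
        ]],
        // Same post-filtering and ordering as the original pipeline.
        ['$match' => ['active_count' => ['$gt' => 0]]],
        ['$sort' => ['active_count' => -1]],
    ];
}

$pipeline = buildActiveCountPipeline('phrases');
echo json_encode($pipeline, JSON_PRETTY_PRINT), PHP_EOL;
```

With the mongodb/mongodb PHP library, an array like this would typically be passed to the words collection's aggregate() method. Whether it beats the 10-second PHP loop would have to be measured; an index on the phrase id field is likely to matter for the $lookup stage.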
