How do I resolve this design constraint in MongoDB with respect to performance?

Problem description

Currently I have the following.

Collection1 - system

{
  _id: system_id,
  ... system fields
  system_name: ,
  system_site: ,
  system_group: ,
  ....
  device_errors: [1,2,3,4,5,6,7]
}

I have 2K unique error codes.

I have an error collection as below.

{
 _id: error_id,
 category,
 impact,
 action,
}

I have a use case where each system|burt combination can have a unique error_description, because the error contains some system data.

I am confused about how to handle this scenario.

 One system can have many errors. 
 One error can be part of multiple systems.

Now, how do I maintain the unique details of a burt that are specific to a system? I thought of having a nested field instead of an array in the system collection, but I am wondering about the scalability.

Any suggestions?

 system1|burt1
    error_desc:unique system1

 system2|burt1
    error_Description: unique

If I store it like above in another collection, the API request has to make three calls and form the response.

 1. Find all errors for the set of systems
 2. Find the top 50 burts from step 1
 3. For the top 50 burts, find the error descriptions

Combine all three call responses and reply to the user?

I don't think this is the best approach, as we need to make 3 data source calls to respond to a single request.
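For illustration, the three-call flow described above could look roughly like the sketch below. It uses plain in-memory dictionaries as stand-ins for the three collections so that it runs without a database. The data, the function name `top_errors_with_descriptions`, the compound `(system_id, error_id)` key, and the choice to rank errors by the number of affected systems are all assumptions, not part of the original question.

```python
from collections import Counter

# Hypothetical in-memory stand-ins for the three collections.
systems = {
    "system1": {"device_errors": [1, 2, 3]},
    "system2": {"device_errors": [1, 3]},
    "system3": {"device_errors": [1]},
}
errors = {
    1: {"category": "disk", "impact": "high", "action": "replace"},
    2: {"category": "fan", "impact": "low", "action": "clean"},
    3: {"category": "psu", "impact": "medium", "action": "inspect"},
}
# One entry per system|burt pair, keyed by (system_id, error_id).
descriptions = {
    ("system1", 1): "unique system1",
    ("system2", 1): "unique",
}


def top_errors_with_descriptions(system_ids, limit=50):
    # Call 1: find all errors for the set of systems.
    counts = Counter()
    for sid in system_ids:
        counts.update(systems[sid]["device_errors"])

    # Call 2: keep the top `limit` errors (assumed ranking: systems affected).
    top = [eid for eid, _ in counts.most_common(limit)]

    # Call 3: fetch system-specific descriptions for those errors only.
    result = []
    for eid in top:
        per_system = {
            sid: descriptions[(sid, eid)]
            for sid in system_ids
            if (sid, eid) in descriptions
        }
        result.append({
            "error_id": eid,
            **errors[eid],
            "systems_affected": counts[eid],
            "descriptions": per_system,
        })
    return result
```

In this shape the third lookup only touches descriptions for the top errors, so its cost is bounded by the page size rather than by the full error set.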

I have already tried a flattened structure with redundant data.

{   
 ... system1_info
 ... error1_info
},
{
 ... system2_info  
 ... error1_info
},
{  
 ... system1_info
 ... error2_info
},
{
 ... system10_info  
 ... error1200_info
}

Here, I am using many aggregation stages in a single query, as below.

1. Match
2. Group error
3. Sort
4. total count of errors - another group
5. Project

I feel this is a heavier query than approach 1 [the actual question].
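As a rough sketch, the five stages listed above might be expressed as the following pipeline over the flattened collection. The field names `system_id` and `error_id`, the helper name `build_pipeline`, and the output field names are assumptions, since the question elides the actual document fields. The sketch also relies on `$push` keeping the order produced by the preceding `$sort`, which is worth verifying on the MongoDB version in use.

```python
def build_pipeline(system_ids, page_size=100):
    """Build an aggregation pipeline as plain Python data (no server needed)."""
    return [
        # 1. Match: only rows belonging to the requested systems.
        {"$match": {"system_id": {"$in": system_ids}}},
        # 2. Group error: one row per error, counting affected systems.
        {"$group": {"_id": "$error_id", "system_count": {"$sum": 1}}},
        # 3. Sort: most widespread errors first.
        {"$sort": {"system_count": -1}},
        # 4. Total count of errors: a second group over the whole result.
        {"$group": {
            "_id": None,
            "total_errors": {"$sum": 1},
            "errors": {"$push": {"error_id": "$_id",
                                 "system_count": "$system_count"}},
        }},
        # 5. Project: return the total and the first page of errors.
        {"$project": {
            "_id": 0,
            "total_errors": 1,
            "errors": {"$slice": ["$errors", page_size]},
        }},
    ]
```

With PyMongo, a list like this would be passed to `collection.aggregate(pipeline)`. Note that the `$match` stage still has to read every flattened row for the requested systems before the first `$group` can reduce them.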

Let's say I have 2k errors and 20 million systems; I have 40 million documents in total. In the worst case, each system has 2k errors. My query should support more than one system. Let's say I have to query for 25k systems.

  1. 25k systems * 2k errors => match result
  2. Apply all of the operations above
  3. Then slice to 100 [for pagination]

If I go with a relational-style model without redundancy, I will get 25k systems, and then I only have to query for 2k errors. That is far fewer operations than the aggregation above.
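The comparison can be made concrete with a little arithmetic, using only the worst-case figures quoted in the question. The variable names are illustrative.

```python
systems_queried = 25_000
errors_per_system_worst_case = 2_000
unique_errors = 2_000

# Flattened model: one document per system/error pair has to be matched.
flattened_docs_matched = systems_queried * errors_per_system_worst_case

# Model without redundancy: read the system documents, then the error catalog.
normalized_docs_read = systems_queried + unique_errors

print(flattened_docs_matched)  # 50000000
print(normalized_docs_read)    # 27000
```

The second figure counts documents read, not array elements: each system document still carries up to 2k entries in `device_errors`.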

Recommended answer

Presumably the set of possible errors does not change very frequently. Cache it in the application.
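One possible way to do that is a small time-based cache in front of the error collection, sketched below. The class name `ErrorCatalogCache`, the `loader` callable, and the five-minute default lifetime are assumptions made for illustration; the answer itself does not prescribe a caching strategy.

```python
import time


class ErrorCatalogCache:
    """Keeps the error catalog in memory and reloads it after `ttl_seconds`."""

    def __init__(self, loader, ttl_seconds=300.0, clock=time.monotonic):
        self._loader = loader      # callable returning an iterable of error docs
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested
        self._catalog = None
        self._loaded_at = 0.0

    def get(self, error_id):
        now = self._clock()
        # Reload on first use or once the cached copy is older than the TTL.
        if self._catalog is None or now - self._loaded_at >= self._ttl:
            self._catalog = {doc["_id"]: doc for doc in self._loader()}
            self._loaded_at = now
        return self._catalog.get(error_id)
```

With roughly 2K error documents the whole catalog fits comfortably in memory, so the API would only need to query the database for the system data and the system-specific descriptions. With PyMongo, the loader could be something like `lambda: db.errors.find()`, where `db.errors` is a hypothetical collection handle.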
