How do I resolve this design constraint in MongoDB with respect to performance?

Views: 44 | Posted: 2021/6/3 20:44:13 | Tag: mongodb

Problem description

Currently I have the following.

Collection1 - system

{
  _id: system_id,
  ... system fields
  system_name: ,
  system_site: ,
  system_group: ,
  ....
  device_errors: [1,2,3,4,5,6,7]
}

I have 2K unique error codes.

I have an error collection as below.

{
 _id: error_id,
 category,
 impact,
 action,
}

I have a use case where each system|burt combination can have a unique error_description, because the error contains some system data.

I am confused about how to handle this scenario.

 One system can have many errors. 
 One error can be part of multiple systems.

Now, how do I maintain the unique details of a burt that are specific to a system? I thought of having a nested field instead of an array in the system collection, but I am wondering about the scalability.
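As a rough illustration of the nested-field idea, the Python sketch below converts the array shape into a map keyed by error id. Only `device_errors` comes from the schema above; the field name `device_error_details`, the helper `to_nested`, and the sample values are hypothetical.

```python
# Current shape: device_errors is a plain array of error ids.
system_doc = {
    "_id": "system1",
    "device_errors": [1, 2, 3],
}


def to_nested(doc, descriptions):
    """Return a copy of doc where each error id maps to system-specific details.

    `descriptions` maps error id -> description text for this one system
    (hypothetical input; ids without an entry get None).
    """
    nested = {
        # Document field names are strings, so the numeric id is stringified.
        str(error_id): {"error_description": descriptions.get(error_id)}
        for error_id in doc["device_errors"]
    }
    result = {key: value for key, value in doc.items() if key != "device_errors"}
    result["device_error_details"] = nested  # hypothetical field name
    return result


nested_doc = to_nested(system_doc, {1: "unique to system1"})
print(nested_doc)
```

In the worst case described further down, this shape would hold 2k nested entries in every system document.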

Any suggestions?

 system1|burt1
    error_desc:unique system1

 system2|burt1
    error_Description: unique

If I store it like the above in another collection, the API request has to make three calls and assemble the response.

 1. Find all errors for the set of systems
 2. Find the top 50 burts from step 1
 3. For the top 50 burts, find the error description

Then combine all three call responses and reply to the user?

I don't think this is the best approach, as we need to make 3 data source calls to respond to a single request.
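The three calls can be sketched in Python with plain dicts standing in for the collections. This is only an in-memory illustration, not driver code: the sample data and the function name `build_response` are hypothetical, and "top" is assumed to mean the burts that affect the most systems.

```python
from collections import Counter

# In-memory stand-ins for the collections (hypothetical sample data).
systems = {
    "system1": {"device_errors": [1, 2]},
    "system2": {"device_errors": [1]},
}
# One entry per system|burt pair.
error_descriptions = {
    ("system1", 1): "unique system1",
    ("system2", 1): "unique system2",
    ("system1", 2): "another system1 detail",
}


def build_response(system_ids, top_n=50):
    # Call 1: find all errors for the set of systems.
    errors_by_system = {sid: systems[sid]["device_errors"] for sid in system_ids}

    # Call 2: pick the top N burts (assumed ranking: most systems affected).
    counts = Counter(e for errs in errors_by_system.values() for e in errs)
    top_burts = [burt for burt, _ in counts.most_common(top_n)]

    # Call 3: fetch the system-specific description for each top burt.
    return [
        {
            "system": sid,
            "burt": burt,
            "error_description": error_descriptions.get((sid, burt)),
        }
        for burt in top_burts
        for sid in system_ids
        if burt in errors_by_system[sid]
    ]


response = build_response(["system1", "system2"])
print(response)
```

Each commented step corresponds to one round trip to the data source, which is the cost in question here.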

I have already tried a flattened structure with redundant data.

{   
 ... system1_info
 ... error1_info
},
{
 ... system2_info  
 ... error1_info
},
{  
 ... system1_info
 ... error2_info
},
{
 ... system10_info  
 ... error1200_info
}

Here, I am using many aggregation stages in a single query, as below:

1. Match
2. Group error
3. Sort
4. total count of errors - another group
5. Project
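For reference, the five stages above might look roughly like the following pipeline, written in the list-of-dicts form that a Python driver such as PyMongo accepts for an aggregation. The field names `system_id` and `error_id`, the output names, and the helper `build_pipeline` are assumptions, since the flattened documents are only outlined above.

```python
def build_pipeline(system_ids, page_size=100):
    """Build a Match / Group / Sort / Group / Project pipeline (hypothetical fields)."""
    return [
        # 1. Match: keep only documents for the requested systems.
        {"$match": {"system_id": {"$in": system_ids}}},
        # 2. Group error: one result per error, counting affected systems.
        {"$group": {"_id": "$error_id", "system_count": {"$sum": 1}}},
        # 3. Sort: most widespread errors first.
        {"$sort": {"system_count": -1}},
        # 4. Total count of errors: a second group over the grouped results.
        {
            "$group": {
                "_id": None,
                "total_errors": {"$sum": 1},
                "errors": {"$push": "$$ROOT"},
            }
        },
        # 5. Project: return the total and one page of errors.
        {
            "$project": {
                "_id": 0,
                "total_errors": 1,
                "errors": {"$slice": ["$errors", page_size]},
            }
        },
    ]


pipeline = build_pipeline(["system1", "system2"])
for stage in pipeline:
    print(stage)
```

The second `$group` has to hold every grouped error in one array before the page is cut, which is part of why this query feels heavy.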

I feel this is a heavier query than approach 1 [the actual question].

Let's say I have 2k errors and 20 million systems, so I have 40 million documents in total. In the worst case, each system has 2k errors. My query should support more than one system. Let's say I have to query for 25k systems.

25k systems * 2k errors => matched result
Apply all of the above operations
Then slice to 100 [for pagination]

If I go with a relational model without redundancy, I will get 25k systems, and then I only have to query for 2k errors. That is far fewer operations than the aggregation above.
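The worst-case comparison can be worked through as a short Python calculation. It assumes the relational-style lookup reads one document per system plus one per unique error, which is a simplification of the description above.

```python
systems_queried = 25_000
errors_per_system = 2_000  # worst case: every system has every error
unique_errors = 2_000

# Flattened model: one document per system/error pair reaches the match stage.
flattened_matched_docs = systems_queried * errors_per_system

# Relational-style model (assumed): read the systems, then the error documents.
relational_docs_read = systems_queried + unique_errors

ratio = flattened_matched_docs / relational_docs_read

print(f"flattened:  {flattened_matched_docs:,} documents matched")
print(f"relational: {relational_docs_read:,} documents read")
print(f"ratio:      about {ratio:,.0f}x")
```

Under these assumptions the flattened query matches 50,000,000 documents before grouping, against 27,000 documents read in the relational-style approach.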
