MongoDB Database Structure and Best Practices Help


Question

    I'm in the process of developing Route Tracking/Optimization software for my refuse collection company and would like some feedback on my current data structure/situation.

    Here is a simplified version of my MongoDB structure:

    Database: data

    Collections:

    "customers" - data collection containing all customer data.

      [
        {
            "cust_id": "1001",
            "name": "Customer 1",
            "address": "123 Fake St",
            "city": "Boston"
        },
        {
            "cust_id": "1002",
            "name": "Customer 2",
            "address": "123 Real St",
            "city": "Boston"
        },
        {
            "cust_id": "1003",
            "name": "Customer 3",
            "address": "12 Elm St",
            "city": "Boston"
        },
        {
            "cust_id": "1004",
            "name": "Customer 4",
            "address": "16 Union St",
            "city": "Boston"
        },
        {
            "cust_id": "1005",
            "name": "Customer 5",
            "address": "13 Massachusetts Ave",
            "city": "Boston"
        }, { ... }, { ... }, ...
    ]
    

    "trucks" - data collection containing all truck data.

    [
        {
            "truckid": "21",
            "type": "Refuse",
            "year": "2011",
            "make": "Mack",
            "model": "TerraPro Cabover",
            "body": "Mcneilus Rear Loader XC",
            "capacity": "25 cubic yards"
        },
        {
            "truckid": "22",
            "type": "Refuse",
            "year": "2009",
            "make": "Mack",
            "model": "TerraPro Cabover",
            "body": "Mcneilus Rear Loader XC",
            "capacity": "25 cubic yards"
        },
        {
            "truckid": "12",
            "type": "Dump",
            "year": "2006",
            "make": "Chevrolet",
            "model": "C3500 HD",
            "body": "Rugby Hydraulic Dump",
            "capacity": "15 cubic yards"
        }
    ]
    

    "drivers" - data collection containing all driver data.

      [
        {
            "driverid": "1234",
            "name": "John Doe"
        },
        {
            "driverid": "4321",
            "name": "Jack Smith"
        },
        {
            "driverid": "3421",
            "name": "Don Johnson"
        }
    ]
    

    "route-lists" - data collection containing all predetermined route lists.

       [
        {
            "route_name": "monday_1",
            "day": "monday",
            "truck": "21",
            "stops": [
                {
                    "cust_id": "1001"
                },
                {
                    "cust_id": "1010"
                },
                {
                    "cust_id": "1002"
                }
            ]
        },
        {
            "route_name": "friday_1",
            "day": "friday",
            "truck": "12",
            "stops": [
                {
                    "cust_id": "1003"
                },
                {
                    "cust_id": "1004"
                },
                {
                    "cust_id": "1012"
                }
            ]
        }
    ]
    

    "routes" - data collection containing data for all active and completed routes.

    [
        {
            "routeid": "1",
            "route_name": "monday1",
            "start_time": "04:31 AM",
            "status": "active",
            "stops": [
                {
                    "customerid": "1001",
                    "status": "complete",
                    "start_time": "04:45 AM",
                    "finish_time": "04:48 AM",
                    "elapsed_time": "3"
                },
                {
                    "customerid": "1010",
                    "status": "complete",
                    "start_time": "04:50 AM",
                    "finish_time": "04:52 AM",
                    "elapsed_time": "2"
                },
                {
                    "customerid": "1002",
                    "status": "incomplete",
                    "start_time": "",
                    "finish_time": "",
                    "elapsed_time": ""
                },
                {
                    "customerid": "1005",
                    "status": "incomplete",
                    "start_time": "",
                    "finish_time": "",
                    "elapsed_time": ""
                }
            ]
        }
    ]
    

    Here is the process thus far:

    Each day drivers begin by Starting a New Route. Before starting a new route drivers must first input data:

    1. driverid
    2. date
    3. truck

    Once all data is entered correctly, the Start a New Route process will:

    1. Create new object in collection "routes"
    2. Query collection "route-lists" for "day" + "truck" match and return "stops"
    3. Insert "route-lists" data into "routes" collection
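
The three steps above can be sketched in plain Python, with in-memory lists standing in for the "route-lists" and "routes" collections. This is only an illustration under assumptions: the function name `start_new_route`, the `weekday` argument (derived from the date the driver enters), and the `driverid` field on the new route are not part of the original structure. Against a real MongoDB instance, step 2 would be a query on "day" and "truck" and step 3 an insert.

```python
from typing import Optional

# In-memory stand-ins for the "route-lists" and "routes" collections.
route_lists = [
    {"route_name": "monday_1", "day": "monday", "truck": "21",
     "stops": [{"cust_id": "1001"}, {"cust_id": "1010"}, {"cust_id": "1002"}]},
    {"route_name": "friday_1", "day": "friday", "truck": "12",
     "stops": [{"cust_id": "1003"}, {"cust_id": "1004"}, {"cust_id": "1012"}]},
]
routes = []


def start_new_route(driverid: str, weekday: str, truck: str,
                    start_time: str) -> Optional[dict]:
    """Find the route list for (day, truck) and copy its stops into a new route."""
    match = next((rl for rl in route_lists
                  if rl["day"] == weekday and rl["truck"] == truck), None)
    if match is None:
        return None  # no predetermined route for this day/truck pair
    route = {
        "routeid": str(len(routes) + 1),
        "route_name": match["route_name"],
        "driverid": driverid,  # assumption: not shown in the original "routes" example
        "start_time": start_time,
        "status": "active",
        "stops": [
            {"customerid": stop["cust_id"], "status": "incomplete",
             "start_time": "", "finish_time": "", "elapsed_time": ""}
            for stop in match["stops"]
        ],
    }
    routes.append(route)
    return route


new_route = start_new_route("1234", "monday", "21", "04:31 AM")
```

Returning `None` when no route list matches is one possible choice; raising an error would work equally well if a missing route list should stop the driver from proceeding.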

    As the driver proceeds with the daily stops/tasks, the "routes" collection is updated accordingly.

    On completion of all tasks, the driver will then be able to Complete the Route Process by simply changing the "status" field from "active" to "complete" in the "routes" collection.
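
A minimal sketch of the stop updates and the final status change, again on a plain dict rather than a live collection. The helper names `finish_stop` and `complete_route` are hypothetical, and the rule that a route may only be completed once every stop is complete is an assumption added for safety, not something the workflow above states.

```python
from datetime import datetime

TIME_FORMAT = "%I:%M %p"  # matches strings such as "04:45 AM"

route = {
    "routeid": "1",
    "status": "active",
    "stops": [
        {"customerid": "1001", "status": "incomplete",
         "start_time": "04:45 AM", "finish_time": "", "elapsed_time": ""},
    ],
}


def finish_stop(route: dict, customerid: str, finish_time: str) -> None:
    """Mark one stop complete and store the elapsed minutes as a string."""
    for stop in route["stops"]:
        if stop["customerid"] == customerid:
            started = datetime.strptime(stop["start_time"], TIME_FORMAT)
            finished = datetime.strptime(finish_time, TIME_FORMAT)
            minutes = int((finished - started).total_seconds() // 60)
            stop.update(status="complete", finish_time=finish_time,
                        elapsed_time=str(minutes))
            return
    raise KeyError(customerid)


def complete_route(route: dict) -> bool:
    """Change status from "active" to "complete" once every stop is done."""
    if route["status"] != "active":
        return False
    # Assumption: a route with unfinished stops must stay active.
    if any(stop["status"] != "complete" for stop in route["stops"]):
        return False
    route["status"] = "complete"
    return True


completed_too_early = complete_route(route)  # the only stop is still incomplete
finish_stop(route, "1001", "04:48 AM")
completed = complete_route(route)
```

Storing times as strings like "04:45 AM" forces this kind of parsing on every calculation; real date values would make elapsed-time arithmetic and sorting simpler.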

    That about sums it up. Any feedback, opinions, comments, links, optimization tactics are greatly appreciated.

    Thanks in advance for your time.

    Answer

    Your database schema looks to me like a 'classic' relational database schema. MongoDB is a good fit for data denormalization. I guess that when you display routes, you load all the related customers, the driver, and the truck.

    If you want to make your system really fast, you can embed everything in the routes collection.

    So I suggest the following modifications to your schema:

    1. customers - as-is
    2. trucks - as-is
    3. drivers - as-is
    4. route-list:

      Embed the customer data inside stops instead of referencing it. Also embed the truck. In this case the schema will be:

       {
           "route_name": "monday_1",
           "day": "monday",
           "truck": {
               "_id": 1,
               // all truck data goes here
           },
           "stops": [{
               "customer": {
                   "_id": 1,
                   // all customer data goes here
               }
           }, {
               "customer": {
                   "_id": 2,
                   // all customer data goes here
               }
           }]
       }
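
As a rough illustration of how a referenced route-list could be turned into the embedded form, the sketch below resolves each reference and copies the full document in. The function name `embed_route_list` is hypothetical, and the data keeps the question's `cust_id` and `truckid` keys rather than the `_id` used in the schema above.

```python
import copy

customers = [
    {"cust_id": "1001", "name": "Customer 1", "address": "123 Fake St", "city": "Boston"},
    {"cust_id": "1002", "name": "Customer 2", "address": "123 Real St", "city": "Boston"},
]
trucks = [
    {"truckid": "21", "type": "Refuse", "year": "2011", "make": "Mack"},
]
route_list = {
    "route_name": "monday_1", "day": "monday", "truck": "21",
    "stops": [{"cust_id": "1001"}, {"cust_id": "1002"}],
}


def embed_route_list(route_list: dict, customers: list, trucks: list) -> dict:
    """Replace the truck and customer references with copies of the full documents."""
    customers_by_id = {c["cust_id"]: c for c in customers}
    trucks_by_id = {t["truckid"]: t for t in trucks}
    return {
        "route_name": route_list["route_name"],
        "day": route_list["day"],
        # deepcopy so later edits to the embedded copy never touch the source document
        "truck": copy.deepcopy(trucks_by_id[route_list["truck"]]),
        "stops": [
            {"customer": copy.deepcopy(customers_by_id[stop["cust_id"]])}
            for stop in route_list["stops"]
        ],
    }


embedded = embed_route_list(route_list, customers, trucks)
```

A reference to a customer or truck that does not exist raises `KeyError` here, which surfaces broken references at build time instead of storing an incomplete route-list.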
      

    5. routes:

      When a driver starts a new route, copy the route from route-list and, in addition, embed the driver information:

       {
           // Copy all route-list data. Give the current route a new id and keep a reference
           // to the route-list, so that you can sync the route with its route-list.
           "_id": "1",
           "route_list_id": 1,
           "start_time": "04:31 AM",
           "status": "active",
           "driver": {
               // embed all driver data here
           },
           "stops": [{
               "customer": {
                   // all customer data
               },
               "status": "complete",
               "start_time": "04:45 AM",
               "finish_time": "04:48 AM",
               "elapsed_time": "3"
           }]
       }
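
The copy step might look like the following sketch, which assumes the route-list already has the embedded shape and carries an `_id`. The function name `start_route` and the explicit `route_id` argument are illustrative only.

```python
import copy

driver = {"driverid": "1234", "name": "John Doe"}
route_list = {
    "_id": 1,
    "route_name": "monday_1",
    "day": "monday",
    "truck": {"truckid": "21", "make": "Mack"},
    "stops": [{"customer": {"cust_id": "1001", "name": "Customer 1"}}],
}


def start_route(route_list: dict, driver: dict, route_id: str,
                start_time: str) -> dict:
    """Copy a route-list into a new route document and embed the driver."""
    route = copy.deepcopy(route_list)
    route["route_list_id"] = route.pop("_id")  # keep a reference back to the route-list
    route["_id"] = route_id                    # new id for the current route
    route.update(start_time=start_time, status="active",
                 driver=copy.deepcopy(driver))
    for stop in route["stops"]:
        stop.update(status="incomplete", start_time="",
                    finish_time="", elapsed_time="")
    return route


route = start_route(route_list, driver, "1", "04:31 AM")
```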
      

    I guess you are asking yourself what to do if a driver, customer, or other denormalized data changes in its main collection. You need to update every denormalized copy in the other collections. Depending on the size of your system, that can mean updating a very large number of documents, and that's okay: you can do it asynchronously if it takes a long time.
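
To make the fan-out concrete, here is a small sketch that pushes a changed customer into every embedded copy. It works on in-memory lists and assumes the embedded documents keep the `cust_id` key; the function name `propagate_customer` is hypothetical. On a real deployment the same idea would be a multi-document update against each collection that embeds the customer.

```python
def propagate_customer(customer: dict, route_lists: list, routes: list) -> int:
    """Overwrite every embedded copy of `customer`; return how many were updated."""
    updated = 0
    for collection in (route_lists, routes):
        for document in collection:
            for stop in document["stops"]:
                if stop["customer"]["cust_id"] == customer["cust_id"]:
                    stop["customer"] = dict(customer)  # fresh copy per stop
                    updated += 1
    return updated


old_customer = {"cust_id": "1001", "name": "Customer 1",
                "address": "123 Fake St", "city": "Boston"}
other_customer = {"cust_id": "1002", "name": "Customer 2",
                  "address": "123 Real St", "city": "Boston"}
route_lists = [{"route_name": "monday_1",
                "stops": [{"customer": dict(old_customer)},
                          {"customer": dict(other_customer)}]}]
routes = [{"_id": "1", "status": "active",
           "stops": [{"customer": dict(old_customer), "status": "incomplete"}]}]

# The customer moved, so the main "customers" document now has a new address.
changed_customer = dict(old_customer, address="500 New St")
updated_count = propagate_customer(changed_customer, route_lists, routes)
```

Whether completed routes should also be rewritten is a design decision: leaving them untouched preserves the address as it was when the stop was serviced.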

    What are the benefits of the above data structure?

    1. Each document contains all the data that you may need to display in your application. So, for instance, you do not need to load the related customers, driver, and truck when you display routes.
    2. You can run more complex queries directly against the database. For example, with the embedded schema you can build a single query that returns all routes containing a stop for the customer with name = "Bill". With your current schema you would first need to load the customer by name, get its id, and then search the routes by customer id.
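
The difference between the two lookups can be shown on in-memory data. Both function names are hypothetical. With the embedded shape, the MongoDB equivalent would be a single filter on the nested field, such as `{"stops.customer.name": "Bill"}`.

```python
def routes_by_name_embedded(routes: list, name: str) -> list:
    """One pass: the customer name lives inside each stop."""
    return [route for route in routes
            if any(stop["customer"]["name"] == name for stop in route["stops"])]


def routes_by_name_referenced(routes: list, customers: list, name: str) -> list:
    """Two steps: resolve the name to ids first, then match stops by id."""
    ids = {c["cust_id"] for c in customers if c["name"] == name}
    return [route for route in routes
            if any(stop["customerid"] in ids for stop in route["stops"])]


customers = [{"cust_id": "1001", "name": "Bill"},
             {"cust_id": "1002", "name": "Ann"}]

referenced_routes = [
    {"routeid": "1", "stops": [{"customerid": "1001"}, {"customerid": "1002"}]},
    {"routeid": "2", "stops": [{"customerid": "1002"}]},
]
embedded_routes = [
    {"routeid": "1", "stops": [{"customer": customers[0]}, {"customer": customers[1]}]},
    {"routeid": "2", "stops": [{"customer": customers[1]}]},
]

found_embedded = [r["routeid"] for r in routes_by_name_embedded(embedded_routes, "Bill")]
found_referenced = [r["routeid"] for r in
                    routes_by_name_referenced(referenced_routes, customers, "Bill")]
```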

    You are probably also worried that your data can get out of sync in some cases. To guard against this, write a few unit tests that verify you update your denormalized data correctly.
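
One way such a test could be built is around a helper that reports every embedded customer that no longer matches its source document. This is a sketch under the same assumptions as before (embedded copies keep `cust_id`, routes carry `_id`), and `find_stale_copies` is a made-up name.

```python
def find_stale_copies(customers: list, routes: list) -> list:
    """Return (route _id, stop index) for each embedded customer that differs from its source."""
    by_id = {c["cust_id"]: c for c in customers}
    stale = []
    for route in routes:
        for index, stop in enumerate(route["stops"]):
            embedded = stop["customer"]
            if by_id.get(embedded["cust_id"]) != embedded:
                stale.append((route["_id"], index))
    return stale


customers = [
    {"cust_id": "1001", "name": "Customer 1", "address": "123 Fake St"},
    {"cust_id": "1002", "name": "Customer 2", "address": "9 Moved Rd"},
]
routes = [{
    "_id": "1",
    "stops": [
        {"customer": {"cust_id": "1001", "name": "Customer 1", "address": "123 Fake St"}},
        # This copy was never refreshed after the customer's address changed.
        {"customer": {"cust_id": "1002", "name": "Customer 2", "address": "123 Real St"}},
    ],
}]

stale = find_stale_copies(customers, routes)
```

A unit test would run the update code under test and then assert that this list is empty. The same check also treats an embedded customer whose source document has been deleted as stale.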

    I hope this helps you see the world from the non-relational, document database point of view.
