Database table design with duplication of data


Problem description

I am having some trouble designing my Firebase database structure. Basically, one account can have many receipts, and one receipt can have many items. Here is the JSON:

receipts {
    accountID1 : {
        receiptID1 : {
            date : "07/07/2017"
            store : {
                storeName : "store1"
                storeAddr : "addr1"
            }
            currency : {
                currencyName : "currency1"
                currencySymbol : "$"
            }
            totalAmount : "50.00"
            items : {
                itemID1 : true,
                itemID2 : true,
            }
        }
        receiptID2 : {
            date : "08/07/2017"
            store : {
                storeName : "store1"
                storeAddr : "addr1"
            }
            currency : {
                currencyName : "currency1"
                currencySymbol : "$"
            }
            totalAmount : "20.00"
            items : {
                itemID3 : true,
                itemID4 : true,
            }
        }
    }
},
items {
        itemID1 : {
            type : "food"
            name : "snack"
            unitprice : "10.00"
            quantity : "2"
        }
        itemID2 : { 
            type : "entertainment"
            name : "gaming equipment"
            unitprice : "150.00"
            quantity : "1"
        }
        itemID3 : { 
            type : "food"
            name : "fruit juice"
            unitprice : "4.00"
            quantity : "1"
        } 
        itemID4 : {
            type : "entertainment"
            name : "gaming equipment"
            unitprice : "150.00"
            quantity : "1"
        }
},
itemIDsByType {
    food : {
        itemID1 : true,
        itemID3 : true,
    }
    entertainment: {
        itemID2 : true,
        itemID4 : true,
    }
}

I realized there is a duplication problem under the items child. For instance, account A purchases item A on receipt 1, then purchases the same item again on receipt 2. Under the receipts child this causes no conflict.

However, looking at the items child, itemID2 and itemID4 are the same item but belong to different receipts. The two records are duplicates, and I think this design could become a problem with a large data set.

Any ideas on how to restructure the database design in order to remove the duplication problem mentioned above?

I have actually come up with another design, but it is less flat:

receipts {
    accountID1 : {
        receiptID1 : {
            date : "07/07/2017"
            merchantName : "NTUC"
            branch : {
                branchName : "Marsiling"
                branchAddress : "Blk 167, Marsiling"
            }
            currency : {
                currencyName : "currency1"
                currencySymbol : "$"
            }
            totalAmount : "50.00"
        }

        receiptID2 : {
            date : "08/07/2017"
            merchantName : "NTUC"
            branch : {
                branchName : "Marsiling"
                branchAddress : "Blk 167, Marsiling"
            }
            currency : {
                currencyName : "currency1"
                currencySymbol : "$"
            }
            totalAmount : "20.00"
        }
    }
},

itemLists {
    receiptID1 : {
        items : {
            itemID1 : {
                type : "food"
                name : "snack"
                unitprice : "10.00"
                quantity : "2"
            }

            itemID2 : { 
                type : "entertainment"
                name : "gaming equipment"
                unitprice : "150.00"
                quantity : "1"
            }

            itemID3 : { 
                type : "food"
                name : "fruit juice"
                unitprice : "4.00"
                quantity : "1"
            } 
        }
    }

    receiptID2 : { 
        items : {
            itemID4 : {
                type : "entertainment"
                name : "gaming equipment"
                unitprice : "150.00"
                quantity : "1"
            }
        }
    }
},
itemIDsByType {
        food : {
            itemID1 : true,
            itemID3 : true,
        }
        entertainment: {
            itemID2 : true,
            itemID4 : true,
        }
},
merchants {
    merchantID1 : {
        merchantName : "NTUC"
        branches : {
            branchID1 : {
                branchName : "Marsiling"
                branchAddress : "Blk 167, Marsiling"
            }
            branchID2 : {
                branchName : "Woodlands"
                branchAddress : "Blk 161, Woodlands"
            }
        }
    }
}

In this design, the items are grouped under each receiptID, which eliminates the duplication problem mentioned above. However, it is less flat. I am trying to choose between a flat design with duplicated data and a less flat design with no duplicated data. Which one would be better for a large data set?

Solution

Let's start with a master items list. This list is ALL of the items available for sale.

item_0
  name: "burger"
item_1
  name: "taco"
item_2
  name: "hot dog"
item_3
  name: "fries"
item_4
  name: "refried beans"

Then comes the receipts node, which stores info about each receipt: date, time, customer name, etc. Note that there are no references to the items, as they are not directly needed, but they could be added for convenience.

receipt_0
   customer: "Frank"
   timestamp: 170716093623
receipt_1
   customer: "Bill"
   timestamp: 170716094515

and finally the details of the items on each receipt.

receipt_items:
   -Y89jasjdiasd:
      item_id: item_0
      price: 5.00
      qty: 1
      receipt: receipt_0
   -YHJis9asdasd:
      item_id: item_3
      price: 1.50
      qty: 1
      receipt: receipt_0
   -Yn9kasdpaosd:
      item_id: item_1
      price: 2.00
      qty: 3
      receipt: receipt_1
   -Yllois9040ka:
      item_id: item_4
      price: 1.50
      qty: 1
      receipt: receipt_1

As you can see, Frank got a burger and fries on receipt_0, and Bill got 3 tacos (!) and a side of refried beans on receipt_1.

With this structure, you can get the details of each receipt: customer, date, etc. Or you can query the receipt_items node by its receipt child and get the details of the items on that receipt: item, price, qty, etc.
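As a rough illustration of that lookup, here is a minimal Python sketch that uses plain dictionaries as a stand-in for the database nodes. It is not Firebase SDK code. The helper names `items_on_receipt` and `receipt_total` are hypothetical, and the filter only imitates what a query ordered by the `receipt` child and matched on a receipt ID would return.

```python
# In-memory stand-ins for the nodes shown above (not a real database).
items = {
    "item_0": {"name": "burger"},
    "item_1": {"name": "taco"},
    "item_2": {"name": "hot dog"},
    "item_3": {"name": "fries"},
    "item_4": {"name": "refried beans"},
}

receipt_items = {
    "-Y89jasjdiasd": {"item_id": "item_0", "price": 5.00, "qty": 1, "receipt": "receipt_0"},
    "-YHJis9asdasd": {"item_id": "item_3", "price": 1.50, "qty": 1, "receipt": "receipt_0"},
    "-Yn9kasdpaosd": {"item_id": "item_1", "price": 2.00, "qty": 3, "receipt": "receipt_1"},
    "-Yllois9040ka": {"item_id": "item_4", "price": 1.50, "qty": 1, "receipt": "receipt_1"},
}


def items_on_receipt(receipt_id):
    """Hypothetical helper: (name, price, qty) for every line on one receipt."""
    return [
        (items[line["item_id"]]["name"], line["price"], line["qty"])
        for line in receipt_items.values()
        if line["receipt"] == receipt_id
    ]


def receipt_total(receipt_id):
    """Hypothetical helper: sum of price * qty over the lines on one receipt."""
    return sum(price * qty for _, price, qty in items_on_receipt(receipt_id))


for name, price, qty in items_on_receipt("receipt_0"):
    print(f"{qty} x {name} @ {price:.2f}")
print(f"total: {receipt_total('receipt_0'):.2f}")
```

Because the item name lives only in the master items list, each receipt line carries just an `item_id`, and the name is resolved at read time.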

You can also query the receipt_items node for a specific item, then sum the quantities to find, say, the most popular item, or compute the average selling price.
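A sketch of that kind of aggregation, again over an in-memory copy of the example data rather than a live database. The function names are hypothetical, and "average selling price" is assumed here to mean a quantity-weighted average; a simple mean over receipt lines would be an equally valid reading.

```python
from collections import defaultdict

# In-memory copy of the receipt_items lines shown above (auto-generated keys omitted).
receipt_lines = [
    {"item_id": "item_0", "price": 5.00, "qty": 1, "receipt": "receipt_0"},
    {"item_id": "item_3", "price": 1.50, "qty": 1, "receipt": "receipt_0"},
    {"item_id": "item_1", "price": 2.00, "qty": 3, "receipt": "receipt_1"},
    {"item_id": "item_4", "price": 1.50, "qty": 1, "receipt": "receipt_1"},
]


def quantity_sold_by_item(lines):
    """Hypothetical helper: total units sold per item_id."""
    totals = defaultdict(int)
    for line in lines:
        totals[line["item_id"]] += line["qty"]
    return dict(totals)


def average_selling_price(lines, item_id):
    """Hypothetical helper: quantity-weighted average price, or None if never sold."""
    matching = [line for line in lines if line["item_id"] == item_id]
    units = sum(line["qty"] for line in matching)
    if units == 0:
        return None
    revenue = sum(line["price"] * line["qty"] for line in matching)
    return revenue / units


totals = quantity_sold_by_item(receipt_lines)
most_popular = max(totals, key=totals.get)  # item_id with the highest unit count
print(most_popular, totals[most_popular])
print(average_selling_price(receipt_lines, "item_1"))
```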

This eliminates duplicate items AND data and provides a queryable, denormalized structure.

As mentioned above, you could add a child node to each receipt to store its receipt_items, but since receipt_items is queryable, it may not be needed. It could be used to order the items on the receipt.
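One possible reading of that optional child node is a per-receipt index of receipt_items keys, using the same `key : true` convention as the question's `items` lists. The sketch below builds such an index from an in-memory copy of the data; `build_receipt_index` is a hypothetical helper, and storing a position number in place of `True` would be one way to record ordering.

```python
# In-memory copy of the receipt_items node shown above (not a real database).
receipt_items = {
    "-Y89jasjdiasd": {"item_id": "item_0", "price": 5.00, "qty": 1, "receipt": "receipt_0"},
    "-YHJis9asdasd": {"item_id": "item_3", "price": 1.50, "qty": 1, "receipt": "receipt_0"},
    "-Yn9kasdpaosd": {"item_id": "item_1", "price": 2.00, "qty": 3, "receipt": "receipt_1"},
    "-Yllois9040ka": {"item_id": "item_4", "price": 1.50, "qty": 1, "receipt": "receipt_1"},
}


def build_receipt_index(lines_by_key):
    """Hypothetical helper: map each receipt ID to {receipt_items key: True}."""
    index = {}
    for key, line in lines_by_key.items():
        index.setdefault(line["receipt"], {})[key] = True
    return index


index = build_receipt_index(receipt_items)
print(index["receipt_0"])
```

The trade-off is the usual one for such an index: reads of a single receipt no longer need a query, but every write to receipt_items has to update the index as well.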

Note: the child node keys in receipt_items are created with childByAutoId.
