Distributed primary key - UUID, simple auto increment or custom sequential values?


Question

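I know this type of question has been asked before, but I couldn't find one that compares the options I have been thinking about. So I'm posting them here; if this is a duplicate, please post the link.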

This post ended up quite long. If you have the time, please read it through, as the question is at the end.

I've accepted an answer as I think it will be the best solution for now. I would also like to link to two other questions that answer my query about concatenating numbers: "Combine two integers to create a unique number" and "Concatenate integers in C#". If I were going to try encoding the number (like the 51122222 example further down), I think these would be useful, though maybe just using something like String.Format in C# would be fast enough for my small application.

I'm currently trying to find a way to set up distributed applications that use the same database schema and can synchronise with, perhaps, a single master database that all the others also sync with.

The program I am currently planning will start as a fairly simple program to track information. The first version might contain two tables, Items and ItemHistory. Here is an example of the possible fields:

Items
ItemID (PK) ?
Name String
Content String

ItemHistory
ItemHistoryID (PK) ?
ItemID (FK) ?
EventName String
CreatedOn DateTime

I've listed each field's name and type. The key types are left as "?" because what to use for them is what this question is about.

The first version will be a standard desktop app; I'm currently planning on using C# with a WPF front end and SQLite for the database. Eventually I also want to create a version that runs on my Android phone. This is where the distributed part comes in: I don't always have a signal, so the app will need to run offline and synchronise when it is online again.

Here are the ideas I have so far on how to deal with the IDs:

  1. Use a UUID for the IDs so there are no merge conflicts.
  2. Use an auto-increment field and set the starting number on each copy of the app at some interval, e.g. 1 for the first app, 10000 for the second, 20000 for the third, etc.
  3. Use an auto-increment field with an offset value to avoid conflicts without the large gaps between numbers (MySQL has auto_increment_increment and auto_increment_offset for this).
  4. Generate my own ID that encodes an ID for each database, so each database can have its own auto-increment value without causing a conflict. I found someone else who had the same idea: "What data type is recommended for ID columns?"

While option 1 would work, and I have used it in the past, I want to look at other options that avoid the issues with UUIDs. I would like a solution that is easier to read while debugging and is sortable.

Option 2 would work, but it does force a limit on the number of records. I know that in my small application it will almost never go over that many, but I would like to see if there's a solution that does not require such a limit. Option 3 avoids the limit by interleaving the numbers, but I think you would need to know in advance how many databases are going to be used, or you might run out of slots. With two databases, a start of 1 on DB1 and a start of 2 on DB2, both with an increment of 2, would use every number alternately. You could use 50 as the increment, but then you just have another limit, this time on the number of applications that can use it. Again, I know it's a limit that is not going to be hit in my situation, but it could be an issue in an application that suddenly becomes very popular.
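As a rough illustration of option 3, the sketch below generates interleaved IDs in application code, mirroring what MySQL's auto_increment_offset and auto_increment_increment settings do on the server. The helper name `interleaved_ids` is hypothetical, and the sketch assumes every database is configured with the same increment and a distinct offset.

```python
from itertools import count, islice


def interleaved_ids(offset: int, increment: int):
    """Return an iterator over offset, offset + increment, offset + 2*increment, ...

    Hypothetical helper: every database must share the same increment and
    use a different offset in the range 1..increment.
    """
    if not 1 <= offset <= increment:
        raise ValueError("offset must be between 1 and increment")
    return count(start=offset, step=increment)


# Two databases sharing an increment of 2.
db1 = interleaved_ids(offset=1, increment=2)
db2 = interleaved_ids(offset=2, increment=2)

first_db1 = list(islice(db1, 5))  # odd numbers
first_db2 = list(islice(db2, 5))  # even numbers

# The two sequences never produce the same value.
assert not set(first_db1) & set(first_db2)
```

The increment fixes the maximum number of databases up front, which is the limit described above: with an increment of 50, a 51st database has no free offset.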

Option 4 seems like it could solve the issue for me, but I'm not sure whether it would work in practice. One idea I had was to allow a prefix to be set on each application, which could then be combined with an auto-incrementing value, e.g. PC1, PC2 for records created on a PC and maybe PHONE1, PHONE2 etc. for records from the Android app. This would work, but using numbers in strings causes the sorting issue where 1, 11 and 100 show up next to each other, unless leading zeros are used, and then it's back to a limited number of records again.
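The sorting problem with prefixed string IDs can be seen in a few lines. This is only a sketch: the function `prefixed_id` and the default width of 6 digits are assumptions chosen for illustration.

```python
def prefixed_id(prefix: str, seq: int, width: int = 6) -> str:
    """Build an ID such as PC000011 from a prefix and a zero-padded counter.

    Hypothetical helper: the fixed width keeps string sorting in numeric
    order, but caps the counter at 10**width - 1.
    """
    if not 0 <= seq < 10 ** width:
        raise OverflowError(f"counter {seq} does not fit in {width} digits")
    return f"{prefix}{seq:0{width}d}"


numbers = (1, 11, 100, 2)

# Without padding, string comparison puts PC100 before PC11 and PC2 last.
unpadded = sorted(f"PC{n}" for n in numbers)

# With padding, string order matches numeric order.
padded = sorted(prefixed_id("PC", n) for n in numbers)
```

Here `unpadded` comes out as PC1, PC100, PC11, PC2, while `padded` follows numeric order. The price is the cap of 999999 records per prefix at this width.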

I have wondered if it would be possible to combine a number for the DB ID with the auto increment, e.g. PC = 1 and PHONE = 2. Then we would have 11, 12, 13 etc. for the PC, with maybe 111 for the 11th record, and 2304 for the 304th record on PHONE. But I don't know how this would be done, or whether it can be done easily without causing excess overhead when generating values.
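One possible way to combine a database number with a local counter is to pack both into a single 64-bit integer with a fixed split, which avoids the ambiguity of plain digit concatenation (111 could be record 11 on database 1 or record 1 on database 11). The sketch below is an assumption, not something from the question: it reserves the low 16 bits for the database ID and the remaining 47 bits of a signed 64-bit value (the size of an SQLite INTEGER) for the counter. The names `pack_id` and `unpack_id` are hypothetical.

```python
DB_BITS = 16             # assumed: up to 65536 databases
SEQ_BITS = 63 - DB_BITS  # 47 bits left in a signed 64-bit integer


def pack_id(db_id: int, seq: int) -> int:
    """Combine a database ID and a local counter into one integer key."""
    if not 0 <= db_id < (1 << DB_BITS):
        raise ValueError("db_id out of range")
    if not 0 <= seq < (1 << SEQ_BITS):
        raise ValueError("seq out of range")
    # Counter in the high bits, so keys sort by counter first, then database.
    return (seq << DB_BITS) | db_id


def unpack_id(key: int) -> tuple[int, int]:
    """Split a key back into (db_id, seq)."""
    return key & ((1 << DB_BITS) - 1), key >> DB_BITS


pc_key = pack_id(db_id=1, seq=11)        # 11th record on the PC
phone_key = pack_id(db_id=2, seq=304)    # 304th record on the phone
```

Packing and unpacking are a shift and a mask, so the overhead is negligible. The remaining limits come only from how the 63 bits are divided, and the split has to be chosen once and kept.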

At work they use a similar numbering system, with values like 51122222: the 5 refers to the instance of the application, then comes a 2-digit year, and finally an auto-incrementing number. I've not yet got a clear answer on what happens if we go over 99999 records in a year. I think they have figured that it's not going to happen and are happy that they have calculated the risk.
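The scheme described above can be sketched with decimal arithmetic. The exact layout is an assumption inferred from the example: the instance number, a 2-digit year, and a 5-digit counter, so 51122222 reads as instance 5, year 11, counter 22222. The functions `encode` and `decode` are hypothetical names.

```python
YEAR_DIGITS = 2
COUNTER_DIGITS = 5  # assumed from the 99999-records-per-year limit


def encode(instance: int, year: int, counter: int) -> int:
    """Build a key laid out as <instance><2-digit year><5-digit counter>."""
    if instance < 0 or not 0 <= year < 10 ** YEAR_DIGITS:
        raise ValueError("instance or year out of range")
    if not 0 <= counter < 10 ** COUNTER_DIGITS:
        # This is the case the question asks about: record 100000 in a year.
        raise OverflowError("counter does not fit in the reserved digits")
    return (instance * 10 ** YEAR_DIGITS + year) * 10 ** COUNTER_DIGITS + counter


def decode(key: int) -> tuple[int, int, int]:
    """Split a key back into (instance, year, counter)."""
    rest, counter = divmod(key, 10 ** COUNTER_DIGITS)
    instance, year = divmod(rest, 10 ** YEAR_DIGITS)
    return instance, year, counter


key = encode(instance=5, year=11, counter=22222)
```

In this sketch, going past 99999 records in a year raises an error rather than silently spilling into the year digits, which is one way to make the calculated risk explicit.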

So, finally, the question: is there a way to create a primary key system for a distributed application that allows for sorting and does not enforce limits (besides the size of the data type itself, e.g. max integer)?

Here's a little more info on the app I plan to write. I want to create something that will let me store just about any type of information I come across, and the system will include the ability to tag entries so I can search by topic. The types of information I see so far are recommendations for books, DVDs, websites etc., or maybe local tips for the place I'm living. The overall idea is to stop keeping these bits of information spread across multiple computers, laptops and phones in different formats.

Recommended answer

In broad terms, there are two approaches.

  1. You use sequential values. These may be divided up into groups, interleaved, whatever. They are the most efficient approach, but they require collaboration and coordination.

  2. You use random values (this includes UUIDs). These are much simpler but require more space. From "birthday collisions" we know that if you need to store N values, then a random key must be chosen from a range of (more than) N*N - http://en.wikipedia.org/wiki/Birthday_problem. Working backwards, a 64-bit integer can hold about 32 bits of data if used as a random key, which is about 4 billion values. But that is for a collision probability of around 50%. You want a much lower probability, so a practical limit is around 10 million entries.

So, in simple terms, if you have a 64-bit key, a random approach works for around 10 million entries and a sequential approach for many more. In either case, that is probably more than you need.

If you have a 32-bit key, then a random approach works for around a thousand values (a sequential approach goes to about 4 billion, as above).
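The figures above can be checked with the standard birthday-problem approximation, p ≈ 1 - exp(-n(n-1) / (2 * 2^bits)), for n random keys drawn uniformly from a space of 2^bits values. The function name below is hypothetical, and the approximation is only accurate while n is far smaller than the key space.

```python
import math


def collision_probability(n: int, bits: int) -> float:
    """Approximate chance that n uniformly random keys of the given
    bit width contain at least one duplicate (birthday approximation)."""
    space = 2 ** bits
    # expm1 keeps precision when the probability is very small.
    return -math.expm1(-n * (n - 1) / (2 * space))


p_64_full = collision_probability(2 ** 32, 64)     # about 4 billion keys
p_64_10m = collision_probability(10_000_000, 64)   # 10 million keys
p_32_1k = collision_probability(1_000, 32)         # a thousand keys
```

Under this approximation, 4 billion random 64-bit keys collide with a probability of roughly 0.39, 10 million keys with a probability of a few in a million, and a thousand random 32-bit keys with a probability of roughly one in ten thousand, in line with the orders of magnitude quoted in the answer.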

Obviously, if you have a text value then you need to modify this accordingly, but UUIDs are designed to have "enough" values anyway: http://en.wikipedia.org/wiki/Universally_unique_identifier

Typically a database will provide a sequential ID, and that is all you need. If not, the 64-bit random approach is usually simplest and worth the extra space.
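A minimal sketch of the random approach with SQLite follows, using the Items table from the question. It is an illustration under stated assumptions rather than a definitive implementation: keys are drawn from 63 bits instead of 64 so they stay non-negative in SQLite's signed 64-bit INTEGER, and the helper names `new_random_key` and `insert_item` are hypothetical.

```python
import secrets
import sqlite3


def new_random_key() -> int:
    """Return a random non-negative key that fits a signed 64-bit INTEGER."""
    return secrets.randbits(63)


def insert_item(conn: sqlite3.Connection, name: str, content: str,
                max_attempts: int = 5) -> int:
    """Insert a row under a random key, retrying on the rare local collision."""
    for _ in range(max_attempts):
        key = new_random_key()
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO Items (ItemID, Name, Content) VALUES (?, ?, ?)",
                    (key, name, content),
                )
            return key
        except sqlite3.IntegrityError:
            continue  # key already present in this database; draw another
    raise RuntimeError("could not find a free key")


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Items (ItemID INTEGER PRIMARY KEY, Name TEXT, Content TEXT)"
)
key = insert_item(conn, "Example", "Some content")
row_count = conn.execute("SELECT COUNT(*) FROM Items").fetchone()[0]
```

The retry only protects against collisions within one database. Two offline databases could still draw the same key independently, which is the case the birthday estimate covers, so the sync step would need to detect it.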
