什么是快速的数据结构来实现一个对存储,这里指定? [英] What is a fast data-structure to implement a pair storage, as specified here?
问题描述
我正在尝试实现一对值对的存储。它必须暴露一个简单的API: newPair,get,remove,isMember
。值可以由 4位int
或指向另一对的指针组成。所以,例如:
指针a = API.newPair(13,12);
指针b = API.newPair(a,7);
指针c = API.newPair(2,b);
指针d = API.newPair(a,7);
等等。只有一个警告:一对绝对不能存储两次。如果您尝试添加已存在的对,则只返回现有指针。所以,例如,上面的代码, b == d
。
我的天真的实现涉及一堆对。每次添加一个新对,我只是迭代堆栈,如果我在那里找到了对,我返回了它的位置。如果没有,我添加了一个新的元素到堆栈并返回堆栈大小。现在我需要使它尽可能快。我目前的想法是使用哈希表而不是堆栈。
所以,问题是:实现这一点的快速方法是什么?
由可以使用
x> = y? x * x + x + y:x + y * y;其中x,y> = 0
通过沿着一个边缘的边缘分配值来工作。 / p>
代码
int cantor_pairing(int k1,int k2){
返回0.5 *(k1 + k2)*(k1 + k2 + 1)+ k2;
}
int szuzik_pairing(int x,int y){
if(x <0 || y <0)return -1;
return x> = y? x * x + x + y:x + y * y;
}
即使你的整数是联合统一分配,你不能只是使用映射
(x,y) - > x
ie完全忽视了他们的第二个。这是因为在一个平方NxN上,f(x,y)将为1 / N ^ 2,但边际分布f_y(x)= 1 / N,这将是碰撞的概率。
毕竟,最可能的情况并非如此,因为用户会更频繁地选择小数字,然后选择较小的数字。
I am trying to implement a storage of pairs of values. It must expose a simple API: newPair, get, remove, isMember
. A value can consist of either a 4-bit int
or a pointer to another pair. So, for example:
Pointer a = API.newPair(13,12);
Pointer b = API.newPair(a,7);
Pointer c = API.newPair(2,b);
Pointer d = API.newPair(a,7);
And so on. There is just one caveat, though: a pair must never be stored twice. If you try to add a pair that already exists, it just returns the existing pointer. So, for example, on the code above, b == d
.
My naive implementation involved a stack of pairs. Each time a new pair was added, I just iterated the stack and, if I found the pair there, I returned its position. If not, I added a new element to the stack and returned the stack size. Now I need to make it as fast as possible. My current idea is to just use a hash-table instead of a stack.
So, the question is: what is a fast way to implement this?
A bijection NxN -> N given by Cantor Pairing Function
can be used if N is an infinite set. It works by consecutive assigning elements of NxN to N this way:
When x,y are non negative integers the mapping introduced in ElegantPairing may be used
x >= y ? x * x + x + y : x + y * y; where x, y >= 0
which works by assigning values along the edges of a squere.
The code
int cantor_pairing( int k1, int k2) {
return 0.5 * ( k1 + k2) * ( k1 + k2 + 1) + k2;
}
int szuzik_pairing( int x, int y) {
if( x < 0 || y < 0) return -1;
return x >= y ? x * x + x + y : x + y * y;
}
Even if your integers were joint uniformly distributed you couldn't have been used just the mapping
(x,y) -> x
i.e. completely ignoring the second of them. This is because on a square NxN f(x,y) would be 1/N^2 but marginal distribution f_y(x) = 1/N and this would be a probability of a collision. After all, most probably this is not the case, because user will more frequently choose small numbers then large.
这篇关于什么是快速的数据结构来实现一个对存储,这里指定?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!