.net集合,用于快速插入/删除 [英] .net collection for fast insert/delete
问题描述
我需要维护一个连接客户端的名单,这些客户端的寿命很短,并且经常上下波动。由于潜在的客户数量,我需要一个支持快速插入/删除的集合。
I need to maintain a roster of connected clients that are very shortlived and frequently go up and down. Due to the potential number of clients I need a collection that supports fast insert/delete. Suggestions?
推荐答案
C5通用集合库
我最好的实现在C#和C ++中找到的是这些-对于C#/ CLI:
C5 Generic Collection Library
The best implementations I have found in C# and C++ are these -- for C#/CLI:
- http://www.itu.dk/research/c5/Release1.1/ITU-TR-2006-76.pdf
- http://www.itu.dk/research/c5/
经过充分研究,具有可扩展的单元测试,并且自2月以来,它们在.Net中实现了通用接口,这使得使用集合变得更加容易。它们在 Channel9 ,他们已经对集合进行了广泛的性能测试。
It's well researched, has extensible unit tests, and since February they also have implemented the common interfaces in .Net which makes it a lot easier to work with the collections. They were featured on Channel9 and they've done extensive performance testing on the collections.
如果您仍在使用数据结构,则这些研究人员的 red-black-tree 实现,类似于您启动Lütz反射器并拥有看看System.Data的内部结构:p。插入复杂度:O(log(n))。
If you are using data-structures anyway these researchers have a red-black-tree implementation in their library, similar to what you find if you fire up Lütz reflector and have a look in System.Data's internal structures :p. Insert-complexity: O(log(n)).
然后,如果您可以允许进行某些C ++互操作,而您绝对需要速度和需求Dmitriy V'jukov提供的这些无锁ADT尽可能地少了开销,因此可能是当今世界上最好的,优于英特尔并发的ADT库。
Then, if you can allow for some C++ interop and you absolutely need the speed and want as little overhead as possible, then these lock-free ADTs from Dmitriy V'jukov are probably the best you can get in this world, outperforming Intel's concurrent library of ADTs.
- http://groups.google.com/group/lock-free
我已经阅读了一些代码,真正精通这些东西如何组合的人的品格。 VC ++可以进行本机C ++互操作而不会产生烦人的边界。 http://www.swig.org/ 可以帮助您包装C ++接口以供使用在.Net中,或者您也可以通过P / Invoke自己进行操作。
I've read some of the code and it's really the makings of someone well versed in how these things are put together. VC++ can do native C++ interop without annoying boundaries. http://www.swig.org/ can otherwise help you wrap C++ interfaces for consumption in .Net, or you can do it yourself through P/Invoke.
教程,在C#中实现了一个不太粗略的跳过列表,并且讨论其他类型的数据结构。 (在CodeProject中有一个更好的 SkipList ,该代码非常优美,并在他们也有一些与.Net捆绑在一起的数据结构,即 HashTable / Dictionary<,> 和 HashSet 。当然,还有 ResizeArray / List类型以及堆栈和队列,但它们在搜索时都是线性的。
They have written tutorials, this one implementing a rather unpolished skip-list in C#, and discussing other types of data-structures. (There's a better SkipList at CodeProject, which is very polished and implement the interfaces in a well-behaved manner.) They also have a few data-structures bundled with .Net, namely the HashTable/Dictionary<,> and HashSet. Of course there's the "ResizeArray"/List type as well together with a stack and queue, but they are all "linear" on search.
如果您希望加快内存分配的时间,则可以使用Google的性能工具。它们在Google代码中可用,并且包含非常有趣的多线程malloc实现(TCMalloc) 显示的时序要比普通的malloc更一致。您可以将其与上面的无锁结构结合使用,以真正提高性能。
If you wish to speed up the time it takes for memory-allocation you can use google's perf-tools. They are available at google code and they contain a very interesting multi-threaded malloc-implementation (TCMalloc) which shows much more consistent timing than the normal malloc does. You could use this together with the lock-free structures above to really go crazy with performance.
您还可以在功能上使用记忆,以通过缓存提高性能,如果您正在使用例如有趣的东西,例如 F#。 F#还允许C ++互操作,所以您就可以了。
You can also use memoization on functions to improve performance through caching, something interesting if you're using e.g. F#. F# also allows C++ interop, so you're OK there.
使用在 bloom-filters 上完成的研究,可以自己做某事的可能性O(k)查找复杂度,其中k是一个常数,取决于您已实现的哈希函数的数量。谷歌的BigTable就是这样实现的。这些过滤器将为您提供元素(如果它在集合中,或者可能具有非常低的相似性),这不是您要查找的元素(请参阅Wikipedia上的图-它接近P(wrong_key)-> 0.01作为大小大约有10000个元素,但是您可以通过实现进一步的散列函数/减少集合来解决这个问题。
There's also the possibility of doing something on your own using the research which has been done on bloom-filters, which allow O(k) lookup complexity where k is a constant that depends on the number of hash-functions you have implemented. This is how google's BigTable has been implemented. These filter will get you the element if it's in the set or possibly with a very low likeliness an element which is not the one you're looking for (see the graph at wikipedia -- it's approaching P(wrong_key) -> 0.01 as size is around 10000 elements, but you can go around this by implementing further hash-functions/reducing the set.
我没有在搜索.Net实现,但是由于哈希计算是独立的,因此您可以使用 MS的性能团队执行任务加快速度。
I haven't searched for .Net implementations of this, but since the hashing calculations are independent you can use MS's performance team's implementation of Tasks to speed that up.
碰巧的是,我只是做了一个涉及数据结构的课程,在这种情况下我们使用了C ++,但是很容易转换为C#,我们构建了三种不同的数据结构;一个Bloom过滤器,一个跳过列表。和随机二进制搜索树。
As it happens I just did a coursework involving data-structures. In this case we used C++, but it's very easy to translate to C#. We built three different data-structures; a bloom-filter, a skip-list and random binary search tree.
请参见最后一段之后的代码和分析。
See the code and analysis after the last paragraph.
最后,要使我的回答完整,如果您确实需要速度,则可以使用路由表或内容可寻址内存。这样一来,您就可以非常快速地从原则上以O(1)的形式对数据进行哈希到值的查找。
Finally, to make my answer "complete", if you truly need speed you can use something like Routing-tables or Content-addressable memory . This allows you to very quickly O(1) in principle get a "hash"-to-value lookup of your data.
如果您发现其中的错误,我将非常感谢您提供反馈代码,或者只是关于如何更好(或更好地使用模板)的指针。请注意,Bloom过滤器并不像现实生活中那样。通常,您不必从中删除它,然后它比我允许测试删除的黑客程序要有效得多。
I would really appreciate feedback if you find mistakes in the code, or just pointers on how I can do it better (or with better usage of templates). Note that the bloom filter isn't like it would be in real life; normally you don't have to be able to delete from it and then it much much more space efficient than the hack I did to allow the delete to be tested.
DataStructure.h
#ifndef DATASTRUCTURE_H_
#define DATASTRUCTURE_H_
class DataStructure
{
public:
DataStructure() {countAdd=0; countDelete=0;countFind=0;}
virtual ~DataStructure() {}
void resetCountAdd() {countAdd=0;}
void resetCountFind() {countFind=0;}
void resetCountDelete() {countDelete=0;}
unsigned int getCountAdd(){return countAdd;}
unsigned int getCountDelete(){return countDelete;}
unsigned int getCountFind(){return countFind;}
protected:
unsigned int countAdd;
unsigned int countDelete;
unsigned int countFind;
};
#endif /*DATASTRUCTURE_H_*/
Key.h
#ifndef KEY_H_
#define KEY_H_
#include <string>
using namespace std;
const int keyLength = 128;
class Key : public string
{
public:
Key():string(keyLength, ' ') {}
Key(const char in[]): string(in){}
Key(const string& in): string(in){}
bool operator<(const string& other);
bool operator>(const string& other);
bool operator==(const string& other);
virtual ~Key() {}
};
#endif /*KEY_H_*/
Key.cpp
#include "Key.h"
bool Key::operator<(const string& other)
{
return compare(other) < 0;
};
bool Key::operator>(const string& other)
{
return compare(other) > 0;
};
bool Key::operator==(const string& other)
{
return compare(other) == 0;
}
BloomFilter.h
#ifndef BLOOMFILTER_H_
#define BLOOMFILTER_H_
#include <iostream>
#include <assert.h>
#include <vector>
#include <math.h>
#include "Key.h"
#include "DataStructure.h"
#define LONG_BIT 32
#define bitmask(val) (unsigned long)(1 << (LONG_BIT - (val % LONG_BIT) - 1))
// TODO: Implement RW-locking on the reads/writes to the bitmap.
class BloomFilter : public DataStructure
{
public:
BloomFilter(){}
BloomFilter(unsigned long length){init(length);}
virtual ~BloomFilter(){}
void init(unsigned long length);
void dump();
void add(const Key& key);
void del(const Key& key);
/**
* Returns true if the key IS BELIEVED to exist, false if it absolutely doesn't.
*/
bool testExist(const Key& key, bool v = false);
private:
unsigned long hash1(const Key& key);
unsigned long hash2(const Key& key);
bool exist(const Key& key);
void getHashAndIndicies(unsigned long& h1, unsigned long& h2, int& i1, int& i2, const Key& key);
void getCountIndicies(const int i1, const unsigned long h1,
const int i2, const unsigned long h2, int& i1_c, int& i2_c);
vector<unsigned long> m_tickBook;
vector<unsigned int> m_useCounts;
unsigned long m_length; // number of bits in the bloom filter
unsigned long m_pockets; //the number of pockets
static const unsigned long m_pocketSize; //bits in each pocket
};
#endif /*BLOOMFILTER_H_*/
BloomFilter.cpp
#include "BloomFilter.h"
const unsigned long BloomFilter::m_pocketSize = LONG_BIT;
void BloomFilter::init(unsigned long length)
{
//m_length = length;
m_length = (unsigned long)((2.0*length)/log(2))+1;
m_pockets = (unsigned long)(ceil(double(m_length)/m_pocketSize));
m_tickBook.resize(m_pockets);
// my own (allocate nr bits possible to store in the other vector)
m_useCounts.resize(m_pockets * m_pocketSize);
unsigned long i; for(i=0; i< m_pockets; i++) m_tickBook[i] = 0;
for (i = 0; i < m_useCounts.size(); i++) m_useCounts[i] = 0; // my own
}
unsigned long BloomFilter::hash1(const Key& key)
{
unsigned long hash = 5381;
unsigned int i=0; for (i=0; i< key.length(); i++){
hash = ((hash << 5) + hash) + key.c_str()[i]; /* hash * 33 + c */
}
double d_hash = (double) hash;
d_hash *= (0.5*(sqrt(5)-1));
d_hash -= floor(d_hash);
d_hash *= (double)m_length;
return (unsigned long)floor(d_hash);
}
unsigned long BloomFilter::hash2(const Key& key)
{
unsigned long hash = 0;
unsigned int i=0; for (i=0; i< key.length(); i++){
hash = key.c_str()[i] + (hash << 6) + (hash << 16) - hash;
}
double d_hash = (double) hash;
d_hash *= (0.5*(sqrt(5)-1));
d_hash -= floor(d_hash);
d_hash *= (double)m_length;
return (unsigned long)floor(d_hash);
}
bool BloomFilter::testExist(const Key& key, bool v){
if(exist(key)) {
if(v) cout<<"Key "<< key<<" is in the set"<<endl;
return true;
}else {
if(v) cout<<"Key "<< key<<" is not in the set"<<endl;
return false;
}
}
void BloomFilter::dump()
{
cout<<m_pockets<<" Pockets: ";
// I changed u to %p because I wanted it printed in hex.
unsigned long i; for(i=0; i< m_pockets; i++) printf("%p ", (void*)m_tickBook[i]);
cout<<endl;
}
void BloomFilter::add(const Key& key)
{
unsigned long h1, h2;
int i1, i2;
int i1_c, i2_c;
// tested!
getHashAndIndicies(h1, h2, i1, i2, key);
getCountIndicies(i1, h1, i2, h2, i1_c, i2_c);
m_tickBook[i1] = m_tickBook[i1] | bitmask(h1);
m_tickBook[i2] = m_tickBook[i2] | bitmask(h2);
m_useCounts[i1_c] = m_useCounts[i1_c] + 1;
m_useCounts[i2_c] = m_useCounts[i2_c] + 1;
countAdd++;
}
void BloomFilter::del(const Key& key)
{
unsigned long h1, h2;
int i1, i2;
int i1_c, i2_c;
if (!exist(key)) throw "You can't delete keys which are not in the bloom filter!";
// First we need the indicies into m_tickBook and the
// hashes.
getHashAndIndicies(h1, h2, i1, i2, key);
// The index of the counter is the index into the bitvector
// times the number of bits per vector item plus the offset into
// that same vector item.
getCountIndicies(i1, h1, i2, h2, i1_c, i2_c);
// We need to update the value in the bitvector in order to
// delete the key.
m_useCounts[i1_c] = (m_useCounts[i1_c] == 1 ? 0 : m_useCounts[i1_c] - 1);
m_useCounts[i2_c] = (m_useCounts[i2_c] == 1 ? 0 : m_useCounts[i2_c] - 1);
// Now, if we depleted the count for a specific bit, then set it to
// zero, by anding the complete unsigned long with the notted bitmask
// of the hash value
if (m_useCounts[i1_c] == 0)
m_tickBook[i1] = m_tickBook[i1] & ~(bitmask(h1));
if (m_useCounts[i2_c] == 0)
m_tickBook[i2] = m_tickBook[i2] & ~(bitmask(h2));
countDelete++;
}
bool BloomFilter::exist(const Key& key)
{
unsigned long h1, h2;
int i1, i2;
countFind++;
getHashAndIndicies(h1, h2, i1, i2, key);
return ((m_tickBook[i1] & bitmask(h1)) > 0) &&
((m_tickBook[i2] & bitmask(h2)) > 0);
}
/*
* Gets the values of the indicies for two hashes and places them in
* the passed parameters. The index is into m_tickBook.
*/
void BloomFilter::getHashAndIndicies(unsigned long& h1, unsigned long& h2, int& i1,
int& i2, const Key& key)
{
h1 = hash1(key);
h2 = hash2(key);
i1 = (int) h1/m_pocketSize;
i2 = (int) h2/m_pocketSize;
}
/*
* Gets the values of the indicies into the count vector, which keeps
* track of how many times a specific bit-position has been used.
*/
void BloomFilter::getCountIndicies(const int i1, const unsigned long h1,
const int i2, const unsigned long h2, int& i1_c, int& i2_c)
{
i1_c = i1*m_pocketSize + h1%m_pocketSize;
i2_c = i2*m_pocketSize + h2%m_pocketSize;
}
** RBST.h **
** RBST.h **
#ifndef RBST_H_
#define RBST_H_
#include <iostream>
#include <assert.h>
#include <vector>
#include <math.h>
#include "Key.h"
#include "DataStructure.h"
#define BUG(str) printf("%s:%d FAILED SIZE INVARIANT: %s\n", __FILE__, __LINE__, str);
using namespace std;
class RBSTNode;
class RBSTNode: public Key
{
public:
RBSTNode(const Key& key):Key(key)
{
m_left =NULL;
m_right = NULL;
m_size = 1U; // the size of one node is 1.
}
virtual ~RBSTNode(){}
string setKey(const Key& key){return Key(key);}
RBSTNode* left(){return m_left; }
RBSTNode* right(){return m_right;}
RBSTNode* setLeft(RBSTNode* left) { m_left = left; return this; }
RBSTNode* setRight(RBSTNode* right) { m_right =right; return this; }
#ifdef DEBUG
ostream& print(ostream& out)
{
out << "Key(" << *this << ", m_size: " << m_size << ")";
return out;
}
#endif
unsigned int size() { return m_size; }
void setSize(unsigned int val)
{
#ifdef DEBUG
this->print(cout);
cout << "::setSize(" << val << ") called." << endl;
#endif
if (val == 0) throw "Cannot set the size below 1, then just delete this node.";
m_size = val;
}
void incSize() {
#ifdef DEBUG
this->print(cout);
cout << "::incSize() called" << endl;
#endif
m_size++;
}
void decrSize()
{
#ifdef DEBUG
this->print(cout);
cout << "::decrSize() called" << endl;
#endif
if (m_size == 1) throw "Cannot decrement size below 1, then just delete this node.";
m_size--;
}
#ifdef DEBUG
unsigned int size(RBSTNode* x);
#endif
private:
RBSTNode(){}
RBSTNode* m_left;
RBSTNode* m_right;
unsigned int m_size;
};
class RBST : public DataStructure
{
public:
RBST() {
m_size = 0;
m_head = NULL;
srand(time(0));
};
virtual ~RBST() {};
/**
* Tries to add key into the tree and will return
* true for a new item added
* false if the key already is in the tree.
*
* Will also have the side-effect of printing to the console if v=true.
*/
bool add(const Key& key, bool v=false);
/**
* Same semantics as other add function, but takes a string,
* but diff name, because that'll cause an ambiguity because of inheritance.
*/
bool addString(const string& key);
/**
* Deletes a key from the tree if that key is in the tree.
* Will return
* true for success and
* false for failure.
*
* Will also have the side-effect of printing to the console if v=true.
*/
bool del(const Key& key, bool v=false);
/**
* Tries to find the key in the tree and will return
* true if the key is in the tree and
* false if the key is not.
*
* Will also have the side-effect of printing to the console if v=true.
*/
bool find(const Key& key, bool v = false);
unsigned int count() { return m_size; }
#ifdef DEBUG
int dump(char sep = ' ');
int dump(RBSTNode* target, char sep);
unsigned int size(RBSTNode* x);
#endif
private:
RBSTNode* randomAdd(RBSTNode* target, const Key& key);
RBSTNode* addRoot(RBSTNode* target, const Key& key);
RBSTNode* rightRotate(RBSTNode* target);
RBSTNode* leftRotate(RBSTNode* target);
RBSTNode* del(RBSTNode* target, const Key& key);
RBSTNode* join(RBSTNode* left, RBSTNode* right);
RBSTNode* find(RBSTNode* target, const Key& key);
RBSTNode* m_head;
unsigned int m_size;
};
#endif /*RBST_H_*/
** RBST.cpp * *
** RBST.cpp **
#include "RBST.h"
bool RBST::add(const Key& key, bool v){
unsigned int oldSize = m_size;
m_head = randomAdd(m_head, key);
if (m_size > oldSize){
if(v) cout<<"Node "<<key<< " is added into the tree."<<endl;
return true;
}else {
if(v) cout<<"Node "<<key<< " is already in the tree."<<endl;
return false;
}
if(v) cout<<endl;
};
bool RBST::addString(const string& key) {
return add(Key(key), false);
}
bool RBST::del(const Key& key, bool v){
unsigned oldSize= m_size;
m_head = del(m_head, key);
if (m_size < oldSize) {
if(v) cout<<"Node "<<key<< " is deleted from the tree."<<endl;
return true;
}
else {
if(v) cout<< "Node "<<key<< " is not in the tree."<<endl;
return false;
}
};
bool RBST::find(const Key& key, bool v){
RBSTNode* ret = find(m_head, key);
if (ret == NULL){
if(v) cout<< "Node "<<key<< " is not in the tree."<<endl;
return false;
}else {
if(v) cout<<"Node "<<key<< " is in the tree."<<endl;
return true;
}
};
#ifdef DEBUG
int RBST::dump(char sep){
int ret = dump(m_head, sep);
cout<<"SIZE: " <<ret<<endl;
return ret;
};
int RBST::dump(RBSTNode* target, char sep){
if (target == NULL) return 0;
int ret = dump(target->left(), sep);
cout<< *target<<sep;
ret ++;
ret += dump(target->right(), sep);
return ret;
};
#endif
/**
* Rotates the tree around target, so that target's left
* is the new root of the tree/subtree and updates the subtree sizes.
*
*(target) b (l) a
* / \ right / \
* a ? ----> ? b
* / \ / \
* ? x x ?
*
*/
RBSTNode* RBST::rightRotate(RBSTNode* target) // private
{
if (target == NULL) throw "Invariant failure, target is null"; // Note: may be removed once tested.
if (target->left() == NULL) throw "You cannot rotate right around a target whose left node is NULL!";
#ifdef DEBUG
cout <<"Right-rotating b-node ";
target->print(cout);
cout << " for a-node ";
target->left()->print(cout);
cout << "." << endl;
#endif
RBSTNode* l = target->left();
int as0 = l->size();
// re-order the sizes
l->setSize( l->size() + (target->right() == NULL ? 0 : target->right()->size()) + 1); // a.size += b.right.size + 1; where b.right may be null.
target->setSize( target->size() -as0 + (l->right() == NULL ? 0 : l->right()->size()) ); // b.size += -a_0_size + x.size where x may be null.
// swap b's left (for a)
target->setLeft(l->right());
// and a's right (for b's left)
l->setRight(target);
#ifdef DEBUG
cout << "A-node size: " << l->size() << ", b-node size: " << target->size() << "." << endl;
#endif
// return the new root, a.
return l;
};
/**
* Like rightRotate, but the other way. See docs for rightRotate(RBSTNode*)
*/
RBSTNode* RBST::leftRotate(RBSTNode* target)
{
if (target == NULL) throw "Invariant failure, target is null";
if (target->right() == NULL) throw "You cannot rotate left around a target whose right node is NULL!";
#ifdef DEBUG
cout <<"Left-rotating a-node ";
target->print(cout);
cout << " for b-node ";
target->right()->print(cout);
cout << "." << endl;
#endif
RBSTNode* r = target->right();
int bs0 = r->size();
// re-roder the sizes
r->setSize(r->size() + (target->left() == NULL ? 0 : target->left()->size()) + 1);
target->setSize(target->size() -bs0 + (r->left() == NULL ? 0 : r->left()->size()));
// swap a's right (for b's left)
target->setRight(r->left());
// swap b's left (for a)
r->setLeft(target);
#ifdef DEBUG
cout << "Left-rotation done: a-node size: " << target->size() << ", b-node size: " << r->size() << "." << endl;
#endif
return r;
};
//
/**
* Adds a key to the tree and returns the new root of the tree.
* If the key already exists doesn't add anything.
* Increments m_size if the key didn't already exist and hence was added.
*
* This function is not called from public methods, it's a helper function.
*/
RBSTNode* RBST::addRoot(RBSTNode* target, const Key& key)
{
countAdd++;
if (target == NULL) return new RBSTNode(key);
#ifdef DEBUG
cout << "addRoot(";
cout.flush();
target->print(cout) << "," << key << ") called." << endl;
#endif
if (*target < key)
{
target->setRight( addRoot(target->right(), key) );
target->incSize(); // Should I?
RBSTNode* res = leftRotate(target);
#ifdef DEBUG
if (target->size() != size(target))
BUG("in addRoot 1");
#endif
return res;
}
target->setLeft( addRoot(target->left(), key) );
target->incSize(); // Should I?
RBSTNode* res = rightRotate(target);
#ifdef DEBUG
if (target->size() != size(target))
BUG("in addRoot 2");
#endif
return res;
};
/**
* This function is called from the public add(key) function,
* and returns the new root node.
*/
RBSTNode* RBST::randomAdd(RBSTNode* target, const Key& key)
{
countAdd++;
if (target == NULL)
{
m_size++;
return new RBSTNode(key);
}
#ifdef DEBUG
cout << "randomAdd(";
target->print(cout) << ", \"" << key << "\") called." << endl;
#endif
int r = (rand() % target->size()) + 1;
// here is where we add the target as root!
if (r == 1)
{
m_size++; // TODO: Need to lock.
return addRoot(target, key);
}
#ifdef DEBUG
printf("randomAdd recursion part, ");
#endif
// otherwise, continue recursing!
if (*target <= key)
{
#ifdef DEBUG
printf("target <= key\n");
#endif
target->setRight( randomAdd(target->right(), key) );
target->incSize(); // TODO: Need to lock.
#ifdef DEBUG
if (target->right()->size() != size(target->right()))
BUG("in randomAdd 1");
#endif
}
else
{
#ifdef DEBUG
printf("target > key\n");
#endif
target->setLeft( randomAdd(target->left(), key) );
target->incSize(); // TODO: Need to lock.
#ifdef DEBUG
if (target->left()->size() != size(target->left()))
BUG("in randomAdd 2");
#endif
}
#ifdef DEBUG
printf("randomAdd return part\n");
#endif
m_size++; // TODO: Need to lock.
return target;
};
/////////////////////////////////////////////////////////////
///////////////////// DEL FUNCTIONS ////////////////////////
/////////////////////////////////////////////////////////////
/**
* Deletes a node with the passed key.
* Returns the root node.
* Decrements m_size if something was deleted.
*/
RBSTNode* RBST::del(RBSTNode* target, const Key& key)
{
countDelete++;
if (target == NULL) return NULL;
#ifdef DEBUG
cout << "del(";
target->print(cout) << ", \"" << key << "\") called." << endl;
#endif
RBSTNode* ret = NULL;
// found the node to delete
if (*target == key)
{
ret = join(target->left(), target->right());
m_size--;
delete target;
return ret; // return the newly built joined subtree!
}
// store a temporary size before recursive deletion.
unsigned int size = m_size;
if (*target < key) target->setRight( del(target->right(), key) );
else target->setLeft( del(target->left(), key) );
// if the previous recursion changed the size, we need to decrement the size of this target too.
if (m_size < size) target->decrSize();
#ifdef DEBUG
if (RBST::size(target) != target->size())
BUG("in del");
#endif
return target;
};
/**
* Joins the two subtrees represented by left and right
* by randomly choosing which to make the root, weighted on the
* size of the sub-tree.
*/
RBSTNode* RBST::join(RBSTNode* left, RBSTNode* right)
{
if (left == NULL) return right;
if (right == NULL) return left;
#ifdef DEBUG
cout << "join(";
left->print(cout);
cout << ",";
right->print(cout) << ") called." << endl;
#endif
// Find the chance that we use the left tree, based on its size over the total tree size.
// 3 s.d. randomness :-p e.g. 60.3% chance.
bool useLeft = ((rand()%1000) < (signed)((float)left->size()/(float)(left->size() + right->size()) * 1000.0));
RBSTNode* subtree = NULL;
if (useLeft)
{
subtree = join(left->right(), right);
left->setRight(subtree)
->setSize((left->left() == NULL ? 0 : left->left()->size())
+ subtree->size() + 1 );
#ifdef DEBUG
if (size(left) != left->size())
BUG("in join 1");
#endif
return left;
}
subtree = join(right->left(), left);
right->setLeft(subtree)
->setSize((right->right() == NULL ? 0 : right->right()->size())
+ subtree->size() + 1);
#ifdef DEBUG
if (size(right) != right->size())
BUG("in join 2");
#endif
return right;
};
/////////////////////////////////////////////////////////////
///////////////////// FIND FUNCTIONS ///////////////////////
/////////////////////////////////////////////////////////////
/**
* Tries to find the key in the tree starting
* search from target.
*
* Returns NULL if it was not found.
*/
RBSTNode* RBST::find(RBSTNode* target, const Key& key)
{
countFind++; // Could use private method only counting the first call.
if (target == NULL) return NULL; // not found.
if (*target == key) return target; // found (does string override ==?)
if (*target < key) return find(target->right(), key); // search for gt to the right.
return find(target->left(), key); // search for lt to the left.
};
#ifdef DEBUG
unsigned int RBST::size(RBSTNode* x)
{
if (x == NULL) return 0;
return 1 + size(x->left()) + size(x->right());
}
#endif
I’ll save the SkipList for another time since it’s already possible to find good implementations of a SkipList from the links and my version wasn’t much different.
I'll save the SkipList for another time since it's already possible to find good implementations of a SkipList from the links and my version wasn't much different.
The graphs generated from the test-file are as follows:
The graphs generated from the test-file are as follows:
Graph showing time taken to add new items for BloomFilter, RBST and SkipList.
graph http://haf.se/content/dl/addtimer.png
Graph showing time taken to find items for BloomFilter, RBST and SkipList
graph http://haf.se/content/dl/findtimer.png
Graph showing time taken to delete items for BloomFilter, RBST and SkipList
graph http://haf.se/content/dl/deltimer.png
So as you can see, the random binary search tree was rather a lot better than the SkipList. The bloom filter lives up to its O(k).
So as you can see, the random binary search tree was rather a lot better than the SkipList. The bloom filter lives up to its O(k).
这篇关于.net集合,用于快速插入/删除的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!