Fastest way to perform subset test operation on a large collection of sets with same domain


Question


Assume we have trillions of sets stored somewhere. The domain for each of these sets is the same, and it is finite and discrete. So each set may be stored as a bit field (e.g. 0000100111...) of a relatively short length (e.g. 1024 bits). That is, bit X in the bit field indicates whether item X (of 1024 possible items) is included in the given set or not.


Now, I want to devise a storage structure and an algorithm to efficiently answer the query: which sets in the data store have set Y as a subset? Set Y itself is not present in the data store and is specified at run time.


Now the simplest way to solve this would be to AND the bitfield for set Y with bit fields of every set in the data store one by one, picking the ones whose AND result matches Y's bitfield.
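The linear scan described above can be sketched as follows. This is only an illustration: it assumes each bit field is held as a Python integer, and the names `is_superset` and `scan_supersets` are made up for the example.

```python
def is_superset(stored_set: int, y: int) -> bool:
    """True if every bit set in y is also set in stored_set (Y is a subset)."""
    return stored_set & y == y


def scan_supersets(stored_sets, y: int):
    """Baseline: AND Y against every stored bit field, one by one."""
    return [s for s in stored_sets if is_superset(s, y)]


# Tiny example with a 4-item domain instead of 1024.
stored = [0b1011, 0b0011, 0b1100, 0b1111]
matches = scan_supersets(stored, 0b0011)
```

Every query costs one AND and one comparison per stored set, which is the cost the question is trying to avoid.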


How can I speed this up? Is there a tree structure (index) or some smart algorithm that would allow me to perform this query without having to AND every stored set's bitfield?


Are there databases that already support such operations on large collections of sets?

Recommended answer


If you can preprocess the sets, the subset relation is representable as a DAG (because you're describing a poset). If the transitive reduction is computed, then I think you can avoid testing all the sets by just performing a DFS starting from the biggest sets and stopping whenever Y is no longer a subset of the current set being visited.
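A minimal sketch of that idea, under several assumptions not stated in the answer: bit fields are Python integers, duplicate stored sets are collapsed into one node, and the transitive reduction is built with a naive pairwise comparison that would not scale to trillions of sets. The names `build_hasse` and `query_supersets` are hypothetical.

```python
def build_hasse(stored_sets):
    """Preprocess: transitive reduction of the superset relation.

    Returns (roots, children): roots are the maximal sets, and
    children[t] lists the stored sets that are immediate subsets of t.
    Naive construction for illustration only.
    """
    nodes = list(dict.fromkeys(stored_sets))  # drop duplicates, keep order
    children = {s: [] for s in nodes}
    roots = []
    for s in nodes:
        # All stored proper supersets of s.
        supers = [t for t in nodes if t != s and s & t == s]
        if not supers:
            roots.append(s)  # nothing contains s, so it is maximal
            continue
        for t in supers:
            # Keep the edge t -> s only if no other superset of s
            # lies strictly between s and t.
            if not any(u != t and u & t == u for u in supers):
                children[t].append(s)
    return roots, children


def query_supersets(y, roots, children):
    """DFS from the maximal sets, pruning wherever Y is not a subset."""
    found, seen, stack = [], set(), list(roots)
    while stack:
        s = stack.pop()
        if s in seen:
            continue
        seen.add(s)
        if s & y != y:
            continue  # no subset of s can contain Y, so skip its descendants
        found.append(s)
        stack.extend(children[s])
    return found


stored = [0b1111, 0b1011, 0b0011, 0b1100, 0b0001]
roots, children = build_hasse(stored)
result = sorted(query_supersets(0b0011, roots, children))
```

The pruning is safe because any stored set containing Y is reachable from a maximal set through a chain of supersets, all of which also contain Y. The query still visits every matching set plus the sets just below them, so the saving depends on how selective Y is.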
