java:稀疏位向量 [英] java: sparse bit vector

查看:207
本文介绍了java:稀疏位向量的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

Java中是否存在用于稀疏位向量的任何着名库?

Are there any well-known libraries in Java for sparse bit vectors?

(并且有关于稀疏对于如何使用它们的指导原则 java.util.BitSet ?)

(And are there guidelines for how sparse is useful to use them vs. java.util.BitSet?)

推荐答案

colt library 具有稀疏矩阵(1D,2D和3D)。它还有一个高效的BitVector,每个值1位,而不是8位,因为 boolean []

The colt library has sparse matrices (1D, 2D and 3D). It also has an efficient BitVector, with 1 bit per value, rather than 8-bits as boolean[] does.

然而,稀疏矩阵不直接支持位 - 只有双精度和对象。您可以通过maping位索引将1D稀疏双矩阵包装到长索引(bitIndex>>> 6),因为每个long保持64位,将检索到的double转换为原始long值,并使用位操作来访问检索到的long的位。一点工作,但远不及自己实现稀疏矢量那么多。一旦你的包装器工作,你可能会避免将双精度转换为long,并使用双重1D稀疏矩阵的可用Colt源代码作为起点实现一个真正的稀疏长1d矩阵。

However, the sparse matrices do not support bits directly - only doubles and objects. You could wrap the 1D sparse double matrix by maping bit index to long indices (bitIndex>>6) since each long holds 64 bits, convert the retrieved double to a raw long value, and use bit manipulation to access the bits of the retrieved long. A little work, but nowhere near as much as implementing the sparse vector yourself. Once your wrapper is working, you might avoid converting doubles to longs, and implement a real sparse long 1d matrix using the available Colt source code for the double 1D sparse matrix as a starting point.

编辑:更多信息。假设所有位(长整数)最初为0,Colt向量/矩阵最初不需要存储器用于存储。将值设置为非零会占用内存。将值设置回0会继续消耗内存,但会定期回收零值的内存。

More info. The Colt vectors/matrices require no memory initially for storage, assuming all bits (longs) are initially 0. Setting a value to non-zero consumes memory. Setting the value back to 0 continues to consume memory, although memory for zero values is reclaimed periodically.

如果这些位真的是稀疏的,这样每个后备long值只有一个位设置,那么存储开销将非常差,每个实际位需要64位存储。但是当你提到典型的情况是20-40%稀疏时,那么开销将会低得多,如果比特在范围内聚集,可能没有浪费的存储,例如, 0-100,然后1000-1100和2000-2200(十六进制值)中的位。总的来说,只有1/16的区域被分配给位,但是聚类意味着存储的位没有浪费的空间。

If the bits are truly sparse, such that each backing long value only has one bit set, then the storage overhead will be very poor, requiring 64-bits per actual bit stored. But as you mention typical case is 20-40% sparse, then the overhead will be much lower, with possibly no wasted storage if bits are clustered in ranges, e.g. bits from 0-100, then 1000-1100, and 2000-2200 (values in hex.) Overall, only 1/16 of the region is assigned to bits, but the clustering means that the bits are stored with no wasted space.

这篇关于java:稀疏位向量的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆