Hamming多维数据集的顶点上的查询点 [英] Query points on the vertices of a Hamming cube

查看:128
本文介绍了Hamming多维数据集的顶点上的查询点的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有N个点,只存在于维度D的立方体的顶点上,其中D类似于3。

I have N points that lie only on the vertices of a cube, of dimension D, where D is something like 3.

顶点可能不包含任何点。所以每个点都有{0,1} D 中的坐标。只要查询时间只对内存成本感兴趣(例如N中不是指数)。

A vertex may not contain any point. So every point has coordinates in {0, 1}D. I am only interested in query time, as long as the memory cost is reasonable ( not exponential in N for example :) ).

给定位于多维数据集顶点之一的查询和输入参数 r ,找到所有具有汉明距离的顶点(因此,点) = r 与查询。

Given a query that lies on one of the cube's vertices and an input parameter r, find all the vertices (thus points) that have hamming distance <= r with the query.

有什么方法去一个 c ++ environment?

What's the way to go in a c++ environment?

我正在想一个kd树,但我不确定,需要帮助,任何输入,甚至近似值,将不胜感激!由于汉明距离发挥作用,按位操作应该有所帮助(例如XOR)。

I am thinking of a kd-tree, but I am not sure and want help, any input, even approximative, would be appreciated! Since hamming distance comes into play, bitwise manipulations should help (e.g. XOR).

推荐答案

一个很好的bithack从一个位掩码,以 k 位设置为词典下一个排列,这意味着循环使用 k 位设置的所有掩码是相当简单的。使用初始值对这些蒙版进行异或,使汉明距离的所有值与 k 远离。

There is a nice bithack to go from one bitmask with k bits set to the lexicographically next permutation, which means it's fairly simple to loop through all masks with k bits set. XORing these masks with an initial value gives all the values at hamming distance exactly k away from it.

所以对于 D 尺寸,其中 D 小于32(否则更改类型),

So for D dimensions, where D is less than 32 (otherwise change the types),

uint32_t limit = (1u << D) - 1;
for (int k = 1; k <= r; k++) {
    uint32_t diff = (1u << k) - 1;
    while (diff <= limit) {
        // v is the input vertex
        uint32_t vertex = v ^ diff;
        // use it
        diff = nextBitPermutation(diff);
    }
}

其中 nextBitPermutation 可以用C ++实现,如(如果你有 __ builtin_ctz

Where nextBitPermutation may be implemented in C++ as something like (if you have __builtin_ctz)

uint32_t nextBitPermutation(uint32_t v) {
    // see https://graphics.stanford.edu/~seander/bithacks.html#NextBitPermutation
    uint32_t t = v | (v - 1);
    return (t + 1) | (((~t & -~t) - 1) >> (__builtin_ctz(v) + 1));
}

或MSVC(未测试)

uint32_t nextBitPermutation(uint32_t v) {
    // see https://graphics.stanford.edu/~seander/bithacks.html#NextBitPermutation
    uint32_t t = v | (v - 1);
    unsigned long tzc;
    _BitScanForward(&tzc, v); // v != 0 so the return value doesn't matter
    return (t + 1) | (((~t & -~t) - 1) >> (tzc + 1));
}






如果D是>真的低,4或更低,旧的 popcnt -with - pshufb 工作真的很好,一般一切都很顺利,如下所示:


If D is really low, 4 or lower, the old popcnt-with-pshufb works really well and generally everything just lines up well, like this:

uint16_t query(int vertex, int r, int8_t* validmask)
{
    // validmask should be array of 16 int8_t's,
    // 0 for a vertex that doesn't exist, -1 if it does
    __m128i valid = _mm_loadu_si128((__m128i*)validmask);
    __m128i t0 = _mm_set1_epi8(vertex);
    __m128i r0 = _mm_set1_epi8(r + 1);
    __m128i all =        _mm_setr_epi8(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15);
    __m128i popcnt_lut = _mm_setr_epi8(0, 1, 1, 2, 1, 2, 2, 3, 1, 2,  2,  3,  2,  3,  3,  4);
    __m128i dist = _mm_shuffle_epi8(popcnt_lut, _mm_xor_si128(t0, all));
    __m128i close_enough = _mm_cmpgt_epi8(r0, dist);
    __m128i result = _mm_and_si128(close_enough, valid);
    return _mm_movemask_epi8(result);
}

这应该是相当快的;与上面的bithack相比( nextBitPermutation ,相当重,在那里使用很多),并且比较了所有顶点的循环,并测试它们是否在范围内与内置的 popcnt ,自动占用至少16个周期,并且上述不应该假定一切都被缓存,甚至永久地在一个寄存器中)。结果是令人厌烦的工作,因为它是一个掩码,哪些顶点都存在和在查询点的范围,而不是它们的列表。它会很好地结合对与点相关的数据进行一些处理。

This should be fairly fast; fast compared to the bithack above (nextBitPermutation, which is fairly heavy, is used a lot there) and also compared to looping over all vertices and testing whether they are in range (even with builtin popcnt, that automatically takes at least 16 cycles and the above shouldn't, assuming everything is cached or even permanently in a register). The downside is the result is annoying to work with, since it's a mask of which vertices both exist and are in range of the queried point, not a list of them. It would combine well with doing some processing on data associated with the points though.

这也缩小到D = 3当然,只是没有点> = 8有效。 D> 4可以类似地完成,但是它需要更多的代码,因为这真的是一个强力解决方案,由于并行性只是快速,它从根本上慢慢地在D中慢下来。

This also scales down to D=3 of course, just make none of the points >= 8 valid. D>4 can be done similarly but it takes more code then, and since this is really a brute force solution that is only fast due to parallelism it fundamentally gets slower exponentially in D.

这篇关于Hamming多维数据集的顶点上的查询点的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆