2D morton 代码编码/解码 64 位 [英] 2D morton code encode/decode 64bits

查看:41
本文介绍了2D morton 代码编码/解码 64 位的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

如何将给定的 [x, y] 的 morton 代码(z 顺序)编码/解码为 32 位无符号整数,生成 64 位 morton 代码,反之亦然?我确实有 xy2d 和 d2xy 但仅适用于 16 位宽产生 32 位莫顿数的坐标.在网上找了很多,都没有找到.请帮忙.

How to encode/decode morton codes(z-order) given [x, y] as 32bit unsigned integers producing 64bit morton code, and vice verse ? I do have xy2d and d2xy but only for coordinates that are 16bits wide producing 32bit morton number. Searched a lot in net, but couldn't find. Please help.

推荐答案

如果您可以使用特定于体系结构的指令,那么您很可能能够将操作加速到超出使用 bit-twiddeling hacks 所能达到的速度:

If it is possible for you to use architecture specific instructions you'll likely be able to accelerate the operation beyond what is possible using bit-twiddeling hacks:

例如,如果您为 Intel Haswell 和更高版本的 CPU 编写代码,您可以使用包含 pextpdep 指令的 BMI2 指令集.这些(以及其他很棒的东西)可用于构建您的函数.

For example if you write code for the Intel Haswell and later CPUs you can use the BMI2 instruction set which contains the pext and pdep instructions. These can (among other great things) be used to build your functions.

这是一个完整的例子(用 GCC 测试):

Here is a complete example (tested with GCC):

#include <immintrin.h>
#include <stdint.h>

// on GCC, compile with option -mbmi2, requires Haswell or better.

uint64_t xy_to_morton(uint32_t x, uint32_t y)
{
  return _pdep_u32(x, 0x55555555) | _pdep_u32(y,0xaaaaaaaa);
}

void morton_to_xy(uint64_t m, uint32_t *x, uint32_t *y)
{
  *x = _pext_u64(m, 0x5555555555555555);
  *y = _pext_u64(m, 0xaaaaaaaaaaaaaaaa);
}

如果您必须支持较早的 CPU 或 ARM 平台,则并非全部丢失.您仍然可以从特定于密码学的说明中至少获得有关 xy_to_morton 函数的帮助.

If you have to support earlier CPUs or the ARM platform not all is lost. You may still get at least get help for the xy_to_morton function from instructions specific for cryptography.

如今,许多 CPU 都支持无进位乘法.在 ARM 上,这将是来自 NEON 指令集的 vmul_p8.在 X86 上,您会从 CLMUL 指令集(自 2010 年起可用)中找到它作为 PCLMULQDQ.

A lot of CPUs have support for carry-less multiplication these days. On ARM that'll be vmul_p8 from the NEON instruction set. On X86 you'll find it as PCLMULQDQ from the CLMUL instruction set (available since 2010).

这里的技巧是,一个数字与自身的无进位乘法将返回一个位模式,其中包含零位交错的参数的原始位.所以它与上面显示的 _pdep_u32(x,0x55555555) 相同.例如.它变成以下字节:

The trick here is, that a carry-less multiplication of a number with itself will return a bit-pattern that contains the original bits of the argument with zero-bits interleaved. So it is identical to the _pdep_u32(x,0x55555555) shown above. E.g. it turns the following byte:

 +----+----+----+----+----+----+----+----+
 | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |
 +----+----+----+----+----+----+----+----+

进入:

 +----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
 | 0  | b7 | 0  | b6 | 0  | b5 | 0  | b4 | 0  | b3 | 0  | b2 | 0  | b1 | 0  | b0 |
 +----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+

现在您可以构建 xy_to_morton 函数(此处显示的是 CLMUL 指令集):

Now you can build the xy_to_morton function as (here shown for CLMUL instruction set):

#include <wmmintrin.h>
#include <stdint.h>

// on GCC, compile with option -mpclmul

uint64_t carryless_square (uint32_t x)
{
  uint64_t val[2] = {x, 0};
  __m128i *a = (__m128i * )val;
  *a = _mm_clmulepi64_si128 (*a,*a,0);
  return val[0];
}

uint64_t xy_to_morton (uint32_t x, uint32_t y)
{
  return carryless_square(x)|(carryless_square(y) <<1);
}

_mm_clmulepi64_si128 生成一个 128 位的结果,我们只使用低 64 位.因此,您甚至可以改进上面的版本并使用单个 _mm_clmulepi64_si128 来完成这项工作.

_mm_clmulepi64_si128 generates a 128 bit result of which we only use the lower 64 bits. So you can even improve upon the version above and use a single _mm_clmulepi64_si128 do do the job.

这与主流平台(例如带有 NEON 和 x86 的现代 ARM)一样好.不幸的是,我不知道使用加密指令来加速 morton_to_xy 函数的任何技巧,而且我真的很努力地尝试了几个月.

That is as good as you can get on mainstream platforms (e.g. modern ARM with NEON and x86). Unfortunately I don't know of any trick to speed up the morton_to_xy function using the cryptography instructions and I tried really hard for several month.

这篇关于2D morton 代码编码/解码 64 位的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆