C ++：将字节转换为unsigned int的最快方式 [英] C++: fastest way to convert byte to unsigned int

查看：233 发布时间：2016/10/24 16:30:22 c++ performance byte memory-bandwidth

本文介绍了C ++：将字节转换为unsigned int的最快方式的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我有一个字节数组（ unsigned char * ），必须转换为整数。整数以三个字节表示。这是我做的

I have an array of bytes(unsigned char *) that must be converted to integer. Integer are represented over three bytes. This is what I have done

//bytes array is allocated and filled
//allocating space for intBuffer (uint32_t)
unsigned long i = 0;
uint32_t number;
for(; i<size_tot; i+=3){
    uint32_t number = (bytes[i]<<16) | (bytes[i+1]<<8) | bytes[i+2];
    intBuffer[number]++;
}

这段代码的作业效果很好，内存中的三次访问（特别是对于 size_tot 的大值，按照 3000000 的顺序）。

This piece of code does its jobs well but it is incredibly slow due to the three accesses in memory(expecially for big values of size_tot, in the order of 3000000). Is there a way to do it faster and increase performance?

推荐答案

正确的答案几乎总是：

写入正确的代码，启用优化，信任您的编译器。

Write correct code, enable optimisations, trust your compiler.

$ b

given:

void count_values(std::array<uint32_t, 256^3>& results, const unsigned char* from, const unsigned char* to) { for(; from != to; from = std::next(from, 3)) { ++results[(*from << 16) | (*std::next(from, 1) << 8) | *(std::next(from,2))]; } }

收益率（内嵌的解释性评论）：

Yields (explanatory comments inlined):

__Z12count_valuesRNSt3__15arrayIjLm259EEEPKhS4_: ## @_Z12count_valuesRNSt3__15arrayIjLm259EEEPKhS4_ .cfi_startproc ## BB#0: pushq %rbp Ltmp0: .cfi_def_cfa_offset 16 Ltmp1: .cfi_offset %rbp, -16 movq %rsp, %rbp Ltmp2: .cfi_def_cfa_register %rbp jmp LBB0_2 .align 4, 0x90 LBB0_1: ## %.lr.ph ## in Loop: Header=BB0_2 Depth=1 # dereference from and extend the 8-bit value to 32 bits movzbl (%rsi), %eax shlq $16, %rax # shift left 16 movzbl 1(%rsi), %ecx # dereference *(from+1) and extend to 32bits by padding with zeros shlq $8, %rcx # shift left 8 orq %rax, %rcx # or into above result movzbl 2(%rsi), %eax # dreference *(from+2) and extend to 32bits orq %rcx, %rax # or into above result incl (%rdi,%rax,4) # increment the correct counter addq $3, %rsi # from += 3 LBB0_2: ## %.lr.ph ## =>This Inner Loop Header: Depth=1 cmpq %rdx, %rsi # while from != to jne LBB0_1 ## BB#3: ## %._crit_edge popq %rbp retq .cfi_endproc

注意，没有必要偏离标准结构或标准调用。编译器生成完美的代码。

Notice that there is no need to stray away from standard constructs or standard calls. The compiler produces perfect code.

为了进一步证明这一点，让我们疯狂，写一个自定义的迭代器，让我们将函数减少为：

To further prove the point, let's go crazy and write a custom iterator that allows us to reduce the function to this:

void count_values(std::array<uint32_t, 256^3>& results, byte_triple_iterator from, byte_triple_iterator to) { assert(iterators_correct(from, to)); while(from != to) { ++results[*from++]; } }

这里是iterator：

And here is a (basic) implementation of such an iterator:

struct byte_triple_iterator { constexpr byte_triple_iterator(const std::uint8_t* p) : _ptr(p) {} std::uint32_t operator*() const noexcept { return (*_ptr << 16) | (*std::next(_ptr, 1) << 8) | *(std::next(_ptr,2)); } byte_triple_iterator& operator++() noexcept { _ptr = std::next(_ptr, 3); return *this; } byte_triple_iterator operator++(int) noexcept { auto copy = *this; _ptr = std::next(_ptr, 3); return copy; } constexpr const std::uint8_t* byte_ptr() const { return _ptr; } private: friend bool operator<(const byte_triple_iterator& from, const byte_triple_iterator& to) { return from._ptr < to._ptr; } friend bool operator==(const byte_triple_iterator& from, const byte_triple_iterator& to) { return from._ptr == to._ptr; } friend bool operator!=(const byte_triple_iterator& from, const byte_triple_iterator& to) { return not(from == to); } friend std::ptrdiff_t byte_difference(const byte_triple_iterator& from, const byte_triple_iterator& to) { return to._ptr - from._ptr; } const std::uint8_t* _ptr; }; bool iterators_correct(const byte_triple_iterator& from, const byte_triple_iterator& to) { if (not(from < to)) return false; auto dist = to.byte_ptr() - from.byte_ptr(); return dist % 3 == 0; }

现在我们有什么？

的断言来检查我们的源确实是完全正确的长度（调试版本）

这是保证是一个输出结构right size

但是对我们的目标代码做了什么呢？（编译 -O3 -DNDEBUG ）

But what's it done to our object code? (compile with -O3 -DNDEBUG)

.globl __Z12count_valuesRNSt3__15arrayIjLm259EEE20byte_triple_iteratorS3_ .align 4, 0x90 __Z12count_valuesRNSt3__15arrayIjLm259EEE20byte_triple_iteratorS3_: ## @_Z12count_valuesRNSt3__15arrayIjLm259EEE20byte_triple_iteratorS3_ .cfi_startproc ## BB#0: pushq %rbp Ltmp3: .cfi_def_cfa_offset 16 Ltmp4: .cfi_offset %rbp, -16 movq %rsp, %rbp Ltmp5: .cfi_def_cfa_register %rbp jmp LBB1_2 .align 4, 0x90 LBB1_1: ## %.lr.ph ## in Loop: Header=BB1_2 Depth=1 movzbl (%rsi), %eax shlq $16, %rax movzbl 1(%rsi), %ecx shlq $8, %rcx orq %rax, %rcx movzbl 2(%rsi), %eax orq %rcx, %rax incl (%rdi,%rax,4) addq $3, %rsi LBB1_2: ## %.lr.ph ## =>This Inner Loop Header: Depth=1 cmpq %rdx, %rsi jne LBB1_1 ## BB#3: ## %._crit_edge popq %rbp retq .cfi_endproc

- 效率同样高。

Answer: nothing - it's just as efficient.

课程？没有真的！相信您的编译器！

The lesson? No really! Trust your compiler!!!

这篇关于C ++：将字节转换为unsigned int的最快方式的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

C ++：将字节转换为unsigned int的最快方式 [英] C++: fastest way to convert byte to unsigned int

问题描述

推荐答案

相关文章

C/C++开发最新文章

热门教程

热门工具

登录关闭

C ++：将字节转换为unsigned int的最快方式 [英] C++: fastest way to convert byte to unsigned int

问题描述

推荐答案

相关文章

C/C++开发最新文章

热门教程

热门工具

登录 关闭

登录关闭