Fast little-endian to big-endian conversion in ASM


Question

I have an array of uint values in C#. After checking whether the program is running on a little-endian machine, I want to convert the data to big-endian. Because the amount of data can become very large, but the element count is always even, I was thinking of treating each pair of uint values as one ulong for better performance and writing the routine in assembly. So I am looking for a very fast (the fastest, if possible) assembly algorithm to convert little-endian to big-endian.
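One detail of the "two uint values as one ulong" idea is worth spelling out: reversing all eight bytes of the pair also exchanges the two 32-bit elements, so a 32-bit rotate is needed afterwards to put them back in order. The following is a minimal sketch in C++ rather than assembly. The names bswap64 and swap_pairs_in_place are hypothetical, and the non-MSVC branch assumes a GCC- or Clang-compatible compiler.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>   // declares _byteswap_uint64 under MSVC
#include <cstring>

// Hypothetical wrapper: MSVC intrinsic where available,
// GCC/Clang builtin otherwise (an assumption about the toolchain).
static inline std::uint64_t bswap64(std::uint64_t v) {
#if defined(_MSC_VER)
    return _byteswap_uint64(v);
#else
    return __builtin_bswap64(v);
#endif
}

// Byte-swaps every 32-bit element, handling two elements per 64-bit operation.
// `count` is assumed to be even, as stated in the question.
void swap_pairs_in_place(std::uint32_t* data, std::size_t count) {
    for (std::size_t i = 0; i + 1 < count; i += 2) {
        std::uint64_t pair;
        std::memcpy(&pair, data + i, sizeof pair);  // safe with respect to aliasing and alignment
        pair = bswap64(pair);                       // reverses all 8 bytes, exchanging the two elements
        pair = (pair << 32) | (pair >> 32);         // rotate by 32 bits to restore element order
        std::memcpy(data + i, &pair, sizeof pair);
    }
}
```

Whether this beats a plain 32-bit loop depends on the compiler and CPU, so it is worth measuring before committing to it.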

Recommended answer

For a large amount of data, the bswap instruction (available in Visual C++ through the _byteswap_ushort, _byteswap_ulong, and _byteswap_uint64 intrinsics) is the way to go. This will even outperform handwritten assembly. These intrinsics are not available in pure C# without P/Invoke, so:
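As an illustration of the intrinsic approach, here is a minimal sketch of an in-place swap over an array of 32-bit values. The function names bswap32 and byteswap_array are hypothetical. The MSVC branch uses _byteswap_ulong (unsigned long is 32 bits under MSVC), and the fallback branch assumes a GCC- or Clang-compatible compiler.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>   // declares _byteswap_ulong under MSVC

// Hypothetical wrapper around the compiler's 32-bit byte swap.
static inline std::uint32_t bswap32(std::uint32_t v) {
#if defined(_MSC_VER)
    return static_cast<std::uint32_t>(_byteswap_ulong(v));
#else
    return __builtin_bswap32(v);  // assumption: GCC/Clang builtin is available
#endif
}

// Reverses the byte order of each element in place.
void byteswap_array(std::uint32_t* data, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        data[i] = bswap32(data[i]);
    }
}
```

Keeping the loop this simple leaves the compiler free to unroll or vectorize it as it sees fit.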

  1. Only use this if you have a lot of data to byte swap.
  2. You should seriously consider writing your lowest-level application I/O in managed C++, so that you can do the swapping before the data ever reaches a managed array. You already have to write a C++ library, so there is not much to lose, and you sidestep all the P/Invoke-related performance issues that affect low-complexity algorithms operating on large datasets.

PS: Many people are unaware of the byte swap intrinsics. Their performance is astonishing, doubly so for floating-point data, because the values are processed as integers. There is no way to beat them without hand-coding your register loads for every single byte swap use case, and should you try that, you will probably lose more by getting in the optimizer's way than you will ever gain.
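The point about floating-point data can be sketched as follows: the float's bit pattern is copied into an integer, swapped there, and only reinterpreted as a float once it is back in native order. This is an illustrative sketch, not code from the answer. The names swap_u32, float_to_swapped_bits, and swapped_bits_to_float are hypothetical, and the code assumes a 32-bit float and, outside MSVC, a GCC- or Clang-compatible compiler.

```cpp
#include <cstdint>
#include <cstdlib>   // declares _byteswap_ulong under MSVC
#include <cstring>

// Hypothetical wrapper around the compiler's 32-bit byte swap.
static inline std::uint32_t swap_u32(std::uint32_t v) {
#if defined(_MSC_VER)
    return static_cast<std::uint32_t>(_byteswap_ulong(v));
#else
    return __builtin_bswap32(v);  // assumption: GCC/Clang builtin is available
#endif
}

// Returns the byte-swapped bit pattern of `f` as an integer, ready to be written out.
std::uint32_t float_to_swapped_bits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // copy the bit pattern without a numeric conversion
    return swap_u32(bits);
}

// Converts a byte-swapped bit pattern (for example, read from a file) back to a native float.
float swapped_bits_to_float(std::uint32_t swapped) {
    const std::uint32_t bits = swap_u32(swapped);
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

The swapped value is kept in an integer on purpose: a byte-swapped pattern may not be a meaningful float, so it is safer not to pass it through a floating-point variable.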
