Equivalent C code for _mm_ type functions
Problem description
What is the simple equivalent C code for _mm_ functions such as _mm_store_ps, _mm_add_ps, etc.? Please illustrate any one of these functions with an example and its equivalent C code.
Why are these functions used?
Based on your previous similar questions, it sounds like you're trying to solve the wrong problem. You have some existing SSE code for face detection which is crashing because you are passing misaligned data to SSE routines that require 16-byte-aligned data. In previous questions people have told you how to fix this misalignment (use _mm_malloc on Windows, or memalign/posix_memalign on Linux), but you seem to be ignoring this advice and instead wrongly assuming that you need to rewrite all the SSE code. Take some time to understand what SSE is, how your SSE code works, why it needs 16-byte alignment, and how to achieve this. Your existing SSE code should run fine on either Windows or Linux so long as you fix your data misalignment problem, which should be a relatively simple task once you understand what you are doing.
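To make the alignment point concrete, here is a minimal sketch in plain C of roughly what _mm_load_ps, _mm_add_ps and _mm_store_ps do: each one works on four packed floats at a time, and the load and store expect a 16-byte-aligned address. The names vec4, is_aligned16, load_ps_scalar, add_ps_scalar, store_ps_scalar, add_arrays and alloc_floats16 are hypothetical, made up for this illustration; they are not part of any SSE header. The sketch assumes a POSIX system so that it can use posix_memalign, and it replaces the hardware fault you would get from a misaligned aligned load or store with an assert.

```c
#define _POSIX_C_SOURCE 200112L  /* assumption: POSIX system, needed for posix_memalign */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for __m128: four packed single-precision floats. */
typedef struct { float f[4]; } vec4;

/* Returns nonzero if p lies on a 16-byte boundary. */
int is_aligned16(const void *p) {
    return ((uintptr_t)p % 16) == 0;
}

/* Scalar counterpart of _mm_load_ps: read four floats from an aligned address. */
vec4 load_ps_scalar(const float *p) {
    assert(is_aligned16(p));  /* the real aligned load faults on a misaligned address */
    vec4 v;
    for (int i = 0; i < 4; i++) v.f[i] = p[i];
    return v;
}

/* Scalar counterpart of _mm_add_ps: add the four lanes pairwise. */
vec4 add_ps_scalar(vec4 a, vec4 b) {
    vec4 r;
    for (int i = 0; i < 4; i++) r.f[i] = a.f[i] + b.f[i];
    return r;
}

/* Scalar counterpart of _mm_store_ps: write four floats to an aligned address. */
void store_ps_scalar(float *p, vec4 v) {
    assert(is_aligned16(p));  /* same alignment requirement as the load */
    for (int i = 0; i < 4; i++) p[i] = v.f[i];
}

/* Adds two float arrays four lanes at a time; n must be a multiple of 4. */
void add_arrays(float *dst, const float *a, const float *b, size_t n) {
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i += 4) {
        vec4 sum = add_ps_scalar(load_ps_scalar(a + i), load_ps_scalar(b + i));
        store_ps_scalar(dst + i, sum);
    }
}

/* Allocates n floats on a 16-byte boundary; release with free().
   Returns NULL if the allocation fails. */
float *alloc_floats16(size_t n) {
    void *p = NULL;
    if (posix_memalign(&p, 16, n * sizeof(float)) != 0) return NULL;
    return p;
}
```

If the buffers passed to add_arrays come from alloc_floats16, the alignment asserts never fire. A pointer such as buf + 1 trips them, which mirrors the crash described above. This is only meant to show why fixing the allocation is enough; it is not a reason to replace working SSE code with scalar loops.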