Unknown GCC error, while compiling for ARM NEON (Critical)


Problem description

I have an ARM Cortex-A8 based target with NEON. I am optimizing my code by making use of NEON, but when I compile it I get this strange error, and I don't know how to fix it.

I'm trying to compile the following code (PART 1) with the CodeSourcery toolchain (PART 2) on my host, and I get this strange error (PART 3). Am I doing something wrong here? Can anyone else compile this and see whether they get the same compilation error?

The strange part is that if I comment out the else if(step_size == 4) part of the code, the error vanishes. Sadly, my optimization is not complete without it, so I must have it.

At first I thought it was a problem with the CodeSourcery compiler (on my host), so I compiled the program directly on my target (which runs Ubuntu). I used gcc there and got the same error again; when I comment out the else if(step_size == 4) part, the error vanishes.

Help!


PART 1

#include<stdio.h>
#include"arm_neon.h"

#define IMAGE_HEIGHT 480
#define IMAGE_WIDTH  640

float32_t integral_image[IMAGE_HEIGHT][IMAGE_WIDTH];

float32x4_t box_area_compute3(int, int , int , int , unsigned int , float);

inline int min(int, int);

int main()
{

 box_area_compute3(1, 1, 4, 4, 2, 0);

 return 0;
}

float32x4_t box_area_compute3(int row, int col, int num_rows, int num_cols, unsigned int step_size, float three)
{
 unsigned int height = IMAGE_HEIGHT;
 unsigned int width = IMAGE_WIDTH;

 int temp_row = row + num_rows;
 int temp_col = col + num_cols;

 int r1 = (min(row, height))- 1 ;
 int r2 = (min(temp_row, height)) - 1;

 int c1 = (min(col, width)) - 1;
 int c2 = (min(temp_col, width)) - 1;

 float32x4_t v128_areas;

 if(step_size == 2)
 {
  float32x4x2_t top_left, top_right, bottom_left, bottom_right;
  top_left    = vld2q_f32((float32_t *)integral_image[r1] + c1);
  top_right   = vld2q_f32((float32_t *)integral_image[r1] + c2);
  bottom_left  = vld2q_f32((float32_t *)integral_image[r2] + c1);
  bottom_right  = vld2q_f32((float32_t *)integral_image[r2] + c2);

  v128_areas = vsubq_f32(vsubq_f32(vaddq_f32(top_left.val[0], bottom_right.val[0]), top_right.val[0]), bottom_left.val[0]);


 }
 else if(step_size == 4)
 {
  float32x4x4_t top_left, top_right, bottom_left, bottom_right;
  top_left   = vld4q_f32((float32_t *)integral_image[r1] + c1);
  top_right   = vld4q_f32((float32_t *)integral_image[r1] + c2);
  bottom_left  = vld4q_f32((float32_t *)integral_image[r2] + c1);
  bottom_right  = vld4q_f32((float32_t *)integral_image[r2] + c2);

  v128_areas = vsubq_f32(vsubq_f32(vaddq_f32(top_left.val[0], bottom_right.val[0]), top_right.val[0]), bottom_left.val[0]);

 }

 if(three == 3.0)
  v128_areas = vmulq_n_f32(v128_areas, three);

 return v128_areas;

}

inline int min(int X, int Y)
{
 return (X < Y ? X : Y);
}
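
For readers unfamiliar with vld4q_f32, the intrinsic in the branch that triggers the error: it reads 16 consecutive floats and de-interleaves them into four vectors, so val[0] ends up holding elements 0, 4, 8 and 12 of the source. The scalar model below only illustrates that layout. The names f32x4x4_model and vld4q_f32_model are hypothetical and are not part of arm_neon.h or of the code above.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar stand-in for float32x4x4_t (hypothetical type, for illustration):
 * val[k][i] models lane i of vector val[k]. */
typedef struct {
    float val[4][4];
} f32x4x4_model;

/* Models the de-interleaving load done by vld4q_f32:
 * lane i of vector k receives source element p[4*i + k]. */
static f32x4x4_model vld4q_f32_model(const float *p)
{
    f32x4x4_model r;
    size_t i, k;

    assert(p != NULL);
    for (i = 0; i < 4; ++i) {
        for (k = 0; k < 4; ++k) {
            r.val[k][i] = p[4 * i + k];
        }
    }
    return r;
}
```

Since the code above uses only val[0] of each load, the step_size == 4 branch effectively works on every fourth float starting at each of the four corner pointers.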


PART 2

arm-none-linux-gnueabi-gcc -O0 -g3 -Wall -c -fmessage-length=0 -fcommon -MMD -MP -MF"main.d" -MT"main.d" -mcpu=cortex-a8 -marm -mfloat-abi=hard -mfpu=neon-vfpv4 -o"main.o" "../main.c"


PART 3

../main.c: In function 'box_area_compute3':
../main.c:65: error: unable to find a register to spill in class 'GENERAL_REGS'
../main.c:65: error: this is the insn:
(insn 226 225 227 5 c:\program files\codesourcery\sourcery g++\bin\../lib/gcc/arm-none-linux-gnueabi/4.4.1/include/arm_neon.h:9863 (parallel [
           (set (reg:XI 148 [ D.17028 ])
               (unspec:XI [
                       (mem:XI (reg:SI 3 r3 [301]) [0 S64 A64])
                       (reg:XI 148 [ D.17028 ])
                       (unspec:V4SF [
                               (const_int 0 [0x0])
                           ] 191)
                   ] 111))
           (set (reg:SI 3 r3 [301])
               (plus:SI (reg:SI 3 r3 [301])
                   (const_int 32 [0x20])))
       ]) 1605 {neon_vld4qav4sf} (nil))
../main.c:65: confused by earlier errors, bailing out
cs-make: *** [main.o] Error 1

Solution

Well, I contacted CodeSourcery about this problem, and they consider it a bug in the GCC compiler. So I wrote the do_it4(){.....} function in assembly instead of using the intrinsics. Now it works well!
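
The assembly version of do_it4 is not shown, so it is not reproduced here. As a different possibility that stays in C, the sketch below computes the same four box areas as the step_size == 4 branch without calling vld4q_f32. It gathers every fourth float with vld1q_lane_f32, since only val[0] of each load is used in the question. This is an assumption-laden sketch: load_stride4 and box_area_stride4 are hypothetical names, the code has not been tried with GCC 4.4.1, and it may or may not avoid the reported compiler bug. On non-NEON hosts it falls back to plain scalar code so the arithmetic can be checked anywhere.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define USE_NEON 1
#else
#define USE_NEON 0
#endif

#if USE_NEON
/* Hypothetical helper: gather p[0], p[4], p[8], p[12] into one vector,
 * one lane at a time. These are the values vld4q_f32(p).val[0] would hold. */
static float32x4_t load_stride4(const float32_t *p)
{
    float32x4_t v = vdupq_n_f32(0.0f);
    v = vld1q_lane_f32(p + 0, v, 0);
    v = vld1q_lane_f32(p + 4, v, 1);
    v = vld1q_lane_f32(p + 8, v, 2);
    v = vld1q_lane_f32(p + 12, v, 3);
    return v;
}
#endif

/* Hypothetical helper: out[i] = ((tl + br) - tr) - bl, taken at every
 * fourth float from each corner pointer, in the same order of operations
 * as the vsubq_f32/vaddq_f32 expression in the question. */
static void box_area_stride4(const float *tl, const float *tr,
                             const float *bl, const float *br,
                             float out[4])
{
#if USE_NEON
    float32x4_t area;

    assert(tl != NULL && tr != NULL && bl != NULL && br != NULL);
    area = vsubq_f32(vsubq_f32(vaddq_f32(load_stride4(tl), load_stride4(br)),
                               load_stride4(tr)),
                     load_stride4(bl));
    vst1q_f32(out, area);
#else
    int i;

    assert(tl != NULL && tr != NULL && bl != NULL && br != NULL);
    for (i = 0; i < 4; ++i) {
        out[i] = ((tl[4 * i] + br[4 * i]) - tr[4 * i]) - bl[4 * i];
    }
#endif
}
```

In the question's terms, the four pointers would be integral_image[r1] + c1, integral_image[r1] + c2, integral_image[r2] + c1 and integral_image[r2] + c2. Four lane loads are likely slower than one vld4q_f32, so this trades speed for avoiding the intrinsic that the compiler rejects.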
