Correct use of device_type in OpenACC

Question

I have a for loop that I want to parallelize with OpenACC when the target hardware is NVIDIA, and run serially when the target hardware is AMD. I tried the following:

#pragma acc loop \
    device_type(tesla) parallel \
    device_type(radeon) seq
for (int z = 0; z < size_z; ++z)
{
    // do stuff...
}

Compiled with: pgc++ -std=c++11 -O4 -ta=tesla -Minfo:accel main.cpp

But in the parallelization report I get: <line_number>, #pragma acc loop seq

It appears that the compiler only takes into account the last line of the directive. Any idea why this is happening?

Running pgc++ --version shows the following:

pgc++ 16.10-0 64-bit target on x86-64 Linux -tp sandybridge

Recommended answer

You're using "device_type" correctly, but we (PGI) are still missing a few OpenACC features, including defining multiple loop schedules via the "device_type" clause. The current limitations are listed in section 4.4 of the PGI release notes: http://www.pgroup.com/doc/pgirn-x64.pdf
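
Until multiple loop schedules under "device_type" are supported, one possible workaround is to pick the directive at build time with the preprocessor. The sketch below is not from the PGI answer. It assumes a hypothetical macro BUILD_FOR_NVIDIA that you would define yourself on the compile line, plus a hypothetical helper function scale and macro Z_LOOP_SCHEDULE. Note that it is not equivalent to device_type(radeon) seq: when the macro is undefined, the loop simply runs serially on the host.

```cpp
#include <cassert>

// Hypothetical build switch (an assumption, not a PGI feature):
// pass -DBUILD_FOR_NVIDIA for NVIDIA builds, leave it undefined otherwise.
#ifdef BUILD_FOR_NVIDIA
  // Offload the loop that follows and let the compiler parallelize it.
  #define Z_LOOP_SCHEDULE _Pragma("acc parallel loop copy(data[0:size_z])")
#else
  // No directive at all: the loop runs serially on the host.
  #define Z_LOOP_SCHEDULE
#endif

// Multiplies every element of data by factor.
// The loop schedule is fixed at compile time by Z_LOOP_SCHEDULE.
void scale(float* data, int size_z, float factor)
{
    assert(size_z >= 0);

    Z_LOOP_SCHEDULE
    for (int z = 0; z < size_z; ++z)
    {
        data[z] *= factor;
    }
}
```

With this approach each target needs its own build, for example one compiled with -ta=tesla and -DBUILD_FOR_NVIDIA and one compiled without either flag. The trade-off is that a single binary can no longer serve both devices.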
