Are GPU/CUDA cores SIMD ones?


Question

Let's take the nVidia Fermi compute architecture. It says:

The first Fermi based GPU, implemented with 3.0 billion transistors, features up to 512 CUDA cores. A CUDA core executes a floating point or integer instruction per clock for a thread. The 512 CUDA cores are organized in 16 SMs of 32 cores each.

[...]

Each CUDA processor has a fully pipelined integer arithmetic logic unit (ALU) and floating point unit (FPU).

[...]

In Fermi, the newly designed integer ALU supports full 32-bit precision for all instructions, consistent with standard programming language requirements. The integer ALU is also optimized to efficiently support 64-bit and extended precision operations.

From what I know, GPUs execute threads in so-called warps, and each warp consists of ~32 threads, but this is where things become unclear to me. Each warp is assigned to only one core (is that true?). Does that mean that each of the 32 cores of a single SM is a SIMD processor, where a single instruction handles 32 data portions? If so, why do we say there are 32 threads in a warp rather than a single SIMD thread? And why are cores sometimes referred to as scalar processors, not vector processors?
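
For reference, the figures quoted above have runtime counterparts that can be queried through the CUDA runtime API. Here is a minimal sketch of such a query (assuming a CUDA-capable device at index 0; the number of cores per SM is not exposed by the API and depends on the architecture):

    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, 0);   // assumes a CUDA device at index 0

        // multiProcessorCount is the number of SMs; warpSize is 32 on all
        // current architectures. The number of cores per SM (32 on Fermi)
        // is not reported by the runtime API.
        printf("SMs (multiProcessorCount): %d\n", prop.multiProcessorCount);
        printf("warp size:                 %d\n", prop.warpSize);
        return 0;
    }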

Answer


Each warp is assigned to only one core (is that true?).

No, it's not true. A warp is a logical assembly of 32 threads of execution. To execute a single instruction from a single warp, the warp scheduler must usually schedule 32 execution units (or "cores", although the definition of a "core" is somewhat loose).
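
The 32-thread grouping can also be observed directly from device code with warp-vote intrinsics. A minimal sketch (this requires the *_sync warp intrinsics from CUDA 9 or later):

    #include <cstdio>
    #include <cuda_runtime.h>

    // __ballot_sync collects one predicate bit from every thread of the
    // calling warp into a single 32-bit mask -- one bit per lane that the
    // warp scheduler drives when it issues this instruction.
    __global__ void show_warp() {
        int lane = threadIdx.x % 32;                            // lane within the warp
        unsigned mask = __ballot_sync(0xFFFFFFFFu, lane < 16);  // bits 0..15 set
        if (lane == 0)
            printf("warp %d: ballot mask = 0x%08x\n", threadIdx.x / 32, mask);
    }

    int main() {
        show_warp<<<1, 64>>>();     // one block of 64 threads = two warps of 32
        cudaDeviceSynchronize();
        return 0;
    }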

Cores are in fact scalar processors, not vector processors. 32 cores (or execution units) are marshalled by the warp scheduler to execute a single instruction, across 32 threads, which is where the "SIMT" moniker comes from.
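
To make that concrete, here is a minimal SAXPY sketch (using managed memory for brevity, so it assumes a device with unified memory support). The kernel is ordinary scalar code written for one thread; the hardware issues each of its instructions once per warp, across 32 threads at a time:

    #include <cstdio>
    #include <cuda_runtime.h>

    // Scalar code from the programmer's point of view: one element per thread.
    // The warp scheduler issues each instruction once for a whole warp of 32
    // threads, so 32 scalar cores execute it in lockstep (SIMT).
    __global__ void saxpy(int n, float a, const float *x, float *y) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;   // this thread's element
        if (i < n)                   // threads past the end are simply masked off
            y[i] = a * x[i] + y[i];
    }

    int main() {
        const int n = 1 << 10;
        float *x, *y;
        cudaMallocManaged(&x, n * sizeof(float));        // managed memory for brevity
        cudaMallocManaged(&y, n * sizeof(float));
        for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

        // 256 threads per block = 8 warps of 32; the split into warps is done by
        // the hardware -- nothing in the kernel source mentions warps at all.
        saxpy<<<(n + 255) / 256, 256>>>(n, 3.0f, x, y);
        cudaDeviceSynchronize();

        printf("y[0] = %f\n", y[0]);                     // expect 5.0
        cudaFree(x);
        cudaFree(y);
        return 0;
    }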
