OpenGL Compute Shader Invocations


Question

I have a question about the new compute shaders. I am currently working on a particle system. I store all my particles in a shader storage buffer so that I can access them in the compute shader. Then I dispatch a one-dimensional grid of work groups.

#define WORK_GROUP_SIZE 128
_shaderManager->useProgram("computeProg");
glDispatchCompute((_numParticles/WORK_GROUP_SIZE), 1, 1);
glMemoryBarrier(GL_SHADER_STORAGE_BARRIER_BIT);

My compute shader:

#version 430
struct particle{
         vec4 currentPos;
         vec4 oldPos;
};

layout(std430, binding=0) buffer particles{
         struct particle p[];
};

layout (local_size_x = 128, local_size_y = 1, local_size_z = 1) in;
void main(){
         uint gid = gl_GlobalInvocationID.x;

         p[gid].currentPos.x += 100;
}

But somehow not all particles are affected. I am doing it the same way as in this example, but it doesn't work: http://education.siggraph.org/media/conference/S2012_Materials/ComputeShader_6pp.pdf

Edit:

After calling glMemoryBarrier(GL_SHADER_STORAGE_BARRIER_BIT), I continue like this:

_shaderManager->useProgram("shaderProg"); 
glBindBuffer(GL_ARRAY_BUFFER, shaderStorageBufferID); 
glVertexPointer(4,GL_FLOAT,sizeof(glm::vec4), (void*)0);
glEnableClientState(GL_VERTEX_ARRAY); 
glDrawArrays(GL_POINTS, 0, _numParticles); 
glDisableClientState(GL_VERTEX_ARRAY);

So which bit would be appropriate to use in this case?

Solution

I resolved the problem. The problem was simply the number of work groups I dispatched. numParticles/WORK_GROUP_SIZE is truncated (rounded down) because both operands are integers, so too few work groups were dispatched whenever the particle count was not a multiple of the work-group size.

With 1000 particles, only 1000/128 = 7 work groups are dispatched. Every work group has a size of 128, so I get 7*128 = 896 invocations, and 104 particles do not move at all. Since numParticles%128 can range from 0 to 127, I just dispatched one more work group:

glDispatchCompute((_numParticles/WORK_GROUP_SIZE)+1, 1, 1);
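The +1 above also dispatches an extra, fully unused work group when the particle count is an exact multiple of 128. As a minimal sketch of an alternative, the group count can be computed with a ceiling division. The helper names `workGroupCount` and `itemsMissedByTruncation` are hypothetical and not part of the original code:

```cpp
#include <cstdint>

// Hypothetical helper: smallest number of work groups such that
// groups * groupSize >= numItems. groupSize must be non-zero.
// Written as quotient plus "is there a remainder" to avoid overflow
// in the usual (numItems + groupSize - 1) / groupSize form.
constexpr std::uint32_t workGroupCount(std::uint32_t numItems,
                                       std::uint32_t groupSize) {
    return numItems / groupSize + (numItems % groupSize != 0 ? 1u : 0u);
}

// Hypothetical helper: how many items are never touched when the group
// count is computed with plain (truncating) integer division.
constexpr std::uint32_t itemsMissedByTruncation(std::uint32_t numItems,
                                                std::uint32_t groupSize) {
    return numItems - (numItems / groupSize) * groupSize;
}
```

It could then be used as `glDispatchCompute(workGroupCount(_numParticles, WORK_GROUP_SIZE), 1, 1);`. Either way, the last work group may contain invocations whose `gl_GlobalInvocationID.x` is past the last particle, so the shader probably wants a guard such as `if (gid >= p.length()) return;`, assuming the buffer is sized to exactly the particle count.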

And every particle moves from now on. :)
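On the barrier question raised in the edit: the bit passed to glMemoryBarrier describes how the data will be read after the barrier, not how it was written. Since the buffer is consumed as a vertex array by the draw call, GL_VERTEX_ATTRIB_ARRAY_BARRIER_BIT looks like the matching bit. The fragment below is a sketch only: it assumes a current OpenGL 4.3+ context with the compute program already bound, and `numGroups` is a hypothetical variable holding the work-group count.

```cpp
// Fragment, not a complete program: needs an OpenGL 4.3+ context.
// numGroups is a hypothetical variable holding the work-group count.
glDispatchCompute(numGroups, 1, 1);

// The compute shader's writes are read next through vertex fetch
// (glVertexPointer + glDrawArrays), so request the vertex-attribute bit.
glMemoryBarrier(GL_VERTEX_ATTRIB_ARRAY_BARRIER_BIT);
```

If another shader also reads the same buffer as a shader storage block after the dispatch, the two bits can be combined with a bitwise OR.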
