How can I feed compute shader results into a vertex shader without using a vertex buffer?

Problem description

Before I go into details, I want to outline the problem:

I use RWStructuredBuffers to store the output of my compute shaders (CS). Since vertex and pixel shaders can't read from RWStructuredBuffers, I map a StructuredBuffer onto the same buffer (slots u0/t0 and u4/t4):

cbuffer cbWorld : register (b1) 
{
    float4x4 worldViewProj;
    int dummy;
}   

struct VS_IN
{
    float4 pos : POSITION;
    float4 col : COLOR;
};

struct PS_IN
{

    float4 pos : SV_POSITION;
    float4 col : COLOR;
};

RWStructuredBuffer<float4> colorOutputTable : register (u0);    // 2D color data
StructuredBuffer<float4> output2 :            register (t0);    // same as u0
RWStructuredBuffer<int> counterTable :        register (u1);    // depth data for z values
RWStructuredBuffer<VS_IN>vertexTable :        register (u4);    // triangle list
StructuredBuffer<VS_IN>vertexTable2 :         register (t4);    // same as u4

I use a ShaderResourceView to grant the pixel and/or vertex shader access to the buffers. This concept works fine for my pixel shader; the vertex shader, however, seems to read only 0 values (I use SV_VertexID as the index into the buffers):

PS_IN VS_3DA ( uint vid : SV_VertexID ) 
{           
    PS_IN output = (PS_IN)0; 
    PS_IN input = vertexTable2[vid];
    output.pos = mul(input.pos, worldViewProj); 
    output.col = input.col; 
    return output;
}

There are no error messages or warnings from the HLSL compiler, and the render loop runs at 60 fps (using vsync), but the screen remains black. Since I blank the screen with Color.White before Draw(..) is called, the render pipeline seems to be active.

However, when I read the triangle data via a UAView from the GPU into "vertArray" and feed it back into a vertex buffer, everything works:

Program:

    let vertices = Buffer.Create(device, BindFlags.VertexBuffer, vertArray)
    context.InputAssembler.SetVertexBuffers(0, new VertexBufferBinding(vertices, Utilities.SizeOf<Vector4>() * 2, 0))

HLSL:

PS_IN VS_3D (VS_IN input )
{
    PS_IN output = (PS_IN)0;    
    output.pos = mul(input.pos, worldViewProj);
    output.col = input.col; 
    return output;
}

Here is the definition of the 2D vertex and pixel shaders. Please note that PS_2D accesses the buffer "output2" in slot t0, and that is exactly the "trick" I want to replicate for the 3D vertex shader "VS_3DA":

float4 PS_2D ( float4 input : SV_Position) : SV_Target
{        
    uint2 pixel =  uint2(input.x, input.y);         
    return output2[ pixel.y * width + pixel.x]; 
}

float4 VS_2D ( uint vid : SV_VertexID ) : SV_POSITION
{
    if (vid == 0)
        return float4(-1, -1, 0, 1);
    if (vid == 1)
        return float4( 1, -1, 0, 1);
    if (vid == 2)
        return float4(-1,  1, 0, 1);

    return float4( 1,  1, 0, 1);
}

For three days I have searched and experimented to no avail. All the information I gathered seems to confirm that my approach using SV_VertexID should work.

Can anybody give advice? Thanks for reading my post!

=====================================================================

DETAILS:

I like the concept of DirectX 11 compute shaders very much and I want to employ it for algebraic computing. As a test case I render fractals (Mandelbrot sets) in 3D. Everything works as expected – except one last brick in the wall is missing.

The computation takes the following steps:

  1. Using a CS to compute a 2D texture (output is "counterTable" and "colorOutputTable") (works)

  2. Optionally render this texture to screen (works)

  3. Using another CS to generate a mesh (triangle list). This CS takes x, y, and color values from step 1, computes the z coordinate, and finally creates a quad for each pixel. The result is stored in "vertexTable". (works)

  4. Feeding the triangle list to the vertex shader (problem!!!)

  5. Render to screen (works - using a vertex buffer).

For programming I use F# 3.0 and SharpDX as the .NET wrapper. The ShaderResourceView for both shaders (pixel and vertex) is set up with the same parameters (except the size parameters):

let mutable descr = new BufferDescription()     
descr.BindFlags <- BindFlags.UnorderedAccess ||| BindFlags.ShaderResource 
descr.Usage <- ResourceUsage.Default  
descr.CpuAccessFlags <- CpuAccessFlags.None
descr.StructureByteStride <- xxx    // depends on shader
descr.SizeInBytes <-  yyy       // depends on shader
descr.OptionFlags <- ResourceOptionFlags.BufferStructured

Nothing special here. Creation of 2D buffer (binds to buffer "output2" in slot t0):

outputBuffer2D <- new Buffer(device, descr) 
outputView2D <- new UnorderedAccessView (device, outputBuffer2D)  
shaderResourceView2D <- new ShaderResourceView (device, outputBuffer2D)

Creation of 3D buffer (binds to "vertexTable2" in slot t4):

vertexBuffer3D <- new Buffer(device, descr) 
shaderResourceView3D <- new ShaderResourceView (device, vertexBuffer3D)
//  UAView not required here

Setting resources for 2D:

context.InputAssembler.PrimitiveTopology <- PrimitiveTopology.TriangleStrip
context.OutputMerger.SetRenderTargets(renderTargetView2D)
context.OutputMerger.SetDepthStencilState(depthStencilState2D)
context.VertexShader.Set (vertexShader2D)
context.PixelShader.Set (pixelShader2D) 

Rendering 2D:

context.PixelShader.SetShaderResource(COLOR_OUT_SLOT, shaderResourceView2D)
context.PixelShader.SetConstantBuffer(CONSTANT_SLOT_GLOBAL, constantBuffer2D )
context.ClearRenderTargetView (renderTargetView2D, Color.White.ToColor4())         
context.Draw(4,0)                                                
swapChain.Present(1, PresentFlags.None)            

Setting resources for 3D:

context.InputAssembler.PrimitiveTopology <- PrimitiveTopology.TriangleList
context.OutputMerger.SetTargets(depthView3D, renderTargetView2D)
context.VertexShader.SetShaderResource(TRIANGLE_SLOT, shaderResourceView3D )
context.VertexShader.SetConstantBuffer(CONSTANT_SLOT_3D, constantBuffer3D)
context.VertexShader.Set(vertexShader3D)
context.PixelShader.Set(pixelShader3D)

Rendering 3D (doesn't work; the output is a black screen):

context.ClearDepthStencilView(depthView3D, DepthStencilClearFlags.Depth, 1.0f, 0uy)
context.Draw(dataXsize * dataYsize * 6, 0)
swapChain.Present(1, PresentFlags.None)

Finally the slot numbers:

static let CONSTANT_SLOT_GLOBAL = 0
static let CONSTANT_SLOT_3D = 1
static let COLOR_OUT_SLOT = 0
static let COUNTER_SLOT = 1
static let COLOR_SLOT = 2    
static let TRIANGLE_SLOT = 4

Solution

OK, the first thing I would suggest is to turn on the debug layer (use the Debug flag when you create your device), then go to the project properties, Debug tab, and tick "Enable unmanaged code debugging" or "Enable native code debugging".

When you start debugging the program, the runtime will print warnings if something is wrong with the pipeline state.
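
As a minimal sketch of the Debug flag mentioned above, assuming a plain hardware device is created without a swap chain (the question does not show its device creation code, and `createDebugDevice` is a hypothetical name):

```fsharp
open SharpDX.Direct3D
open SharpDX.Direct3D11

// Hypothetical helper: creates a hardware device with the D3D11 debug layer on.
// The debug layer reports pipeline-state problems (for example a resource that
// is bound for reading and writing at once) as debugger output.
let createDebugDevice () =
    new Device(DriverType.Hardware, DeviceCreationFlags.Debug)
```

If the device is created together with a swap chain, the same `DeviceCreationFlags.Debug` value would be passed to that creation call instead.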

One potential issue (the most likely one, judging from what you posted): make sure to clear your compute shader UAV slots after dispatching. If you try to bind vertexTable2 to your vertex shader while the resource is still bound as compute shader output, the runtime will automatically set your ShaderResourceView to null (which in turn returns 0 when you try to read it).

To clear the compute shader UAV slot, call this on your device context once you're done with the dispatch:

context.ComputeShader.SetUnorderedAccessView(TRIANGLE_SLOT, null)
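
To show where that call sits in a frame, here is a hedged sketch of the whole sequence: dispatch, unbind the UAV, bind the SRV to the vertex stage, draw. The function name and all of its parameters are assumptions for illustration; only the slot number and the SharpDX calls themselves come from the question and the answer.

```fsharp
open SharpDX.Direct3D11

// Hypothetical helper; every parameter name is an assumption.
let generateAndDrawMesh
        (context : DeviceContext)
        (meshShader : ComputeShader)        // CS that fills vertexTable (u4)
        (vertexUav : UnorderedAccessView)   // UAV over the 3D vertex buffer
        (vertexSrv : ShaderResourceView)    // SRV over the same buffer (t4)
        (groupsX : int) (groupsY : int)     // thread-group counts for Dispatch
        (vertexCount : int) =               // e.g. dataXsize * dataYsize * 6
    let slot = 4                            // TRIANGLE_SLOT in the question

    // 1. Bind the buffer for writing and run the mesh-generating compute shader.
    context.ComputeShader.Set(meshShader)
    context.ComputeShader.SetUnorderedAccessView(slot, vertexUav)
    context.Dispatch(groupsX, groupsY, 1)

    // 2. Unbind the UAV so the same buffer may be bound for reading.
    context.ComputeShader.SetUnorderedAccessView(slot, null)

    // 3. Bind the SRV to the vertex stage and draw without a vertex buffer.
    context.VertexShader.SetShaderResource(slot, vertexSrv)
    context.Draw(vertexCount, 0)
```

The order is the point here: the SRV is bound only after the UAV slot has been cleared, so the runtime has no reason to replace it with null.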

Please also note that a pixel shader can access an RWStructuredBuffer (technically, you can use an RWStructuredBuffer in any shader type if you have feature level 11.1, which means a recent ATI card and Windows 8+).
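
Whether that second option is available can be checked at run time. A small sketch, assuming an already created `Device` (the function name is hypothetical):

```fsharp
open SharpDX.Direct3D
open SharpDX.Direct3D11

// Hypothetical helper: true when the device reports feature level 11.1 or higher,
// the level the answer names for using RWStructuredBuffer in any shader type.
let supportsUavInAllStages (device : Device) =
    device.FeatureLevel >= FeatureLevel.Level_11_1
```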
