XNA - Creating a lot of particles at the same time


Problem description

Time for another XNA question. This time it is purely from a technical design standpoint, though.

My situation is this: I've created a particle-engine based on GPU-calculations, far from complete but it works. My GPU easily handles 10k particles without breaking a sweat and I wouldn't be surprised if I could add a bunch more.

My problem: Whenever I have a lot of particles created at the same time, my frame rate hates me. Why? A lot of CPU-usage, even though I have minimized it to contain almost only memory operations.

Creation of particles is still done by CPU-calls such as:

  • Method wants to create particle and makes a call.
  • Quad is created in form of vertices and stored in a buffer
  • Buffer is inserted into GPU and my CPU can focus on other things

When I have about 4 emitters creating one particle per frame, my FPS drops (sure, only by 4 frames per second, but 15 emitters drop my FPS to 25).

Creation of a particle:

        //### As you can see, not a lot of action here. ###
        ParticleVertex[] tmpVertices = ParticleQuad.Vertices(Position, Velocity, this.TimeAlive);
        particleVertices[i] = tmpVertices[0];
        particleVertices[i + 1] = tmpVertices[1];
        particleVertices[i + 2] = tmpVertices[2];
        particleVertices[i + 3] = tmpVertices[3];
        particleVertices[i + 4] = tmpVertices[4];
        particleVertices[i + 5] = tmpVertices[5];

        particleVertexBuffer.SetData(particleVertices);

My thoughts are that maybe I shouldn't create particles that often, maybe there is a way to let the GPU create everything, or maybe I just don't know how this stuff is done. ;)

Edit: If I weren't to create particles that often, what is the workaround for still making it look good?

So I am posting here in hope that you know how a good particle-engine should be designed and if maybe I took the wrong route somewhere.

Solution

There is no way to have the GPU create everything (short of using Geometry Shaders, which require SM4.0).

If I were creating a particle system for maximum CPU efficiency, I would pre-create (just to pick a number for sake of example) 100 particles in a vertex and index buffer like this:

  • Make a vertex buffer containing quads (four vertices per particle, not six as you have)
  • Use a custom vertex format which can store a "time offset" value, as well as an "initial velocity" value (similar to the XNA Particle 3D Sample)
  • Set the time value such that each particle has a time offset of 1/100th less than the last one (so offsets range from 1.0 to 0.01 through the buffer).
  • Set the initial velocity randomly.
  • Use an index buffer that gives you the two triangles you need using the four vertices for each particle.
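
The list above could be sketched roughly like this. It is only an illustration, not the sample's actual code: `ParticleVertex`, `ParticleBufferBuilder` and the field names are hypothetical, plain floats stand in for the XNA vector types so the snippet compiles on its own, and the velocity range of -1 to 1 is an arbitrary assumption. In a real XNA project the struct would need a matching vertex declaration, and the two arrays would be uploaded once with `VertexBuffer.SetData` and `IndexBuffer.SetData`.

```csharp
using System;

// Hypothetical vertex layout (an assumption for illustration only).
public struct ParticleVertex
{
    public float CornerX, CornerY;   // which corner of the quad this vertex is (-1 or +1)
    public float TimeOffset;         // phase of this particle within the emission cycle
    public float VelX, VelY, VelZ;   // initial velocity, shared by all four corners
}

public static class ParticleBufferBuilder
{
    // Four vertices per particle; time offsets run from 1.0 down to 1/particleCount.
    public static ParticleVertex[] BuildVertices(int particleCount, Random rng)
    {
        var vertices = new ParticleVertex[particleCount * 4];
        float[] cornerX = { -1f, 1f, 1f, -1f };
        float[] cornerY = { -1f, -1f, 1f, 1f };

        for (int p = 0; p < particleCount; p++)
        {
            float offset = (float)(particleCount - p) / particleCount;

            // Random initial velocity in [-1, 1) per axis (arbitrary range).
            float vx = (float)(rng.NextDouble() * 2.0 - 1.0);
            float vy = (float)(rng.NextDouble() * 2.0 - 1.0);
            float vz = (float)(rng.NextDouble() * 2.0 - 1.0);

            for (int c = 0; c < 4; c++)
            {
                vertices[p * 4 + c] = new ParticleVertex
                {
                    CornerX = cornerX[c], CornerY = cornerY[c],
                    TimeOffset = offset,
                    VelX = vx, VelY = vy, VelZ = vz
                };
            }
        }
        return vertices;
    }

    // Six indices per particle: two triangles sharing the quad's diagonal.
    // 16-bit indices limit this to 16383 particles; the winding order may
    // need flipping depending on the cull mode in use.
    public static ushort[] BuildIndices(int particleCount)
    {
        var indices = new ushort[particleCount * 6];
        for (int p = 0; p < particleCount; p++)
        {
            int v = p * 4;
            int i = p * 6;
            indices[i]     = (ushort)v;
            indices[i + 1] = (ushort)(v + 1);
            indices[i + 2] = (ushort)(v + 2);
            indices[i + 3] = (ushort)v;
            indices[i + 4] = (ushort)(v + 2);
            indices[i + 5] = (ushort)(v + 3);
        }
        return indices;
    }
}
```

With 100 particles this gives 400 vertices and 600 indices, compared with 600 vertices for the six-vertices-per-quad layout in the question.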

And the cool thing is that you only need to do this once - you can reuse the same vertex buffer and index buffer for all your particle systems (providing they are big enough for your largest particle system).

Then I would have a vertex shader that would take the following input:

  • Per-Vertex:
    • Time offset
    • Initial velocity
  • Shader Parameters:
    • Current time
    • Particle lifetime (which is also the particle time wrap-around value, and the fraction of particles in the buffer being used)
    • Particle system position/rotation/scale (the world matrix)
    • Any other interesting inputs you like, such as: particle size, gravity, wind, etc
    • A time scale (to get a real time, so velocity and other physics calculations make sense)

That vertex shader (again like the XNA Particle 3D Sample) could then determine the position of a particle's vertex based on its initial velocity and the time that that particle had been in the simulation.

The time for each particle would be (pseudo code):

time = (currentTime + timeOffset) % particleLifetime;

In other words, as time advances, particles will be released at a constant rate (due to the offset). And whenever a particle dies at time = particleLifetime (or is it at 1.0? floating-point modulus is confusing), time loops back around to time = 0.0 so that the particle re-enters the animation.
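
To make the wrap-around concrete, here is a CPU-side mirror of what the vertex shader might compute. It is a sketch under assumptions: `ParticleMath` and its method names are hypothetical, the position formula is ordinary constant-acceleration motion along one axis (the answer only says that gravity and a time scale are shader inputs), and the real work would be done in HLSL rather than C#.

```csharp
using System;

public static class ParticleMath
{
    // Wrapped age of a particle. In C#, % on floats takes the sign of the
    // dividend, so for non-negative inputs the result lies in
    // [0, particleLifetime): the wrap happens at particleLifetime.
    public static float ParticleTime(float currentTime, float timeOffset, float particleLifetime)
    {
        return (currentTime + timeOffset) % particleLifetime;
    }

    // Position along one axis after the particle has lived for 'time',
    // assuming constant acceleration: start + v*t + 0.5*g*t^2.
    // timeScale converts the normalised particle time into seconds.
    public static float Position(float start, float initialVelocity, float gravity,
                                 float time, float timeScale)
    {
        float t = time * timeScale;
        return start + initialVelocity * t + 0.5f * gravity * t * t;
    }
}
```

For example, with a lifetime of 1.0 a particle whose offset is 1.0 has time 0.0 when the clock starts, while a particle whose offset is 0.5 is already halfway through its life and wraps back to 0.0 when the clock reaches 0.5.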

Then, when it came time to draw my particles, I would have my buffers, shader and shader parameters set, and call DrawIndexedPrimitives. Now here's the clever bit: I would set startIndex and primitiveCount such that no particle starts out mid-animation. When the particle system first starts I'd draw 1 particle (2 primitives), and by the time that particle is about to die, I'd be drawing all 100 particles, the 100th of which would just be starting.

Then, a moment later, the 1st particle's timer would loop around and make it the 101st particle.

(If I only wanted 50 particles in my system, I'd just set my particle lifetime to 0.5 and only ever draw the first 50 of the 100 particles in the vertex/index buffer.)
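
The start-up bookkeeping might look something like the following. This is a sketch, not a definitive implementation: `ParticleStartup` and its methods are hypothetical names, `elapsed` is assumed to be the time since the system was switched on in the same normalised units as the time offsets, and the buffer is assumed to be laid out with offsets spaced 1/bufferParticles apart as described above.

```csharp
using System;

public static class ParticleStartup
{
    // How many particles have been released so far, capped at the number
    // of buffer slots this system uses (lifetime * bufferParticles).
    public static int ActiveParticleCount(float elapsed, float lifetime, int bufferParticles)
    {
        int used = (int)Math.Round((double)lifetime * bufferParticles);
        // Particle k is released at elapsed = k / bufferParticles,
        // so particle 0 is already live at elapsed = 0.
        int released = (int)Math.Floor((double)elapsed * bufferParticles) + 1;
        return Math.Min(released, used);
    }

    // Two triangles per particle; startIndex stays 0 while starting up.
    public static int PrimitiveCount(float elapsed, float lifetime, int bufferParticles)
    {
        return ActiveParticleCount(elapsed, lifetime, bufferParticles) * 2;
    }
}
```

The result would be passed as the `primitiveCount` argument of `DrawIndexedPrimitives`, with `startIndex` left at 0. With a lifetime of 0.5 and a 100-particle buffer the count never exceeds 50 particles (100 primitives), which matches the 50-particle case described above.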

And when it came time to turn off the particle system - simply do the same in reverse - set the startIndex and primitiveCount such that particles stop being drawn after they die.

Now I must admit that I've glossed over the maths involved and some details about using quads for particles - but it should not be too hard to figure out. The basic principle to understand is that you're treating your vertex/index buffer as a circular buffer of particles.

One downside of a circular buffer is that, when you stop emitting particles, unless you stop when the current time is a multiple of the particle lifetime, you will end up with the active set of particles straddling the ends of the buffer with a gap in the middle - thus requiring two draw calls (a bit slower). To avoid this you could wait until the time is right before stopping - for most systems this should be OK, but it might look weird for some (e.g. a "slow" particle system that needs to stop instantly).
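
One way to work out the draw ranges after emission has stopped is to count in whole "ticks", where one tick is the interval between two particle releases. The sketch below is an assumption-laden illustration rather than the answer's own method: `DrawRange`, `ParticleShutdown`, `stopTick` (the number of particles released before stopping) and `nowTick` (the current tick) are all hypothetical, and each particle is assumed to occupy six indices in the index buffer.

```csharp
using System;
using System.Collections.Generic;

public struct DrawRange
{
    public int StartIndex;      // first index in the index buffer
    public int PrimitiveCount;  // number of triangles to draw from there
}

public static class ParticleShutdown
{
    // Particles released at ticks 0 .. stopTick-1 exist; one released at
    // tick b is still alive at nowTick if nowTick - b < usedParticles.
    public static List<DrawRange> LiveRanges(int usedParticles, long stopTick, long nowTick)
    {
        var ranges = new List<DrawRange>();
        long firstTick = Math.Max(0L, nowTick - usedParticles + 1);
        long count = stopTick - firstTick;

        if (count <= 0) return ranges;          // every particle has died
        if (count >= usedParticles)             // nothing has died yet
        {
            ranges.Add(MakeRange(0, usedParticles));
            return ranges;
        }

        int firstSlot = (int)(firstTick % usedParticles);
        int untilEnd = usedParticles - firstSlot;
        if (count <= untilEnd)
        {
            ranges.Add(MakeRange(firstSlot, (int)count));      // one draw call
        }
        else
        {
            ranges.Add(MakeRange(firstSlot, untilEnd));        // tail of the buffer
            ranges.Add(MakeRange(0, (int)count - untilEnd));   // wrapped part at the front
        }
        return ranges;
    }

    private static DrawRange MakeRange(int firstParticle, int particleCount)
    {
        return new DrawRange { StartIndex = firstParticle * 6, PrimitiveCount = particleCount * 2 };
    }
}
```

With 100 particles, stopping after 250 releases and drawing 30 ticks later leaves particles 80 to 99 and 0 to 49 alive, which needs two ranges. Stopping after 300 releases (a multiple of the buffer size) leaves particles 30 to 99 at the same point, which fits in a single range.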

Another downside to this method is that particles must be released at a constant rate - although that is usually pretty typical for particle systems (obviously this is per-system and the rate is adjustable). With a little tweaking an explosion effect (all particles released at once) should be possible.

All that being said: If possible, it may be worthwhile using an existing particle library.
