Android OpenGL 2.0 Sprite Optimization

Problem description


Hej,

I'm creating a simple game for Android in OpenGL ES 2.0. The game will contain a few different types of sprites, each of which will appear more than once.

For now, let's say I have 1 object (sprite). So far I've implemented VBO and index buffering, so the object as a whole is stored on the GPU, as I understand it.

What I would like to do now is draw this object multiple times, with only its position differing. For now, this is implemented as follows:

// Bind the index buffer once; every sprite shares the same geometry.
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, indexBuffer.getBufferId());
for (int i = 0; i < 1000; i++) {
    // Build this sprite's MVP matrix and upload it as a uniform.
    multiplyMM(MVP, 0, viewMatrix, 0, tempGetRandomMVPMatrix(), 0);
    glUniformMatrix4fv(uMatrixLocation, 1, false, MVP, 0); // TODO

    // One draw call per sprite.
    if (androidVersion > Build.VERSION_CODES.FROYO) {
        glDrawElements(GL_TRIANGLES, indexArray.length, GL_UNSIGNED_SHORT, 0);
    } else {
        // Froyo and earlier go through the AndroidGL20 wrapper instead.
        if (repairedGL20 == null) {
            repairedGL20 = new AndroidGL20();
        }
        repairedGL20.glDrawElements(GL_TRIANGLES, indexArray.length, GL_UNSIGNED_SHORT, 0);
    }
}

glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0);

If I understand correctly, the main problem is that glDrawElements is called every time I change the MVP matrix. Is there any way to send all the MVP matrices to the GPU and draw one element multiple times there with only 1 call to glDrawElements?

Some more info about the object: it has around 24 vertices and a 64x64 texture. Currently, with 1k objects I get 35 FPS. I would like a higher frame rate, since I will be drawing more sprites.

Here are my shaders. Vertex:

uniform mat4 u_Matrix;

attribute vec4 a_Position;
attribute vec2 a_TextureCoordinates;

varying vec2 v_TextureCoordinates;

void main(){
    v_TextureCoordinates = a_TextureCoordinates;
    gl_Position = u_Matrix * a_Position;
}

Fragment:

precision mediump float;

uniform sampler2D u_TextureUnit; // actual texture data
varying vec2 v_TextureCoordinates;

void main(){
    gl_FragColor = texture2D(u_TextureUnit, v_TextureCoordinates);
}

One more thing I don't quite understand concerns textures. If I create a texture like this:

glBindTexture(GL_TEXTURE_2D, textureObjectIds[0]); // binds the texture object

glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_LINEAR); // minification filter
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);

texImage2D(GL_TEXTURE_2D, 0, bitmap, 0); // send texture data to OpenGL, into the CURRENTLY BOUND object

When I draw objects with this texture, where is the texture stored: in the CPU's memory or on the GPU? Furthermore, in the example above, where I'm drawing the same sprite repeatedly, is the texture sent to the GPU on each draw call? If so, is there any way to optimize this (something similar to a VBO)?

Solution

"For example, let's say I have 1 object with 2 vertices that I want to translate to 2 places with MVP. How can I make gl_Position be drawn in 2 separate places at the same time?"

That can never be done. What I was referring to in my initial comment is something like a vertex buffer holding N (probably 4) vertices per sprite. If, in addition to the position, each vertex in the buffer carries an extra field, you can transform the vertices using different matrices in your shader. You will always have to define N vertices per object, but you can use a static vertex buffer to do this, as I explain below.

Say you had a sprite defined by the points (before transformation):

   Position
   <0.0,0.0>
   <1.0,0.0>
   <1.0,1.0>
   <0.0,1.0>

If your modelview matrix includes scaling information, you can use the same set of points over and over, by the way.
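To make the point about scaling concrete, here is a minimal sketch in plain Java (no Android classes) of a modelview matrix that carries both a scale and a translation, applied to the unit quad above. The class name SpriteTransform and its methods are hypothetical names chosen for illustration, and the matrix is assumed to be column-major, the layout OpenGL uniforms expect.

```java
/** Hypothetical helper: a 2D scale-then-translate modelview in column-major order. */
final class SpriteTransform {

    /** Builds a 4x4 column-major matrix that scales by (sx, sy) and then translates by (tx, ty). */
    static float[] modelView(float tx, float ty, float sx, float sy) {
        float[] m = new float[16];
        m[0] = sx;   // column 0, row 0: x scale
        m[5] = sy;   // column 1, row 1: y scale
        m[10] = 1f;  // z is left untouched
        m[15] = 1f;
        m[12] = tx;  // column 3 holds the translation
        m[13] = ty;
        return m;
    }

    /** Transforms the point (x, y, 0, 1) by m and returns the resulting x and y. */
    static float[] apply(float[] m, float x, float y) {
        return new float[] {
            m[0] * x + m[4] * y + m[12],
            m[1] * x + m[5] * y + m[13]
        };
    }
}
```

With modelView(10, 20, 64, 64), the corner <0.0,0.0> lands at (10, 20) and <1.0,1.0> at (74, 84), so one shared unit quad can stand in for a 64x64 sprite at any position.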

Now, if you want to extend this to draw 3 sprites in a single call, you could do this:

  Position  Idx
  <0.0,0.0> [0]
  <1.0,0.0> [0]
  <1.0,1.0> [0]
  <0.0,1.0> [0]

  <0.0,0.0> [1]
  <1.0,0.0> [1]
  <1.0,1.0> [1]
  <0.0,1.0> [1]

  <0.0,0.0> [2]
  <1.0,0.0> [2]
  <1.0,1.0> [2]
  <0.0,1.0> [2]

Basically, I added a new field to the vertex buffer that identifies which sprite each set of 4 points belongs to; use a GLubyte for best performance. You will notice that the same set of points repeats over and over. Implementing instancing this way effectively increases the storage requirements to O(N*V + 4N), where V is the size of the original vertex data structure for 4 points and N is the number of sprites.

You can go as far as to define a vertex buffer that contains enough points for 16 sprites. Then, whenever you want to draw multiple sprites in a single call, you always use that same vertex buffer and draw only a subset of its points. To draw 4 sprites using this vertex buffer, simply draw the first 16 of the 64 points it contains in total.
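The static buffer described above could be generated on the CPU roughly as follows. This is only a sketch under a few assumptions: the class SpriteBatchGeometry and its method names are hypothetical, positions and sprite indices are kept in separate arrays rather than interleaved, and the quads are drawn as indexed GL_TRIANGLES (as in the question), which needs 6 element indices per sprite.

```java
import java.util.Arrays;

/** Hypothetical helper: builds the static CPU-side arrays for a batch of unit quads. */
final class SpriteBatchGeometry {
    static final int VERTS_PER_SPRITE = 4;
    static final int INDICES_PER_SPRITE = 6;

    // Unit quad corners as x,y pairs, matching the Position table above.
    private static final float[] QUAD = {0f, 0f, 1f, 0f, 1f, 1f, 0f, 1f};

    /** The same four corners repeated once per sprite. */
    static float[] positions(int spriteCount) {
        float[] out = new float[spriteCount * QUAD.length];
        for (int s = 0; s < spriteCount; s++) {
            System.arraycopy(QUAD, 0, out, s * QUAD.length, QUAD.length);
        }
        return out;
    }

    /** One byte per vertex naming the sprite that vertex belongs to (the Idx column). */
    static byte[] spriteIndices(int spriteCount) {
        byte[] out = new byte[spriteCount * VERTS_PER_SPRITE];
        for (int s = 0; s < spriteCount; s++) {
            Arrays.fill(out, s * VERTS_PER_SPRITE, (s + 1) * VERTS_PER_SPRITE, (byte) s);
        }
        return out;
    }

    /** Two triangles per quad (0,1,2 and 0,2,3), offset by the quad's first vertex. */
    static short[] elementIndices(int spriteCount) {
        short[] out = new short[spriteCount * INDICES_PER_SPRITE];
        for (int s = 0; s < spriteCount; s++) {
            int v = s * VERTS_PER_SPRITE;
            int i = s * INDICES_PER_SPRITE;
            out[i]     = (short) v;
            out[i + 1] = (short) (v + 1);
            out[i + 2] = (short) (v + 2);
            out[i + 3] = (short) v;
            out[i + 4] = (short) (v + 2);
            out[i + 5] = (short) (v + 3);
        }
        return out;
    }
}
```

Under these assumptions, positions(16) yields the 64 points mentioned above, and drawing only the first 4 sprites means passing 4 * 6 = 24 as the index count to glDrawElements.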

Now, this is only half of the process. You also need to set up your vertex shader to take an array of modelview matrix uniforms that define the transformation for each sprite.

Here is an example vertex shader that could be used to do this:

#version 100

uniform mat4    proj_mat;
uniform mat4    instanced_mv [16]; // one modelview matrix per sprite in the batch

attribute vec4  vtx_pos;
attribute vec2  vtx_st;
attribute float vtx_sprite_idx; // This would be a uint in desktop GLSL

varying   vec2  tex_st;

void main (void) {
  // GLSL has no C-style casts; int(...) converts the float index to an integer.
  gl_Position = proj_mat * instanced_mv [int (vtx_sprite_idx)] * vtx_pos;
  tex_st      = vtx_st;
}

There are a few things to note here:

  1. GLES 2.0 does not support integer vertex attributes, so the index must be floating-point.

    • Uniform arrays must be indexed with integer expressions in vertex shaders, so the float has to be converted. GLSL does this with a constructor, int (vtx_sprite_idx), rather than a C-style cast.

  2. The number of modelview uniforms is really the limiting factor in how many sprites you can instance at once.

    • Vertex shaders are only required to support 128 4-component uniform vectors (a mat4 counts as 4 of them), so if your vertex shader had only modelview uniforms you could support a maximum array size of 32 (128/4). The actual limit on a given device can be queried through GL_MAX_VERTEX_UNIFORM_VECTORS.

  3. Store the sprite index as a GLubyte in your vertex buffer, but make sure you do not enable normalization when you set up the attribute with glVertexAttribPointer (pass false for the normalized argument). Otherwise the byte is mapped into the range [0, 1] instead of arriving in the shader as 0.0, 1.0, 2.0, and so on.
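On the host side, the notes above translate into splitting the sprites into batches no larger than the uniform array and uploading each batch's matrices in one go. The following plain-Java sketch shows that bookkeeping only. InstancedSpriteBatcher, BatchSink and their methods are hypothetical names, and the actual GL calls are left to the caller so that the code does not depend on Android classes.

```java
/** Hypothetical host-side helper for drawing sprites in uniform-array batches. */
final class InstancedSpriteBatcher {
    static final int FLOATS_PER_MAT4 = 16;
    static final int VECTORS_PER_MAT4 = 4;

    /** Receives one batch: the packed matrices and how many sprites they cover. */
    interface BatchSink {
        void draw(float[] packedMatrices, int spriteCount);
    }

    /** Largest mat4 array that fits after the shader's other uniforms are accounted for. */
    static int maxBatchSize(int maxVertexUniformVectors, int otherUniformVectors) {
        return (maxVertexUniformVectors - otherUniformVectors) / VECTORS_PER_MAT4;
    }

    /** Copies modelViews[start .. start + count) back to back into out. */
    static void pack(float[][] modelViews, int start, int count, float[] out) {
        for (int i = 0; i < count; i++) {
            System.arraycopy(modelViews[start + i], 0, out, i * FLOATS_PER_MAT4, FLOATS_PER_MAT4);
        }
    }

    /** Hands the sprites to the sink batch by batch and returns the number of batches. */
    static int drawAll(float[][] modelViews, int batchSize, BatchSink sink) {
        float[] scratch = new float[batchSize * FLOATS_PER_MAT4]; // reused for every batch
        int calls = 0;
        for (int start = 0; start < modelViews.length; start += batchSize) {
            int count = Math.min(batchSize, modelViews.length - start);
            pack(modelViews, start, count, scratch);
            sink.draw(scratch, count);
            calls++;
        }
        return calls;
    }
}
```

On Android, the sink would presumably call GLES20.glUniformMatrix4fv(location, spriteCount, false, packedMatrices, 0), with location obtained for instanced_mv, followed by a single glDrawElements call covering spriteCount sprites. With the 16-element array from the shader, the 1000 sprites from the question would need 63 draw calls instead of 1000. Note that maxBatchSize(128, 4) gives 31 rather than 32, because proj_mat also occupies 4 uniform vectors.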

Finally, this shader has not been tested. If you have trouble understanding anything here or run into issues implementing this, feel free to leave a comment.
