How to reduce draw calls in OpenGL/WebGL

Question

When I read about performance in OpenGL/WebGL, I almost always hear about reducing draw calls. My problem is that I use only 4 vertices to draw a textured quad, so my VBO generally contains only 4 vertices. Basically:

gl.bindBuffer(gl.ARRAY_BUFFER,vbo);
gl.uniformMatrix4fv(matrixLocation, false, modelMatrix);
gl.drawArrays(gl.TRIANGLE_FAN,0, vertices.length/3);

And here is the problem I see. Before drawing, I update the model matrix of the current quad, for example to move it 5 units along the y axis.

So this is what I have to do:

gl.bindBuffer(gl.ARRAY_BUFFER,vbo);
gl.uniformMatrix4fv(matrixLocation, false, modelMatrix);
gl.drawArrays(gl.TRIANGLE_FAN, 0, vertices.length/3);

gl.uniformMatrix4fv(matrixLocation, false, anotherModelMatrix);
gl.drawArrays(gl.TRIANGLE_FAN,0, vertices.length/3);
// ... repeat until all textured quads are rendered

How can I reduce the number of draw calls, or even get it down to a single draw call?

Answer

The first question is, does it matter?

If you're making fewer than 1000, maybe even 2000, draw calls, it probably doesn't matter. In that case, ease of use is more important than what most of the other solutions offer.
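
To check which side of that threshold you are on, one option is to count draw calls per frame. The following is a minimal sketch, not a WebGL feature: `countDrawCalls` is a hypothetical helper that wraps `drawArrays` and `drawElements` on whatever object it is given, and `fakeGl` is a stand-in so the snippet runs without a browser.

```javascript
// Hypothetical helper (not part of WebGL): wraps the draw functions of a
// context-like object so that every call increments a counter.
function countDrawCalls(gl) {
  const stats = { drawCalls: 0 };
  for (const name of ["drawArrays", "drawElements"]) {
    const original = gl[name];
    gl[name] = function (...args) {
      stats.drawCalls += 1;
      return original.apply(this, args);
    };
  }
  return stats;
}

// Stand-in object so the sketch runs outside a page; in a page you would
// pass the real WebGLRenderingContext instead.
const fakeGl = { drawArrays() {}, drawElements() {} };
const stats = countDrawCalls(fakeGl);
for (let i = 0; i < 3; ++i) {
  fakeGl.drawArrays(0, 0, 4);
}
console.log(stats.drawCalls); // 3
```

Resetting `stats.drawCalls` to 0 at the start of each frame gives a per-frame count.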

If you really need lots of quads then there are a bunch of solutions. One is to put N quads into a single buffer. See this presentation. Then put position, rotation, and scale either into other buffers or into a texture, and compute the matrices inside your shader.

In other words, for a textured quad people usually put vertex position and texcoords in buffers ordered like this

p0, p1, p2, p3, p4, p5,   // buffer for positions for 1 quad
t0, t1, t2, t3, t4, t5,   // buffer for texcoord for 1 quad

Instead you'd do this

p0, p1, p2, p3, p4, p5, p6, p7, p8, p9, p10, p11, ...  // positions for N quads
t0, t1, t2, t3, t4, t5, t6, t7, t8, t9, t10, t11, ...  // texcoords for N quads

p0 - p5 are just unit quad values, p6 - p11 are the same values, p12 - p17 are again the same values, and so on. Likewise, t0 - t5 are unit texcoord values, t6 - t11 are the same texcoord values, etc.

Then you add more buffers. Let's imagine all we want is world position and scale. So we add 2 more buffers

s0, s0, s0, s0, s0, s0, s1, s1, s1, s1, s1, s1, s2, ...  // scales for N quads
w0, w0, w0, w0, w0, w0, w1, w1, w1, w1, w1, w1, w2, ...  // world positions for N quads

Notice how the scale repeats 6 times, once for each vertex of the first quad. Then it repeats 6 more times for the next quad, and so on. The same goes for world position. That way all 6 vertices of a single quad share the same world position and the same scale.
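
The repetition can be generated rather than written by hand. This is a small sketch under the layout described above; `expandPerQuad` is an illustrative name, and it assumes 6 vertices per quad unless told otherwise.

```javascript
// Repeat each quad's value once per vertex so that all 6 vertices of a quad
// read the same data. `expandPerQuad` is an illustrative name.
function expandPerQuad(perQuad, numComponents, verticesPerQuad = 6) {
  const numQuads = perQuad.length / numComponents;
  const out = new Float32Array(numQuads * verticesPerQuad * numComponents);
  for (let q = 0; q < numQuads; ++q) {
    for (let v = 0; v < verticesPerQuad; ++v) {
      for (let c = 0; c < numComponents; ++c) {
        out[(q * verticesPerQuad + v) * numComponents + c] =
            perQuad[q * numComponents + c];
      }
    }
  }
  return out;
}

// two quads with 2-component scales: s0 = (1, 2), s1 = (3, 4)
const expandedScales = expandPerQuad([1, 2, 3, 4], 2);
console.log(expandedScales.length);                     // 24
console.log(Array.from(expandedScales.slice(0, 4)));    // [ 1, 2, 1, 2 ]
console.log(Array.from(expandedScales.slice(12, 16)));  // [ 3, 4, 3, 4 ]
```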

Now in the shader we can use those like this

attribute vec3 position;
attribute vec2 texcoord;
attribute vec3 worldPosition;
attribute vec3 scale;

uniform mat4 view;    // inverse of camera
uniform mat4 camera;  // inverse of view
uniform mat4 projection;

varying vec2 v_texcoord;

void main() {
   // Assuming we want billboards (quads that always face the camera)
   vec3 localPosition = (camera * vec4(position * scale, 0)).xyz;

   // make quad points at the worldPosition
   vec3 worldPos = worldPosition + localPosition;

   gl_Position = projection * view * vec4(worldPos, 1);

   v_texcoord = texcoord; // pass on texcoord to fragment shader
}

Now anytime we want to set the position of a quad, we need to set the 6 world positions (one for each of the 6 vertices) in the corresponding buffer.

Generally you can update all the world positions, then make 1 call to gl.bufferData to upload all of them.

Here's 100k quads

const vs = `
attribute vec3 position;
attribute vec2 texcoord;
attribute vec3 worldPosition;
attribute vec2 scale;

uniform mat4 view;    // inverse of camera
uniform mat4 camera;  // inverse of view
uniform mat4 projection;

varying vec2 v_texcoord;

void main() {
   // Assuming we want billboards (quads that always face the camera)
   vec3 localPosition = (camera * vec4(position * vec3(scale, 1), 0)).xyz;

   // make quad points at the worldPosition
   vec3 worldPos = worldPosition + localPosition;

   gl_Position = projection * view * vec4(worldPos, 1);

   v_texcoord = texcoord; // pass on texcoord to fragment shader
}
`;

const fs = `
precision mediump float;
varying vec2 v_texcoord;
uniform sampler2D texture;
void main() {
  gl_FragColor = texture2D(texture, v_texcoord);
}
`;

const m4 = twgl.m4;
const gl = document.querySelector("canvas").getContext("webgl");

// compiles and links shaders and looks up locations
const programInfo = twgl.createProgramInfo(gl, [vs, fs]);

const numQuads = 100000;
const positions = new Float32Array(numQuads * 6 * 2);
const texcoords = new Float32Array(numQuads * 6 * 2);
const worldPositions = new Float32Array(numQuads * 6 * 3);
const basePositions = new Float32Array(numQuads * 3); // for JS
const scales = new Float32Array(numQuads * 6 * 2);
const unitQuadPositions = [
   -.5, -.5, 
    .5, -.5,
   -.5,  .5,
   -.5,  .5,
    .5, -.5,
    .5,  .5,
];
const unitQuadTexcoords = [
    0, 0,
    1, 0,
    0, 1,
    0, 1,
    1, 0,
    1, 1,
];

for (var i = 0; i < numQuads; ++i) {
  const off3 = i * 6 * 3;
  const off2 = i * 6 * 2;
  
  positions.set(unitQuadPositions, off2);
  texcoords.set(unitQuadTexcoords, off2);
  const worldPos = [rand(-100, 100), rand(-100, 100), rand(-100, 100)];
  const scale = [rand(1, 2), rand(1, 2)];
  basePositions.set(worldPos, i * 3);
  for (var j = 0; j < 6; ++j) {
    worldPositions.set(worldPos, off3 + j * 3);
    scales.set(scale, off2 + j * 2);
  }
}

const tex = twgl.createTexture(gl, {
  src: "http://i.imgur.com/weklTat.gif",
  crossOrigin: "",
  flipY: true,
});

// calls gl.createBuffer, gl.bufferData
const bufferInfo = twgl.createBufferInfoFromArrays(gl, {
  position: { numComponents: 2, data: positions, },
  texcoord: { numComponents: 2, data: texcoords, },
  worldPosition: { numComponents: 3, data: worldPositions, },
  scale: { numComponents: 2, data: scales, },
});

function render(time) {
   time *= 0.001; // seconds
   
   twgl.resizeCanvasToDisplaySize(gl.canvas);
   
   gl.viewport(0, 0, gl.canvas.width, gl.canvas.height);
   gl.enable(gl.DEPTH_TEST);
   
   gl.useProgram(programInfo.program);
   
   // calls gl.bindBuffer, gl.enableVertexAttribArray, gl.vertexAttribPointer
   twgl.setBuffersAndAttributes(gl, programInfo, bufferInfo);
   
   const fov = Math.PI * .25;
   const aspect = gl.canvas.clientWidth / gl.canvas.clientHeight;
   const zNear = .1;
   const zFar = 200;
   const projection = m4.perspective(fov, aspect, zNear, zFar);
   
   const radius = 100;
   const tm = time * .1;
   const eye = [Math.sin(tm) * radius, Math.sin(tm * .9) * radius, Math.cos(tm) * radius];
   const target = [0, 0, 0];
   const up = [0, 1, 0];
   const camera = m4.lookAt(eye, target, up);
   const view = m4.inverse(camera);
   
   // calls gl.uniformXXX
   twgl.setUniforms(programInfo, { 
     texture: tex,
     view: view,
     camera: camera,
     projection: projection,
   });
   
   // update all the worldPositions
   for (var i = 0; i < numQuads; ++i) {
     const src = i * 3;
     const dst = i * 6 * 3;
     for (var j = 0; j < 6; ++j) {
       const off = dst + j * 3;
       worldPositions[off + 0] = basePositions[src + 0] + Math.sin(time + i) * 10;
       worldPositions[off + 1] = basePositions[src + 1] + Math.cos(time + i) * 10;
       worldPositions[off + 2] = basePositions[src + 2];
     }
   }
   
   // upload them to the GPU
   gl.bindBuffer(gl.ARRAY_BUFFER, bufferInfo.attribs.worldPosition.buffer);
   gl.bufferData(gl.ARRAY_BUFFER, worldPositions, gl.DYNAMIC_DRAW);
   
   // calls gl.drawXXX
   twgl.drawBufferInfo(gl, bufferInfo);
   
   requestAnimationFrame(render);
}
requestAnimationFrame(render);

function rand(min, max) {
  if (max === undefined) {
     max = min;
     min = 0;
  }
  return Math.random() * (max - min) + min;
}

body { margin: 0; }
canvas { width: 100vw; height: 100vh; display: block; }

<script src="https://twgljs.org/dist/3.x/twgl-full.min.js"></script>
<canvas></canvas>

You can reduce the number of repeated vertices from 6 to 1 by using the ANGLE_instanced_arrays extension. It's not quite as fast as the technique above but it's pretty close.
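
A rough sketch of what the instanced setup could look like, showing only the position and worldPosition attributes for brevity. `drawQuadsInstanced`, `attribs` (attribute locations), and `buffers` are hypothetical names; `vertexAttribDivisorANGLE` and `drawArraysInstancedANGLE` are the functions the extension provides.

```javascript
// Sketch: make per-quad attributes advance once per instance instead of
// once per vertex, then draw every quad with a single instanced call.
// `attribs` (attribute locations) and `buffers` are hypothetical names.
function drawQuadsInstanced(gl, ext, attribs, buffers, numQuads) {
  // per-vertex data: one unit quad (6 vertices), shared by every instance
  gl.bindBuffer(gl.ARRAY_BUFFER, buffers.position);
  gl.enableVertexAttribArray(attribs.position);
  gl.vertexAttribPointer(attribs.position, 2, gl.FLOAT, false, 0, 0);
  ext.vertexAttribDivisorANGLE(attribs.position, 0);

  // per-instance data: one world position per quad instead of six
  gl.bindBuffer(gl.ARRAY_BUFFER, buffers.worldPosition);
  gl.enableVertexAttribArray(attribs.worldPosition);
  gl.vertexAttribPointer(attribs.worldPosition, 3, gl.FLOAT, false, 0, 0);
  ext.vertexAttribDivisorANGLE(attribs.worldPosition, 1);

  // 6 vertices per instance, numQuads instances, one draw call
  ext.drawArraysInstancedANGLE(gl.TRIANGLES, 0, 6, numQuads);
}

// Assumed usage in a page:
//   const ext = gl.getExtension("ANGLE_instanced_arrays");
//   if (ext) drawQuadsInstanced(gl, ext, attribs, buffers, numQuads);
```

With this layout the worldPosition buffer holds one entry per quad, so the JavaScript update loop writes 3 floats per quad rather than 18.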

You can also reduce the amount of data from 6 to 1 by storing the world positions and scale in a texture. In that case instead of the 2 extra buffers you add one extra buffer with just a repeated id

// id buffer
0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3 ....

The id repeats 6 times, once for each of the 6 vertices of each quad.

You then use that id to compute a texture coordinate to lookup world position and scale.

attribute float id;
...

uniform sampler2D worldPositionTexture;  // texture with world positions
uniform vec2 textureSize;               // pass in the texture size

...

  // compute the texel that contains our world position
  vec2 texel = vec2(
     mod(id, textureSize.x),
     floor(id / textureSize.x));

  // compute the UV coordinate to access that texel
  vec2 uv = (texel + .5) / textureSize;

  vec3 worldPosition = texture2D(worldPositionTexture, uv).xyz;
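
To sanity-check the lookup, the same arithmetic can be mirrored in JavaScript. This is only an illustration of the math above; `texelForId` is a made-up name and the 1024 x 49 texture size is an example value.

```javascript
// JavaScript mirror of the shader math above, for checking values by hand.
function texelForId(id, width, height) {
  const x = id % width;              // mod(id, textureSize.x)
  const y = Math.floor(id / width);  // floor(id / textureSize.x)
  // adding 0.5 targets the center of the texel
  return { x, y, u: (x + 0.5) / width, v: (y + 0.5) / height };
}

const t = texelForId(1500, 1024, 49);
console.log(t.x, t.y); // 476 1
```

So quad 1500 reads the texel in column 476 of row 1, sampled at its center.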

Now you need to put your world positions in a texture. You probably want a floating point texture to make it easy. You can do similar things for scale etc., and either store each in a separate texture or store them all in the same texture, changing your uv calculation appropriately.
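
As a sketch of the packing step, assuming one RGBA float texel per quad with the alpha channel unused: `packVec3IntoTexels` is an illustrative name, and the data is padded so it fills whole rows of the texture.

```javascript
// Pack one vec3 per quad into RGBA float texel data, padded to full rows.
// `packVec3IntoTexels` is an illustrative name; the default width is an
// assumption, not a requirement.
function packVec3IntoTexels(vec3s, width = 1024) {
  const numTexels = vec3s.length / 3;
  const height = Math.max(1, Math.ceil(numTexels / width));
  const data = new Float32Array(width * height * 4); // 4 floats per texel
  for (let i = 0; i < numTexels; ++i) {
    data[i * 4 + 0] = vec3s[i * 3 + 0];
    data[i * 4 + 1] = vec3s[i * 3 + 1];
    data[i * 4 + 2] = vec3s[i * 3 + 2];
    // data[i * 4 + 3] (alpha) stays 0 and is not read by the shader
  }
  return { data, width, height };
}

const packed = packVec3IntoTexels([1, 2, 3, 4, 5, 6], 4);
console.log(packed.height);                        // 1
console.log(Array.from(packed.data.slice(0, 8)));  // [ 1, 2, 3, 0, 4, 5, 6, 0 ]
```

The resulting `data`, `width`, and `height` are what you would hand to gl.texImage2D with gl.RGBA and gl.FLOAT, which in WebGL1 requires the OES_texture_float extension.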

const vs = `
attribute vec3 position;
attribute vec2 texcoord;
attribute float id;

uniform sampler2D worldPositionTexture;  
uniform sampler2D scaleTexture;          
uniform vec2 textureSize;  // textures are the same size so only one size is needed
uniform mat4 view;    // inverse of camera
uniform mat4 camera;  // inverse of view
uniform mat4 projection;

varying vec2 v_texcoord;

void main() {
  // compute the texel that contains our world position
  vec2 texel = vec2(
     mod(id, textureSize.x),
     floor(id / textureSize.x));

  // compute the UV coordinate to access that texel
  vec2 uv = (texel + .5) / textureSize;

  vec3 worldPosition = texture2D(worldPositionTexture, uv).xyz;
  vec2 scale = texture2D(scaleTexture, uv).xy;

  // Assuming we want billboards (quads that always face the camera)
  vec3 localPosition = (camera * vec4(position * vec3(scale, 1), 0)).xyz;

  // make quad points at the worldPosition
  vec3 worldPos = worldPosition + localPosition;

  gl_Position = projection * view * vec4(worldPos, 1);

  v_texcoord = texcoord; // pass on texcoord to fragment shader
}
`;

const fs = `
precision mediump float;
varying vec2 v_texcoord;
uniform sampler2D texture;
void main() {
  gl_FragColor = texture2D(texture, v_texcoord);
}
`;

const m4 = twgl.m4;
const gl = document.querySelector("canvas").getContext("webgl");
const ext = gl.getExtension("OES_texture_float");
if (!ext) {
  alert("Doh! requires OES_texture_float extension");
}
if (gl.getParameter(gl.MAX_VERTEX_TEXTURE_IMAGE_UNITS) < 2) {
  alert("Doh! need at least 2 vertex texture image units");
}

// compiles and links shaders and looks up locations
const programInfo = twgl.createProgramInfo(gl, [vs, fs]);

const numQuads = 50000;
const positions = new Float32Array(numQuads * 6 * 2);
const texcoords = new Float32Array(numQuads * 6 * 2);
const ids = new Float32Array(numQuads * 6);
const basePositions = new Float32Array(numQuads * 3); // for JS
// we need to pad these because textures have to be rectangles
const size = roundUpToNearest(numQuads * 4, 1024 * 4);
const worldPositions = new Float32Array(size);
const scales = new Float32Array(size);
const unitQuadPositions = [
   -.5, -.5, 
    .5, -.5,
   -.5,  .5,
   -.5,  .5,
    .5, -.5,
    .5,  .5,
];
const unitQuadTexcoords = [
    0, 0,
    1, 0,
    0, 1,
    0, 1,
    1, 0,
    1, 1,
];

for (var i = 0; i < numQuads; ++i) {
  const off2 = i * 6 * 2;
  const off4 = i * 4;
  
  // you could even put these in a texture OR you can even generate
  // them inside the shader based on the id. See vertexshaderart.com for
  // examples of generating positions in the shader based on id
  positions.set(unitQuadPositions, off2);
  texcoords.set(unitQuadTexcoords, off2);
  ids.set([i, i, i, i, i, i], i * 6);

  const worldPos = [rand(-100, 100), rand(-100, 100), rand(-100, 100)];
  const scale = [rand(1, 2), rand(1, 2)];
  basePositions.set(worldPos, i * 3);
    
  // one RGBA texel (4 floats) per quad; the shader looks it up by id,
  // so there is no need to repeat the data per vertex
  worldPositions.set(worldPos, off4);
  scales.set(scale, off4);
}

const tex = twgl.createTexture(gl, {
  src: "http://i.imgur.com/weklTat.gif",
  crossOrigin: "",
  flipY: true,
});

const worldPositionTex = twgl.createTexture(gl, {
  type: gl.FLOAT,
  src: worldPositions,
  width: 1024,
  minMag: gl.NEAREST,
  wrap: gl.CLAMP_TO_EDGE,
});

const scaleTex = twgl.createTexture(gl, {
  type: gl.FLOAT,
  src: scales,
  width: 1024,
  minMag: gl.NEAREST,
  wrap: gl.CLAMP_TO_EDGE,
});

// calls gl.createBuffer, gl.bufferData
const bufferInfo = twgl.createBufferInfoFromArrays(gl, {
  position: { numComponents: 2, data: positions, },
  texcoord: { numComponents: 2, data: texcoords, },
  id: { numComponents: 1, data: ids, },
});

function render(time) {
   time *= 0.001; // seconds
   
   twgl.resizeCanvasToDisplaySize(gl.canvas);
   
   gl.viewport(0, 0, gl.canvas.width, gl.canvas.height);
   gl.enable(gl.DEPTH_TEST);
   
   gl.useProgram(programInfo.program);
   
   // calls gl.bindBuffer, gl.enableVertexAttribArray, gl.vertexAttribPointer
   twgl.setBuffersAndAttributes(gl, programInfo, bufferInfo);
   
   const fov = Math.PI * .25;
   const aspect = gl.canvas.clientWidth / gl.canvas.clientHeight;
   const zNear = .1;
   const zFar = 200;
   const projection = m4.perspective(fov, aspect, zNear, zFar);
   
   const radius = 100;
   const tm = time * .1;
   const eye = [Math.sin(tm) * radius, Math.sin(tm * .9) * radius, Math.cos(tm) * radius];
   const target = [0, 0, 0];
   const up = [0, 1, 0];
   const camera = m4.lookAt(eye, target, up);
   const view = m4.inverse(camera);
   
   // update all the worldPositions
   for (var i = 0; i < numQuads; ++i) {
     const src = i * 3;
     const dst = i * 4;  // 4 floats (RGBA) per texel, one texel per quad
     worldPositions[dst + 0] = basePositions[src + 0] + Math.sin(time + i) * 10;
     worldPositions[dst + 1] = basePositions[src + 1] + Math.cos(time + i) * 10;
     worldPositions[dst + 2] = basePositions[src + 2];
   }
   
   // upload them to the GPU
   const width = 1024;
   const height = worldPositions.length / width / 4;
   gl.bindTexture(gl.TEXTURE_2D, worldPositionTex);
   gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, width, height, 0, gl.RGBA, gl.FLOAT, worldPositions); 
   
   // calls gl.uniformXXX, gl.activeTexture, gl.bindTexture
   twgl.setUniforms(programInfo, { 
     texture: tex,
     scaleTexture: scaleTex,
     worldPositionTexture: worldPositionTex,
     textureSize: [width, height],
     view: view,
     camera: camera,
     projection: projection,
   });
   
   // calls gl.drawXXX
   twgl.drawBufferInfo(gl, bufferInfo);
   
   requestAnimationFrame(render);
}
requestAnimationFrame(render);

function rand(min, max) {
  if (max === undefined) {
     max = min;
     min = 0;
  }
  return Math.random() * (max - min) + min;
}

function roundUpToNearest(v, round) {
  return ((v + round - 1) / round | 0) * round;
}

body { margin: 0; }
canvas { width: 100vw; height: 100vh; display: block; }

<script src="https://twgljs.org/dist/3.x/twgl-full.min.js"></script>
<canvas></canvas>

Note that, at least on my machine, doing it through a texture is slower than doing it through buffers. So while it's less work for JavaScript (only one worldPosition to update per quad), it's apparently more work for the GPU. The buffer version runs at 60fps for me with 100k quads, whereas the texture version ran at about 40fps with 100k quads, so I lowered it to 50k. Of course those numbers are for my machine; other machines will vary.

Techniques like this will allow you to have way more quads, but that comes at the expense of flexibility. You can only manipulate them in the ways you provided for in your shader. For example, if you want to be able to scale from different origins (center, top-left, bottom-right, etc.) you'd need to add yet another piece of data or set the positions. If you wanted to rotate you'd need to add rotation data, and so on.

You could even pass in whole matrices per quad, but then you'd be uploading 16 floats per quad. It still might be faster, though, since you're already uploading that much when calling gl.uniformMatrix4fv for each quad, and here you'd be making just 2 calls: gl.bufferData or gl.texImage2D to upload the new matrices, and then gl.drawXXX to draw.

Yet another issue is that you mentioned textures. If you're using a different texture per quad, then you need to figure out how to combine them into a texture atlas (all the images in one texture), in which case your UV coordinates would not repeat as they do above.
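
For illustration, here is one way the per-quad texcoords could be computed for an atlas. It assumes the simplest possible layout, a grid of equally sized cells, which a real atlas may not use; `atlasTexcoords` is a made-up name.

```javascript
// Texcoords for the 6 vertices of a quad that shows one cell of a grid atlas.
// Assumes equally sized cells laid out in `cols` x `rows`, with cell 0 at
// u = 0, v = 0. `atlasTexcoords` is an illustrative name.
function atlasTexcoords(cell, cols, rows) {
  const u0 = (cell % cols) / cols;
  const v0 = Math.floor(cell / cols) / rows;
  const u1 = u0 + 1 / cols;
  const v1 = v0 + 1 / rows;
  // same vertex order as unitQuadTexcoords in the examples above
  return [
    u0, v0,
    u1, v0,
    u0, v1,
    u0, v1,
    u1, v0,
    u1, v1,
  ];
}

// cell 5 of a 4 x 4 atlas covers u and v from 0.25 to 0.5
console.log(atlasTexcoords(5, 4, 4));
```

Each quad then gets its own 12 texcoord values in the texcoord buffer instead of a copy of the unit values.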
