How does WebGL work?

Question

I'm looking for a deep understanding of how WebGL works. I want to gain knowledge at a level that most people care less about, because the knowledge isn't necessarily useful to the average WebGL programmer. For instance, what role does each part (browser, graphics driver, etc.) of the total rendering system play in getting an image on the screen? Does each browser have to create a JavaScript/HTML engine/environment in order to run WebGL in the browser? Why is Chrome ahead of everyone else in terms of being WebGL compatible?

So, what are some good resources to get started? The Khronos specification is kind of lacking (from what I saw browsing it for a few minutes) for what I'm after. I mostly want to know how this is accomplished/implemented in browsers, and what else needs to change on your system to make it possible.

Answer

Hopefully this little write-up is helpful to you. It overviews a big chunk of what I've learned about WebGL and 3D in general. BTW, if I've gotten anything wrong, somebody please correct me -- because I'm still learning, too!

The browser is just that, a Web browser. All it does is expose the WebGL API (via JavaScript), which the programmer does everything else with.

As near as I can tell, the WebGL API is essentially just a set of (browser-supplied) JavaScript functions which wrap around the OpenGL ES specification. So if you know OpenGL ES, you can adopt WebGL pretty quickly. Don't confuse this with pure OpenGL, though. The "ES" is important.

The WebGL spec was intentionally left very low-level, leaving a lot to be re-implemented from one application to the next. It is up to the community to write frameworks for automation, and up to the developer to choose which framework to use (if any). It's not entirely difficult to roll your own, but it does mean a lot of overhead spent on reinventing the wheel. (FWIW, I've been working on my own WebGL framework called Jax for a while now.)

The graphics driver supplies the implementation of OpenGL ES that actually runs your code. At this point, it's running on the machine hardware, below even the C code. While this is what makes WebGL possible in the first place, it's also a double-edged sword, because bugs in the OpenGL ES driver (and I've noted quite a number of them already) will show up in your Web application, and you won't necessarily know it unless you can count on your user base to file coherent bug reports including OS, video hardware, and driver versions. Here's what the debug process for such issues ends up looking like.

On Windows, there's an extra layer which exists between the WebGL API and the hardware: ANGLE, or "Almost Native Graphics Layer Engine". Because the OpenGL ES drivers on Windows generally suck, ANGLE receives those calls and translates them into DirectX 9 calls instead.

Now that you know how the pieces fit together, let's look at a lower-level explanation of how everything comes together to produce a 3D image.

First, the JavaScript code gets a 3D context from an HTML5 canvas element. Then it registers a set of shaders, which are written in GLSL ([Open] GL Shading Language) and essentially resemble C code.
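
As a rough sketch, that setup typically looks something like the following. (The canvas ID and the shader-source variables here are just placeholders, not part of the original answer; older browsers also exposed the context under the name "experimental-webgl".)

// Grab the canvas and ask the browser for a WebGL context.
const canvas = document.getElementById("myCanvas"); // hypothetical element ID
const gl = canvas.getContext("webgl");

// Compile one shader stage (vertex or fragment) from GLSL source text.
function compileShader(gl, type, source) {
  const shader = gl.createShader(type);
  gl.shaderSource(shader, source);
  gl.compileShader(shader);
  if (!gl.getShaderParameter(shader, gl.COMPILE_STATUS)) {
    throw new Error(gl.getShaderInfoLog(shader));
  }
  return shader;
}

// Link a vertex shader and a fragment shader into a single program.
function createProgram(gl, vertexSource, fragmentSource) {
  const program = gl.createProgram();
  gl.attachShader(program, compileShader(gl, gl.VERTEX_SHADER, vertexSource));
  gl.attachShader(program, compileShader(gl, gl.FRAGMENT_SHADER, fragmentSource));
  gl.linkProgram(program);
  if (!gl.getProgramParameter(program, gl.LINK_STATUS)) {
    throw new Error(gl.getProgramInfoLog(program));
  }
  return program;
}

// vertexSource / fragmentSource are GLSL strings; example sketches appear later in this answer.
const program = createProgram(gl, vertexSource, fragmentSource);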

The rest of the process is very modular. You need to get vertex data and any other information you intend to use (such as vertex colors, texture coordinates, and so forth) down to the graphics pipeline using uniforms and attributes which are defined in the shader, but the exact layout and naming of this information is very much up to the developer.
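
For instance, uploading a triangle's vertex positions into an attribute and a matrix into a uniform might look roughly like this. The names a_position and u_modelViewProjection are just conventions made up for this sketch, and the program and mvpMatrix variables are assumed to exist already.

// Upload vertex positions to the GPU in a buffer object.
const positions = new Float32Array([
   0.0,  0.5, 0.0,
  -0.5, -0.5, 0.0,
   0.5, -0.5, 0.0,
]);
const buffer = gl.createBuffer();
gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
gl.bufferData(gl.ARRAY_BUFFER, positions, gl.STATIC_DRAW);

// Point the shader's "a_position" attribute at that buffer (3 floats per vertex).
gl.useProgram(program);
const positionLocation = gl.getAttribLocation(program, "a_position");
gl.enableVertexAttribArray(positionLocation);
gl.vertexAttribPointer(positionLocation, 3, gl.FLOAT, false, 0, 0);

// Hand a 4x4 matrix to the shader through a uniform (mvpMatrix is a Float32Array of 16 values).
const mvpLocation = gl.getUniformLocation(program, "u_modelViewProjection");
gl.uniformMatrix4fv(mvpLocation, false, mvpMatrix);

// Finally, ask WebGL to draw the three vertices as one triangle.
gl.drawArrays(gl.TRIANGLES, 0, 3);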

JavaScript sets up the initial data structures and sends them to the WebGL API, which sends them to either ANGLE or OpenGL ES, which ultimately sends them off to the graphics hardware.

Once the information is available to the shader, the shader must transform it in two phases to produce 3D objects. The first phase is the vertex shader, which sets up the mesh coordinates. (This stage runs entirely on the video card, below all of the APIs discussed above.) Most often, the process performed in the vertex shader looks something like this:

gl_Position = PROJECTION_MATRIX * VIEW_MATRIX * MODEL_MATRIX * VERTEX_POSITION

where VERTEX_POSITION is a 4D vector (x, y, z, and w, which is usually set to 1); VIEW_MATRIX is a 4x4 matrix representing the camera's view into the world; MODEL_MATRIX is a 4x4 matrix which transforms object-space coordinates (that is, coordinates local to the object before any rotation or translation has been applied) into world-space coordinates; and PROJECTION_MATRIX is a 4x4 matrix which represents the camera's lens.
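
Written out as a minimal GLSL vertex shader, with the matrices passed in as uniforms and the position as an attribute, that might read as follows. It's kept as a JavaScript string so it can be handed to gl.shaderSource(), and the u_*/a_* names are just illustrative, not anything the original answer prescribes.

// Minimal GLSL vertex shader implementing the transform above.
const vertexSource = `
  attribute vec4 a_position;   // VERTEX_POSITION (w fills in as 1.0 if you only supply x, y, z)
  uniform mat4 u_model;        // MODEL_MATRIX
  uniform mat4 u_view;         // VIEW_MATRIX
  uniform mat4 u_projection;   // PROJECTION_MATRIX

  void main() {
    gl_Position = u_projection * u_view * u_model * a_position;
  }
`;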

Most often, the VIEW_MATRIX and MODEL_MATRIX are precomputed and called MODELVIEW_MATRIX. Occasionally, all 3 are precomputed into MODELVIEW_PROJECTION_MATRIX, or just MVP. These are generally meant as optimizations, though I'd like to find time to do some benchmarks. It's possible that precomputing is actually slower in JavaScript if it's done every frame, because JavaScript itself isn't all that fast. In that case, the hardware acceleration afforded by doing the math on the GPU might well be faster than doing it on the CPU in JavaScript. We can of course hope that future JS implementations will resolve this potential gotcha by simply being faster.
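
As a sketch of what that CPU-side precomputation can look like, using a matrix library such as glMatrix (the mat4 calls below come from that library, which the original answer does not mandate; viewMatrix, modelMatrix, and projectionMatrix are assumed to exist):

import { mat4 } from "gl-matrix";

// Precompute MODELVIEW = VIEW * MODEL, then MVP = PROJECTION * MODELVIEW,
// once per frame on the CPU instead of once per vertex on the GPU.
const modelView = mat4.create();
mat4.multiply(modelView, viewMatrix, modelMatrix);

const mvpMatrix = mat4.create();
mat4.multiply(mvpMatrix, projectionMatrix, modelView);

// The shader then only needs a single matrix uniform, e.g. u_modelViewProjection.
gl.uniformMatrix4fv(mvpLocation, false, mvpMatrix);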

Clip Coordinates

When all of these have been applied, the gl_Position variable will have a set of XYZ coordinates and a W component; after the perspective divide described below, the XYZ values end up in the range [-1, 1]. These are called clip coordinates.

It's worth noting that clip coordinates are the only thing the vertex shader really needs to produce. You can completely skip the matrix transformations performed above, as long as you produce a clip-coordinate result. (I have even experimented with swapping out matrices for quaternions; it worked just fine, but I scrapped the project because I didn't get the performance improvements I'd hoped for.)

After you supply clip coordinates to gl_Position, WebGL divides the result by gl_Position.w, producing what are called normalized device coordinates. From there, projecting a pixel onto the screen is a simple matter of multiplying by 1/2 the screen dimensions and then adding 1/2 the screen dimensions.[1] Here are some examples of clip coordinates translated into 2D coordinates on an 800x600 display:

clip = [0, 0]
x = (0 * 800/2) + 800/2 = 400
y = (0 * 600/2) + 600/2 = 300

clip = [0.5, 0.5]
x = (0.5 * 800/2) + 800/2 = 200 + 400 = 600
y = (0.5 * 600/2) + 600/2 = 150 + 300 = 450

clip = [-0.5, -0.25]
x = (-0.5  * 800/2) + 800/2 = -200 + 400 = 200
y = (-0.25 * 600/2) + 600/2 = -75 + 300 = 225
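
A tiny helper expressing that same mapping (ignoring the gl.viewport offsets from footnote [1], and the usual Y-flip between GL and canvas coordinates) might look like:

// Map normalized device coordinates (each in [-1, 1]) onto a width x height screen.
function ndcToScreen(ndcX, ndcY, width, height) {
  return {
    x: ndcX * (width / 2) + width / 2,
    y: ndcY * (height / 2) + height / 2,
  };
}

// Example: ndcToScreen(-0.5, -0.25, 800, 600) returns { x: 200, y: 225 }.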

Pixel Shaders

Once it's been determined where a pixel should be drawn, the pixel is handed off to the pixel shader, which chooses the actual color the pixel will be. This can be done in a myriad of ways, ranging from simply hard-coding a specific color to texture lookups to more advanced normal and parallax mapping (which are essentially ways of "cheating" texture lookups to produce different effects).
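
A minimal GLSL fragment ("pixel") shader taking the hard-coded-color route might read like this; a texture lookup would instead sample a sampler2D uniform with texture2D(). Again, the string form is just so it can be fed to gl.shaderSource(), and is an illustrative sketch rather than anything from the original answer.

// Minimal GLSL fragment shader: every pixel gets the same flat color.
const fragmentSource = `
  precision mediump float;

  void main() {
    gl_FragColor = vec4(1.0, 0.5, 0.2, 1.0); // opaque orange
  }
`;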

Now, so far we've ignored the Z component of the clip coordinates. Here's how that works out. When we multiplied by the projection matrix, the third clip component resulted in some number. If that number is greater than 1.0 or less than -1.0, then the number is beyond the view range of the projection matrix, corresponding to the matrix zFar and zNear values, respectively.

So if it's not in the range [-1, 1], then it's clipped entirely. If it is in that range, then the Z value is scaled to 0 to 1[2] and is compared to the depth buffer[3]. The depth buffer matches the screen dimensions, so if an 800x600 viewport is used, the depth buffer is 800 pixels wide and 600 pixels high. We already have the pixel's X and Y coordinates, so they are plugged into the depth buffer to get the currently stored Z value. If the stored Z value is greater than the new Z value, then the new Z value is closer than whatever was previously drawn, and replaces it[4]. At this point it's safe to light up the pixel in question (or in the case of WebGL, draw the pixel to the canvas) and store the new Z value as the depth value.

If the Z value is greater than the stored depth value, then it is deemed to be "behind" whatever has already been drawn, and the pixel is discarded.
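
In WebGL, that depth comparison is configured on the context; a typical setup (tying into the footnotes below) looks something like this sketch:

// Turn on depth testing and choose how new Z values are compared to stored ones.
gl.enable(gl.DEPTH_TEST);   // see [3]
gl.depthFunc(gl.LESS);      // see [4]; LESS (closer wins) is the default comparison
gl.depthRange(0.0, 1.0);    // see [2]; 0 to 1 is the default range

// Clear both the color buffer and the depth buffer at the start of each frame.
gl.clear(gl.COLOR_BUFFER_BIT | gl.DEPTH_BUFFER_BIT);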

[1] The actual conversion uses the gl.viewport settings to map normalized device coordinates to pixels.

[2] It's actually scaled to the gl.depthRange settings, which default to 0 to 1.

[3] This assumes you have a depth buffer and that you've turned depth testing on with gl.enable(gl.DEPTH_TEST).

[4] You can control how Z values are compared using gl.depthFunc.
