WebGPU Cloth Simulation 04 - WebGPU Fundamentals & WebGPU Shading Language (WGSL)
Parallel Processing in the Web Environment
1. Case Study: Z-Emotion Virtual Try-On
https://medium.com/z-emotion/refining-fashion-ecommerce-marketing-with-interactive-3d-6ebdc74ff367
Technology used:
- OpenGL - https://www.opengl.org/
- Three.js - for web-based 3D rendering
- GLTF - cloth export format
- AWS - for web-hosting
2. WebGPU
The WebGPU API enables web developers to use the underlying system's GPU (Graphics Processing Unit) to carry out high-performance computations and draw complex images that can be rendered in the browser.
WebGPU is the successor to WebGL, providing better compatibility with modern GPUs, support for general-purpose GPU computations, faster operations, and access to more advanced GPU features.
How does WebGPU work?

- WebGPU sees physical GPU hardware as GPUAdapters.
- It provides a connection to an adapter via a GPUDevice, which manages resources, and the device's GPUQueues, which execute commands.
- GPUDevice may have its own memory with high-speed access to the processing units.
- GPUBuffer and GPUTexture are the physical resources backed by GPU memory.
- GPUCommandBuffer and GPURenderBundle are containers for user-recorded commands.
- GPUShaderModule contains shader code.
- The other resources, such as GPUSampler or GPUBindGroup, configure the way physical resources are used by the GPU.
- GPUs execute commands encoded in GPUCommandBuffers by feeding data through a pipeline, which is a mix of fixed-function and programmable stages.
- Programmable stages execute shaders, which are special programs designed to run on GPU hardware.
- Most of the state of a pipeline is defined by a GPURenderPipeline or a GPUComputePipeline object.
- The state not included in these pipeline objects is set during encoding with commands, such as beginRenderPass() or setBlendConstant().
References:
W3C Working Draft: https://www.w3.org/TR/webgpu/
API Documentation: https://developer.mozilla.org/en-US/docs/Web/API/WebGPU_API
WebGPU Samples: https://webgpu.github.io/webgpu-samples/
3. WebGPU Fundamentals
1. High-level Overview
At a certain level, WebGPU is a very simple system. All it does is run 3 types of functions on the GPU: Vertex Shaders, Fragment Shaders, and Compute Shaders.
- A Vertex Shader computes vertices. The shader returns vertex positions. For every group of 3 vertices the vertex shader function returns, a triangle is drawn between those 3 positions.
- A Fragment Shader computes colors. When a triangle is drawn, for each pixel to be drawn the GPU calls your fragment shader. The fragment shader then returns a color.
- A Compute Shader is more generic. It’s effectively just a function you call and say “execute this function N times”. The GPU passes the iteration number each time it calls your function so you can use that number to do something unique on each iteration.
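As an illustrative sketch, the three shader kinds are just WGSL functions marked with an entry-point attribute (the function and variable names here are made up):

```wgsl
// Vertex shader: returns a clip-space position for each vertex index.
@vertex
fn vs(@builtin(vertex_index) i : u32) -> @builtin(position) vec4f {
  var pos = array<vec2f, 3>(vec2f(0.0, 0.5), vec2f(-0.5, -0.5), vec2f(0.5, -0.5));
  return vec4f(pos[i], 0.0, 1.0);
}

// Fragment shader: returns a color for each pixel covered by the triangle.
@fragment
fn fs() -> @location(0) vec4f {
  return vec4f(1.0, 0.0, 0.0, 1.0); // opaque red
}

// Compute shader: the builtin id tells each invocation which iteration it is.
@compute @workgroup_size(1)
fn cs(@builtin(global_invocation_id) id : vec3u) {
  // use id.x to do something unique on each iteration
}
```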

- There is a Pipeline. It contains the vertex shader and fragment shader the GPU will run. You could also have a pipeline with a compute shader.
- The shaders reference resources (buffers, textures, samplers) indirectly through Bind Groups.
- The pipeline defines attributes that reference buffers indirectly through the pipeline's internal state.
- Attributes pull data out of buffers and feed the data into the vertex shader.
- The vertex shader may feed data into the fragment shader.
- The fragment shader writes to textures indirectly through the render pass description.
References:
WebGPU Fundamentals: https://webgpufundamentals.org/webgpu/lessons/webgpu-fundamentals.html
2. WebGPU Basics
a. Uniforms
Uniforms in WebGPU are small amounts of data that remain constant across the entire draw or dispatch call. They are typically used to pass settings or parameters that are the same for every vertex or fragment processed, such as transformation matrices, lighting information, or global settings.
Uniforms can be set up using uniform buffers, which are a type of buffer that holds data which does not change often and is accessible in all shader stages.
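For example, a uniform struct might be declared and read in WGSL like this (a sketch; the struct and field names are illustrative):

```wgsl
struct Settings {
  color : vec4f,
  scale : vec2f,
  offset : vec2f,
}

// Visible to every invocation; read-only from the shader's point of view.
@group(0) @binding(0) var<uniform> settings : Settings;

@fragment
fn fs() -> @location(0) vec4f {
  return settings.color;
}
```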
b. Attributes (vertex shaders only)
Attributes provide per-iteration data to the vertex shader. Instead of the shader mapping over its input, we tell WebGPU about these inputs and how to get data out of them.
- WebGPU uses the offset and stride from the pipeline's attribute definitions to compute indices into the corresponding source buffer and pull out values.
- The pulled-out values are then sent to the shader. On each iteration the attribute values will be different.
c. Raw Buffers
Buffers are effectively arrays. Instead of the system pulling the values out of the buffers for us, we calculate our own indices into the bound buffers.
This is more flexible than attributes since we have random access to the arrays, but it's potentially slower for the same reason: in-order access is usually cache-friendly, whereas when we calculate our own indices the GPU has no idea which part of a buffer we're going to access until we actually try to access it.
d. Inter-Stage Variables (vertex shader to fragment shader)
Inter-Stage Variables are outputs from a vertex shader to a fragment shader. A vertex shader outputs positions that are used to draw/rasterize points, lines, and triangles.
e. Storage Buffer
Storage buffers are more flexible compared to uniform buffers. They are used to store and retrieve large amounts of data that can be read and written by shaders. Unlike uniform buffers, storage buffers allow for read and write operations in shaders, making them suitable for data that changes frequently or data that needs to be shared between different GPU tasks.
These buffers are essential for more complex computations such as physics simulations or particle systems where shaders need to update the data stored in the buffers.
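A sketch of how a particle update might use a read/write storage buffer in WGSL (the struct and names are illustrative, not from the source):

```wgsl
struct Particle {
  position : vec2f,
  velocity : vec2f,
}

// read_write access: the compute shader updates the particles in place.
@group(0) @binding(0) var<storage, read_write> particles : array<Particle>;

@compute @workgroup_size(64)
fn simulate(@builtin(global_invocation_id) id : vec3u) {
  if (id.x < arrayLength(&particles)) {
    particles[id.x].position += particles[id.x].velocity;
  }
}
```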
f. Vertex Buffer
Vertex buffers are used specifically to store vertex data. This includes positions, normals, texture coordinates, colors, and other per-vertex data needed for rendering. Each vertex buffer is bound to the GPU pipeline and accessed by the vertex shader to render geometry.
The layout and structure of a vertex buffer are defined by the vertex buffer layout in the GPU pipeline, specifying how vertices and their attributes are organized and how they should be interpreted by the vertex shader.
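For example, a hypothetical layout for vertices carrying a vec3 position and a vec2 texture coordinate: each float32 is 4 bytes, so one vertex spans (3 + 2) * 4 = 20 bytes. A sketch of the corresponding layout object (field values are this example's assumptions):

```javascript
// A GPUVertexBufferLayout-shaped object describing one interleaved buffer.
const vertexBufferLayout = {
  arrayStride: 20, // bytes between consecutive vertices: (3 + 2) floats * 4
  attributes: [
    { shaderLocation: 0, offset: 0,  format: "float32x3" }, // position
    { shaderLocation: 1, offset: 12, format: "float32x2" }, // texcoord
  ],
};
```

The `offset` of the second attribute is 12 because the position's three floats occupy bytes 0 through 11.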
g. Textures
Textures most often represent a 2d image. A 2d image is just a 2d array of color values. Textures can be accessed by special hardware called a sampler. A sampler can read up to 16 different values in a texture and blend them together in a way that is useful for many common use cases.
Texture coordinates for sampled textures go from 0.0 to 1.0 across and down a texture regardless of the actual size of the texture.
textureSample samples a texture. The first parameter is the texture to sample, the second is the sampler that specifies how to sample the texture, and the third is the texture coordinate for where to sample.
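Put together, a minimal sampling fragment shader might look like this in WGSL (the binding numbers and names are illustrative):

```wgsl
@group(0) @binding(0) var ourSampler : sampler;
@group(0) @binding(1) var ourTexture : texture_2d<f32>;

@fragment
fn fs(@location(0) texcoord : vec2f) -> @location(0) vec4f {
  // texture first, then the sampler, then the 0.0-1.0 texture coordinate
  return textureSample(ourTexture, ourSampler, texcoord);
}
```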
h. Limitations
- A shader function can only reference its inputs (attributes, buffers, textures, uniforms, inter-stage variables).
- A shader can not allocate memory.
- A shader has to be careful when it reads from a resource it also writes to, i.e. the thing it's generating values for.
3. Data Memory Layout
A byte is 8 bits, so a 32 bit value takes 4 bytes and a 16 bit value takes 2 bytes.
- Example:
struct OurStruct {
  velocity: f32,     // offset 0, 4 bytes
  acceleration: f32, // offset 4, 4 bytes
  frameCount: u32,   // offset 8, 4 bytes
};                   // total size: 12 bytes
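This struct occupies 12 bytes: velocity at byte offset 0, acceleration at offset 4, and frameCount at offset 8. We can mirror that layout from JavaScript with typed-array views (a sketch; the variable names are made up):

```javascript
// One 12-byte backing store for the whole struct.
const structBuffer = new ArrayBuffer(12);

// Two f32 fields starting at byte offset 0, one u32 field at byte offset 8.
const f32View = new Float32Array(structBuffer, 0, 2); // velocity, acceleration
const u32View = new Uint32Array(structBuffer, 8, 1);  // frameCount

f32View[0] = 1.5;  // velocity
f32View[1] = -0.5; // acceleration
u32View[0] = 42;   // frameCount
```

Both views alias the same 12 bytes, which is exactly the shape WebGPU expects when this data is uploaded to a buffer.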

4. Typed Arrays
- new Float32Array(12)
This makes a new ArrayBuffer, in this case of 12 * 4 bytes. It then creates the Float32Array to view it.
- new Float32Array([ 4, 5, 6 ])
This makes a new ArrayBuffer, in this case of 3 * 4 bytes. It then creates the Float32Array to view it, and sets the initial values to 4, 5, 6.
- new Float32Array(someArrayBuffer)
A new Float32Array view is made on an existing buffer.
- new Float32Array(someArrayBuffer, byteOffset)
This makes a new Float32Array on an existing buffer but starts the view at byteOffset.
- new Float32Array(someArrayBuffer, byteOffset, length)
This makes a new Float32Array on an existing buffer. The view starts at byteOffset and is length units long. So if we passed 3 for length the view would be 3 float32 values long (12 bytes) of someArrayBuffer.
- Typed Arrays Properties
- length: number of units
- byteLength: size in bytes
- byteOffset: offset in the typedArray's ArrayBuffer
- buffer: the ArrayBuffer this TypedArray is viewing
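The constructors and properties above can be combined; for example (a sketch):

```javascript
const backing = new ArrayBuffer(24);           // room for 6 float32 values

const all  = new Float32Array(backing);        // views all 6 values
const tail = new Float32Array(backing, 12, 3); // starts at byte 12, 3 values long

tail[0] = 7; // both views share the same bytes, so all[3] is now 7 too
```

Because typed arrays are views, writing through one view is visible through every other view of the same ArrayBuffer.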
- Computing Offset and Sizes Library
WebGPU Utils: https://github.com/greggman/webgpu-utils
- Discard
discard is a WGSL statement that can be used in a fragment shader to discard the current fragment or in other words, to not draw a pixel.
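A small WGSL sketch of discard in use (the cutoff here is an arbitrary example):

```wgsl
@fragment
fn fs(@location(0) texcoord : vec2f) -> @location(0) vec4f {
  if (texcoord.x < 0.5) {
    discard; // no pixel is written for the left half
  }
  return vec4f(0.0, 1.0, 0.0, 1.0);
}
```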
5. WebGPU Copying Data
- writeBuffer: copies data from a TypedArray or ArrayBuffer in JavaScript to a buffer. This is arguably the most straightforward way to get data into a buffer.
- writeTexture: copies data from a TypedArray or ArrayBuffer in JavaScript to a texture.
- copyBufferToBuffer: copyBufferToBuffer copies data from one buffer to another.
- copyBufferToTexture: copyBufferToTexture copies data from a buffer to a texture.
- copyTextureToBuffer: copyTextureToBuffer copies data from a texture to a buffer.
- copyTextureToTexture: copyTextureToTexture copies a portion of one texture to another.
6. Mapping Buffers
Mapping a buffer means making it available to read or write from JavaScript. A mappable buffer can not be used as any other type of buffer (like a uniform buffer, vertex buffer, index buffer, storage buffer, etc...).
- GPUBufferUsage.MAP_READ | GPUBufferUsage.COPY_DST
This is a buffer you can copy data into, using the copy commands above, from another buffer or a texture, then map to read the values in JavaScript.
- GPUBufferUsage.MAP_WRITE | GPUBufferUsage.COPY_SRC
This is a buffer you can map in JavaScript, put data into from JavaScript, and finally unmap so you can use the copy commands above to copy its contents to another buffer or texture.
- mapAsync returns a Promise. When the promise resolves, the buffer is mapped.
- Once mapped, the buffer is not usable by WebGPU until you call unmap.
7. Buffer Usages
- MAP_READ
The buffer can be mapped for reading. It may only be combined with COPY_DST.
- MAP_WRITE
The buffer can be mapped for writing. It may only be combined with COPY_SRC.
- COPY_SRC
The buffer can be used as the source of a copy operation.
- COPY_DST
The buffer can be used as the destination of a copy or write operation.
- INDEX
The buffer can be used as an index buffer.
- VERTEX
The buffer can be used as a vertex buffer.
- UNIFORM
The buffer can be used as a uniform buffer.
- STORAGE
The buffer can be used as a storage buffer.
- INDIRECT
The buffer can be used to store indirect command arguments.
- QUERY_RESOLVE
The buffer can be used to capture query results.
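Usage flags are bit flags combined with bitwise OR. The sketch below uses a local mirror of the numeric values the WebGPU spec defines for these flags; in a browser you would use the global GPUBufferUsage object instead:

```javascript
// Local mirror of the GPUBufferUsage flag values from the WebGPU spec.
const BufferUsage = {
  MAP_READ:  0x0001, MAP_WRITE: 0x0002,
  COPY_SRC:  0x0004, COPY_DST:  0x0008,
  INDEX:     0x0010, VERTEX:    0x0020,
  UNIFORM:   0x0040, STORAGE:   0x0080,
};

// A readback buffer: copy into it on the GPU, then map it for reading in JS.
const readbackUsage = BufferUsage.MAP_READ | BufferUsage.COPY_DST;

// Check whether a flag is set with bitwise AND.
const canMapRead = (readbackUsage & BufferUsage.MAP_READ) !== 0;
```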
8. Resource Interface
A resource is an object which provides access to data external to a shader stage, and which is not an override-declaration and not a shader stage input or output. Resources are shared by all invocations of the shader. There are four kinds of resources:
- Uniform buffers
- Storage buffers
- Texture resources
- Sampler resources
The resource interface of a shader is the set of module-scope resource variables statically accessed by functions in the shader stage. Each resource variable must be declared with both group and binding attributes. Together with the shader's stage, these identify the binding address of the resource on the shader's pipeline.
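In WGSL, the four kinds of resources are declared at module scope with group and binding attributes, for example (binding numbers and names are illustrative):

```wgsl
@group(0) @binding(0) var<uniform> params : vec4f;                // uniform buffer
@group(0) @binding(1) var<storage, read_write> data : array<f32>; // storage buffer
@group(0) @binding(2) var tex : texture_2d<f32>;                  // texture resource
@group(0) @binding(3) var samp : sampler;                         // sampler resource
```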

4. WebGPU Shading Language (WGSL)
a. Plain types:
- i32: a 32 bit signed integer
- u32: a 32 bit unsigned integer
- f32: a 32 bit floating point number
- bool: a boolean value
- f16: a 16 bit floating point number (this is an optional feature you need to check for and request)
b. Entry points:
- @vertex
- @fragment
- @compute
c. Attributes:
- @location(number) is used to define inputs and outputs of shaders.
- vertex shader inputs: For a vertex shader, inputs are defined by the @location attributes of the entry point function of the vertex shader.
- fragment shader output: For fragment shaders, @location specifies which GPURenderPassDescriptor.colorAttachment to store the result in.
d. Work Group:
A workgroup is a small collection of threads that run in parallel. Workgroup sizes are defined in 3 dimensions but default to 1, so @workgroup_size(1) is equivalent to @workgroup_size(1, 1, 1). With @workgroup_size(3, 4, 2), each workgroup has 3 * 4 * 2 = 24 threads. If we then call pass.dispatchWorkgroups(4, 3, 2) we're saying: execute that workgroup 4 * 3 * 2 = 24 times, for a total of 576 threads.
Built-in workgroup variables:
- local_invocation_id: The id of this thread within a workgroup
- workgroup_id: The id of the workgroup.
- global_invocation_id: A unique id for each thread.
- num_workgroups: What's being passed to pass.dispatchWorkgroups
- local_invocation_index: The id of this thread linearized
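The thread math can be checked with plain arithmetic (a sketch, assuming a shader declared with @workgroup_size(3, 4, 2) and a dispatchWorkgroups(4, 3, 2) call):

```javascript
// Threads per workgroup, as declared by @workgroup_size(3, 4, 2) in WGSL.
const workgroupSize = [3, 4, 2];
const threadsPerWorkgroup = workgroupSize[0] * workgroupSize[1] * workgroupSize[2]; // 24

// Number of workgroups launched by pass.dispatchWorkgroups(4, 3, 2).
const dispatch = [4, 3, 2];
const workgroupCount = dispatch[0] * dispatch[1] * dispatch[2]; // 24

const totalThreads = threadsPerWorkgroup * workgroupCount; // 24 * 24 = 576
```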
e. Uniforms Buffer:
Global variables for shaders.
f. Storage Buffer:
- Slower than Uniform Buffers
- Much larger than Uniform Buffers
- Can be read/write, Uniform buffers are read-only.


