
2.3 · Renderer

A deferred Vulkan renderer built on NVRHI. Three cooperating layers — a static Renderer facade, the SceneRenderer deferred pipeline, and the Renderer2D batcher — talk to the GPU through a single global render command queue that is drained either by a real worker thread (runtime) or by the main thread at end of frame (editor).

Module: Hazel/src/Hazel/Renderer/ · Renderer.h · SceneRenderer.h · Renderer2D.h · NVRHI · Vulkan · HLSL only · no GLSL / SPIR-V / MSL

Three layers, one queue

The renderer is split into three deliberately small layers. Each one has a clear job and the layer above never reaches past it.

Renderer (facade)

Static front door. Owns the global RenderCommandQueue, the default textures, the ShaderLibrary, and the frame index. This is what gameplay / editor / tools call.

Renderer.h · Submit(lambda)

SceneRenderer (3D pipeline)

Owns the deferred pipeline: pre-depth, four cascaded shadows + spot atlas, geometry, GTAO, SSR, composite, bloom chain, DOF, SMAA, jump flood, outline, grid. Materials and pipeline state are cached at Init().

SceneRenderer.h · deferred

Renderer2D (2D batcher)

Batched 2D: quads, lines, circles, MSDF text. Used for UI, debug draws, and 2D gameplay layers. Plays cleanly on top of the SceneRenderer's final image.

Renderer2D.h · batched
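
The batching idea behind Renderer2D can be sketched as follows. This is a minimal illustration, not the engine's code: QuadBatcherSketch, QuadVertexSketch, and DrawCallsFor are hypothetical names, and the "draw" is reduced to a counter so the flush behaviour is easy to see. Quads accumulate in one vertex array and are flushed as a single draw when the batch is full or the scene ends.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical vertex layout; the real Renderer2D vertex format is not shown in this document.
struct QuadVertexSketch { float x, y; std::uint32_t color; };

class QuadBatcherSketch
{
public:
    explicit QuadBatcherSketch(std::size_t maxQuads) : m_MaxQuads(maxQuads)
    {
        assert(maxQuads > 0);
        m_Vertices.reserve(maxQuads * 4);
    }

    // Append one quad (four corners). Flushes first if the batch is already full.
    void DrawQuad(float x, float y, float w, float h, std::uint32_t color)
    {
        if (m_Vertices.size() == m_MaxQuads * 4)
            Flush();
        m_Vertices.push_back({x,     y,     color});
        m_Vertices.push_back({x + w, y,     color});
        m_Vertices.push_back({x + w, y + h, color});
        m_Vertices.push_back({x,     y + h, color});
    }

    // One flush stands in for one GPU draw call; an empty batch issues nothing.
    void Flush()
    {
        if (m_Vertices.empty())
            return;
        ++m_DrawCalls; // real code would upload the vertices and submit a draw here
        m_Vertices.clear();
    }

    std::size_t DrawCalls() const { return m_DrawCalls; }

private:
    std::size_t m_MaxQuads;
    std::size_t m_DrawCalls = 0;
    std::vector<QuadVertexSketch> m_Vertices;
};

// Usage: draw quadCount quads, end the scene, and report how many draws were issued.
std::size_t DrawCallsFor(std::size_t quadCount, std::size_t maxQuadsPerBatch)
{
    QuadBatcherSketch batcher(maxQuadsPerBatch);
    for (std::size_t i = 0; i < quadCount; ++i)
        batcher.DrawQuad(static_cast<float>(i), 0.0f, 1.0f, 1.0f, 0xffffffffu);
    batcher.Flush(); // end of scene
    return batcher.DrawCalls();
}
```

With a batch capacity of two quads, five quads cost three draws instead of five; a realistic capacity would be far larger.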

The deferred pipeline

SceneRenderer drives a long but linear pass chain. Each pass reads from the previous one's outputs and writes into images that SceneRenderer::Init() allocated up front. Pipeline state objects (PSOs) are cached — they are never recreated per frame.

[Figure: the deferred pipeline as swimlanes: depth/shadow setup → geometry + lighting → bloom/DOF → AA + selection → present. Read order is top-to-bottom; arrows show in-lane data flow. Dashed arrows cross lanes (shadows → composite, composite → bloom).]

Lanes in the figure:
  DEPTH / SHADOWS: Pre-Depth, Shadow Cascades × 4, Spot Shadow Atlas
  GEOMETRY: Geometry (G-buffer), GTAO, GTAO Denoise, SSR, Composite (lighting)
  POST · BLOOM / DOF: Bloom Downsample chain, Bloom Upsample chain, DOF
  AA / EDGE: SMAA · Edge, SMAA · Blend, SMAA · Resolve, Jump Flood, Edge Outline
  PRESENT: Grid Overlay (editor only), Final Pass Image (GetFinalPassImage()), Renderer2D overlay (UI / debug / 2D layers), ImGui (editor UI pass), Swapchain Present (vsync / flip)

Pass-by-pass

| Pass | Reads | Writes | Notes |
|---|---|---|---|
| Pre-Depth | opaque meshes | depth buffer | Z prepass: lets later passes early-out on overdraw. |
| Shadow Cascades (4) | opaque meshes per cascade | 4 shadow maps | Cascaded shadow maps for the directional light. |
| Spot Shadow Atlas | opaque meshes per spot | tiled atlas | One atlas tile per shadow-casting spot light. |
| Geometry | scene meshes + materials | G-buffer (albedo, normal, MR, emissive) + depth | The deferred fill pass: one draw per material batch. |
| GTAO | depth + normal | AO buffer | Ground-truth ambient occlusion. |
| GTAO Denoise | AO buffer | denoised AO | Spatial filter to clean the AO output. |
| SSR | G-buffer + depth + previous frame | reflection buffer | Screen-space reflections for glossy materials. |
| Composite | G-buffer + AO + SSR + shadow maps + IBL | HDR scene colour | The deferred resolve: lighting + IBL + reflections combined into one HDR target. |
| Bloom Downsample | HDR scene colour | mip chain | Successive half-res passes building a thumbnail. |
| Bloom Upsample | mip chain | bloom buffer | Progressive upsample with blur for soft bloom. |
| DOF | HDR colour + depth | blurred colour | Circle-of-confusion based depth of field. |
| SMAA · Edge | colour / luma | edge texture | Subpixel morphological AA: edge detection. |
| SMAA · Blend | edge texture | blend weights | Computes per-pixel blend weights. |
| SMAA · Resolve | colour + blend weights | anti-aliased colour | Final SMAA combine. |
| Jump Flood | selection mask | distance field | Jump-flooding algorithm used by selection outlines. |
| Edge Outline | distance field | outline overlay | Selected-entity halo, editor mostly. |
| Grid Overlay | depth | final image | Editor grid: world-space lines on the ground plane. |
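
The table implies an ordering rule: in a linear chain, every resource a pass reads must have been written by an earlier pass (or come from outside the chain, like scene meshes). The sketch below checks that rule for a shortened chain. It is illustrative only: PassDescSketch, IsLinearChainValid, and the simplified resource names ("meshes", "g-buffer", "ao", ...) are assumptions, not engine types.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of one pass: a name plus the resources it reads and writes.
struct PassDescSketch
{
    std::string name;
    std::vector<std::string> reads;
    std::vector<std::string> writes;
};

// Returns true when every read is satisfied by an external input or an earlier pass's write.
bool IsLinearChainValid(const std::vector<PassDescSketch>& passes,
                        const std::set<std::string>& externalInputs)
{
    std::set<std::string> available = externalInputs;
    for (const PassDescSketch& pass : passes)
    {
        for (const std::string& resource : pass.reads)
            if (available.count(resource) == 0)
                return false; // read before anything produced it
        available.insert(pass.writes.begin(), pass.writes.end());
    }
    return true;
}

// A shortened version of the chain above, with simplified resource names.
std::vector<PassDescSketch> ExampleChain()
{
    return {
        {"Pre-Depth",    {"meshes"},                  {"depth"}},
        {"Geometry",     {"meshes"},                  {"g-buffer", "depth"}},
        {"GTAO",         {"depth", "g-buffer"},       {"ao"}},
        {"GTAO Denoise", {"ao"},                      {"denoised ao"}},
        {"Composite",    {"g-buffer", "denoised ao"}, {"hdr colour"}},
    };
}

// Usage: the example chain validates as written.
inline void CheckExampleChain()
{
    assert(IsLinearChainValid(ExampleChain(), {"meshes"}));
}
```

Moving Composite ahead of Geometry makes the check fail, which is the kind of mistake to watch for when inserting a new pass between two existing ones.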

Submit / queue mechanism

The whole renderer is built around Renderer::Submit. You call it with a lambda; the lambda is enqueued on the global RenderCommandQueue and executed later in a context that holds the NVRHI command list. This is how every pass in the pipeline above issues its draws.

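Renderer::Submit([=]()
{
    // Runs later, when the queue is drained; `state` and `args` were copied at submit time.
    nvrhi::CommandListHandle cmd = Renderer::GetCommandList();
    cmd->setGraphicsState(state); // bind pipeline, framebuffer, and bindings
    cmd->draw(args);              // record the draw
});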

Who drains the queue depends on the threading policy:

| Policy | Used by | Submit boundary | Drained by | When |
|---|---|---|---|---|
| MultiThreaded | Runtime, headless | Crosses thread: main → render | Dedicated render thread | While main thread builds frame N+1 |
| SingleThreaded | Editor | Same thread, no boundary | Main thread | End of frame |

The number-one AI pitfall

It is tempting to assume Renderer::Submit(lambda) always crosses a thread boundary: that the lambda runs "later, on the render thread", and that captured state must still be valid then. In the editor it does not. Main and render are the same thread, and the lambda runs at end of frame on the thread you submitted from. Both wrong assumptions fail quietly. Code that "fixes" a non-existent race by deep-copying state into the lambda works in MT but wastes copies in ST. Worse, code that captures by reference appears to work in ST and tears in MT.

Treat the submit as "deferred but the deferral may be zero-latency on the same thread." See Threading § "Editor vs. runtime".
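
To make the two policies concrete, here is a minimal sketch of a deferred lambda queue drained either on the submitting thread or on a separate one. CommandQueueSketch, ThreadingPolicySketch, and RunFrame are hypothetical names for illustration; the real RenderCommandQueue is not shown in this document and likely differs. The point is that a command capturing its input by value sees the same data under both policies.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for the engine's global queue.
class CommandQueueSketch
{
public:
    // Enqueue a command; nothing runs until Drain() is called.
    void Submit(std::function<void()> command)
    {
        assert(command); // an empty std::function would throw when invoked
        std::lock_guard<std::mutex> lock(m_Mutex);
        m_Commands.push_back(std::move(command));
    }

    // Execute everything queued so far, in order, on the calling thread.
    void Drain()
    {
        std::vector<std::function<void()>> pending;
        {
            std::lock_guard<std::mutex> lock(m_Mutex);
            pending.swap(m_Commands);
        }
        for (auto& command : pending)
            command();
    }

private:
    std::mutex m_Mutex;
    std::vector<std::function<void()>> m_Commands;
};

enum class ThreadingPolicySketch { SingleThreaded, MultiThreaded };

// Submits one command, mutates the source data, then drains under the given policy.
// Returns the value the command observed.
std::string RunFrame(ThreadingPolicySketch policy)
{
    CommandQueueSketch queue;
    std::string seen; // output only; it outlives the drain, so a reference capture is safe here

    std::string label = "frame-0";
    queue.Submit([label, &seen]() { seen = label; }); // label is copied at submit time
    label = "frame-1"; // the main thread moves on; the queued copy is unaffected

    if (policy == ThreadingPolicySketch::MultiThreaded)
    {
        std::thread renderThread([&queue]() { queue.Drain(); });
        renderThread.join(); // join orders the write to 'seen' before the read below
    }
    else
    {
        queue.Drain(); // same thread, end of frame
    }
    return seen;
}
```

Under both policies the command observes "frame-0". Had it captured label by reference, the single-threaded result would be "frame-1" and the multi-threaded version would additionally race with the main thread.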

Triple-buffered frames

The renderer triple-buffers per-frame resources (uniform buffers, transient descriptors, etc.). With three slots in rotation, frame N can be built on the main thread while frame N−1 is being recorded by the render thread and frame N−2 is in flight on the GPU.

This is only meaningful when CoreThreadingPolicy == MultiThreaded. In single-threaded mode the queue is drained at end-of-frame, so frames overlap only on the GPU side — CPU-side, frames are strictly sequential. Code that depends on "the render thread is one frame behind" works in runtime; in the editor that gap is zero.
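
The slot rotation can be sketched with a frame counter taken modulo three. The names below (kFramesInFlightSketch, FrameRingSketch, FrameSlotSketch) are hypothetical, and the engine's actual frame-index API is not shown in this document; the sketch only illustrates why frames N, N−1, and N−2 never share a slot.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kFramesInFlightSketch = 3; // assumed slot count for triple buffering

// Hypothetical per-frame resource set (uniform buffers, transient descriptors, ...).
struct FrameSlotSketch
{
    std::uint64_t ownerFrame = 0;
    bool written = false;
};

class FrameRingSketch
{
public:
    // Slot index used by an absolute frame number.
    static std::size_t SlotFor(std::uint64_t frame)
    {
        return static_cast<std::size_t>(frame % kFramesInFlightSketch);
    }

    // Claim the slot for 'frame'. Its previous owner must be at least three frames
    // old, i.e. no longer being recorded or in flight on the GPU.
    FrameSlotSketch& Begin(std::uint64_t frame)
    {
        FrameSlotSketch& slot = m_Slots[SlotFor(frame)];
        assert(!slot.written || frame - slot.ownerFrame >= kFramesInFlightSketch);
        slot.ownerFrame = frame;
        slot.written = true;
        return slot;
    }

private:
    std::array<FrameSlotSketch, kFramesInFlightSketch> m_Slots{};
};

// True when frames n, n-1 and n-2 land in three different slots.
bool OverlappingFramesUseDistinctSlots(std::uint64_t n)
{
    assert(n >= 2);
    const std::size_t a = FrameRingSketch::SlotFor(n);
    const std::size_t b = FrameRingSketch::SlotFor(n - 1);
    const std::size_t c = FrameRingSketch::SlotFor(n - 2);
    return a != b && b != c && a != c;
}
```

Frame N+1 reuses the slot of frame N−2, which is why that frame must have finished on the GPU before the main thread starts writing again.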

Frame flow at a glance

[Figure: a frame, top-to-bottom: main thread submits, render context drains and executes.]

Main thread: BeginScene (SceneRenderer) → Submit draws (SubmitMesh / Submit2D) → EndScene (enqueue passes) → Submit(lambda) → queue (global RenderCommandQueue).
Render thread (MT) / same thread, end of frame (ST): Drain queue (execute lambdas) → Build NVRHI cmd list (PSO from cache) → Execute on GPU (Vulkan queue submit) → Present (swapchain or capture).

Shaders & materials

Shaders live in Resources/Shaders/. They are HLSL — full stop. The renderer compiles them to SPIR-V through DXC and feeds the result to NVRHI. There is exactly one shading language allowed in the source tree.

HLSL only

Do not commit raw GLSL, raw SPIR-V, MSL, or any other shading language. New shaders go under Resources/Shaders/ in HLSL and are registered with the ShaderLibrary.

Pipeline state objects (PSOs) are cached, not built per frame. SceneRenderer::Init() is the right place to materialise every pipeline you'll need; the per-frame path looks them up by handle.

Materials hang off the asset system — see 2.9 Asset System. Texture dependencies are tracked: when a texture reloads, dependent materials are notified via OnDependencyUpdated and pick up the new image without a manual reload.
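
The dependency notification can be pictured as a map from a texture handle to the materials that sample it. The sketch below is an assumption-heavy illustration: only the name OnDependencyUpdated comes from this document, while its signature, AssetHandleSketch, MaterialSketch, and DependencyTrackerSketch are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

using AssetHandleSketch = std::uint64_t; // hypothetical handle type

struct MaterialSketch
{
    int updates = 0;
    AssetHandleSketch lastUpdated = 0;

    // Signature is an assumption; the document only names the callback.
    void OnDependencyUpdated(AssetHandleSketch dependency)
    {
        ++updates; // real code would rebind the reloaded image here
        lastUpdated = dependency;
    }
};

class DependencyTrackerSketch
{
public:
    // Record that 'material' depends on 'texture'. The tracker does not own the material.
    void Register(AssetHandleSketch texture, MaterialSketch* material)
    {
        assert(material != nullptr);
        m_Dependents[texture].push_back(material);
    }

    // Called after a texture reload; notifies only the materials registered against it.
    void NotifyReloaded(AssetHandleSketch texture)
    {
        auto it = m_Dependents.find(texture);
        if (it == m_Dependents.end())
            return;
        for (MaterialSketch* material : it->second)
            material->OnDependencyUpdated(texture);
    }

private:
    std::unordered_map<AssetHandleSketch, std::vector<MaterialSketch*>> m_Dependents;
};

// Usage: one material depends on texture 42; report how often it was notified.
int UpdatesAfterReload(AssetHandleSketch reloaded)
{
    MaterialSketch brick;
    DependencyTrackerSketch tracker;
    tracker.Register(42, &brick);
    tracker.NotifyReloaded(reloaded);
    return brick.updates;
}
```

Reloading texture 42 notifies the material once; reloading an unrelated handle leaves it untouched.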

Adding a new render pass

  1. Write the shader in HLSL under Resources/Shaders/.
  2. Register it with ShaderLibrary.
  3. In SceneRenderer::Init, create the cached nvrhi::GraphicsPipeline / ComputePipeline and the material that binds inputs.
  4. Allocate any intermediate images / framebuffers there too — never per-frame.
  5. Insert the pass execution in SceneRenderer::PreRender or between two existing passes in EndScene, depending on where it fits in the swimlane diagram above.
  6. Register shader dependencies via Renderer::RegisterShaderDependency so live-reload picks the new pass up.

PSOs are not free

If you find yourself constructing a graphics pipeline inside EndScene or any per-frame path, stop. Lift it into Init and reference the cached handle.
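
The "create at Init, look up per frame" rule from the steps above can be sketched without NVRHI. Everything here is hypothetical (PipelineCacheSketch, PipelineDescSketch, PassSketch, the shader name "MyPostProcess"); a real pass would hold an NVRHI pipeline handle rather than an index. The creation counter shows that running many frames builds no new pipelines.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical pipeline description and object; stand-ins for the NVRHI equivalents.
struct PipelineDescSketch { std::string shaderName; bool depthTest; };
struct PipelineSketch { PipelineDescSketch desc; };

using PipelineHandleSketch = std::size_t;

class PipelineCacheSketch
{
public:
    // Expensive step: build a pipeline once and hand back a cheap handle.
    PipelineHandleSketch Create(PipelineDescSketch desc)
    {
        ++m_CreateCount;
        m_Pipelines.push_back(PipelineSketch{std::move(desc)});
        return m_Pipelines.size() - 1;
    }

    // Cheap step: per-frame lookup by handle (throws std::out_of_range on a bad handle).
    const PipelineSketch& Get(PipelineHandleSketch handle) const
    {
        return m_Pipelines.at(handle);
    }

    std::size_t CreateCount() const { return m_CreateCount; }

private:
    std::vector<PipelineSketch> m_Pipelines;
    std::size_t m_CreateCount = 0;
};

// A pass that creates its pipeline in Init and only looks it up in Execute.
struct PassSketch
{
    PipelineHandleSketch pipeline = 0;

    void Init(PipelineCacheSketch& cache)
    {
        pipeline = cache.Create(PipelineDescSketch{"MyPostProcess", false});
    }

    const std::string& Execute(const PipelineCacheSketch& cache) const
    {
        return cache.Get(pipeline).desc.shaderName; // real code would bind and draw here
    }
};

// Usage: run frameCount frames and report how many pipelines were ever created.
std::size_t PipelinesCreatedAfterFrames(int frameCount)
{
    PipelineCacheSketch cache;
    PassSketch pass;
    pass.Init(cache);
    for (int i = 0; i < frameCount; ++i)
        pass.Execute(cache);
    return cache.CreateCount();
}
```

However many frames run, the count stays at one; a count that grows with the frame number is the symptom of a pipeline being built in a per-frame path.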

Extending

| You want… | Do this |
|---|---|
| A new full-screen post-process pass | HLSL shader + cached PSO in SceneRenderer::Init + insert in EndScene (see Adding a pass). |
| A new material type | Add to the material system + ensure the geometry pass binds the new inputs. Hook into asset dependency tracking. |
| New 2D primitive (e.g. capsule, arc) | Extend Renderer2D: new batch type + shader + flush path. Keep the batched-quad invariant. |
| Debug overlay drawn over the scene | DebugRenderer or Renderer2D on top of the final pass image, not as a new SceneRenderer pass unless it really needs the G-buffer. |
| Off-screen capture (telemetry, gRPC streaming) | Tap SceneRenderer::GetFinalPassImage(); see gRPC / Cross-System for ViewportService and CameraService. |

Pitfalls

Renderer::Submit is not a thread-boundary guarantee

See Submit / queue mechanism. In the editor, lambdas run on the main thread; in the runtime they cross to the render thread. Write code that is correct under both — capture by value, don't rely on "I'm definitely on a different thread now".

No raw GLSL / SPIR-V / MSL

HLSL only. New shaders that show up in any other language will be rejected at review.

Cache pipelines, allocate images at Init

NVRHI PSOs and intermediate images are constructed in SceneRenderer::Init. Per-frame allocation is a bug.

Don't bypass the queue

Calling NVRHI directly from gameplay or editor code skips the queue and the frame-index machinery, and breaks in MT immediately. Always go through Renderer::Submit.

Key types

| Type | Header | Role |
|---|---|---|
| Renderer | Hazel/src/Hazel/Renderer/Renderer.h | Static facade. Owns global queue, default textures, ShaderLibrary, frame index. |
| SceneRenderer | Renderer/SceneRenderer.h | Deferred 3D pipeline. Init builds cached PSOs / images; EndScene issues the pass chain. |
| Renderer2D | Renderer/Renderer2D.h | Batched 2D: quads, lines, circles, MSDF text. |
| RenderCommandQueue | Renderer/ | Global lambda queue drained by render thread (MT) or main thread end-of-frame (ST). |
| ShaderLibrary | Renderer/ | Compiled-shader cache. Live-reload aware via dependency tracking. |
| DebugRenderer | Renderer/ | Immediate-mode debug draws (lines, gizmos). Backed by the gRPC DebugService.Draw RPC for external tooling. |
| Material / Pipeline | Renderer/ | Material parameter sets and cached NVRHI pipeline state. |