TGDF 2024 Unreal Lumen with Arm Immortalis: The Best Practices of Ray Tracing Content on Mobile

OwenWu, 48 slides, Jul 12, 2024

About This Presentation

How to enable and optimize Unreal Lumen on Immortalis-based mobile devices.


Slide Content

© 2024 Arm
Owen Wu
Principal Developer Relation Engineer
2024.07.11
Unreal Lumen with Arm Immortalis
The Best Practices of Ray Tracing Content on Mobile
TGDF 2024

Agenda
●Steel Arms on Immortalis
●Ray tracing basics
●Vulkan ray query
●Global Illumination
●Lumen lighting pipeline
●Lumen best practices

Steel Arms on Immortalis

Arm’s Most Efficient GPUs Ever
All improvements are compared to the same configuration of Immortalis-G715, implemented on the same silicon process
●System-level Efficiency: up to 40% less memory bandwidth usage
●GPU Efficiency: average 15% more performance per Watt
●Highest Performance: average 15% more peak performance
●HDR Rendering: 2x architectural throughput for 64bpp texturing
●Hardware Ray Tracing

Steel Arms on Immortalis
Steel Arms is the latest Immortalis-based demo from Arm to pioneer the new frontier of next-gen graphics technology.
Created with Unreal Engine 5.3, the demo brings desktop-level Bloom, Motion Blur and DOF effects, alongside PBR, to smartphones. With the power of Immortalis, Steel Arms unleashes the full potential of Ray Tracing for shadows and Lumen, opening a new era of mobile graphics beyond rasterization.

Ray Tracing Basics

Rasterization
●Object by object
●Triangles projected onto screen
●Check pixel coverage
●Use Z-Buffer for visibility

Ray Tracing
●Pixel by pixel
●Cast a ray from the camera through each pixel
●Check triangle intersection
●Use closest-hit for visibility
●More rays for more complex rendering
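The "check triangle intersection, keep the closest hit" steps can be sketched in a few lines. This is only an illustration using the well-known Möller-Trumbore test with a brute-force loop over triangles; the function names are made up for this note, and real hardware traverses an acceleration structure instead of testing every triangle.

```python
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Triangle = Tuple[Vec3, Vec3, Vec3]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_triangle(origin: Vec3, direction: Vec3, tri: Triangle,
                 eps: float = 1e-8) -> Optional[float]:
    """Moller-Trumbore test: hit distance t along the ray, or None on a miss."""
    v0, v1, v2 = tri
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:
        return None  # ray is parallel to the triangle plane
    inv = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv
    if u < 0.0 or u > 1.0:
        return None  # outside the triangle (first barycentric coordinate)
    q = cross(s, e1)
    v = dot(direction, q) * inv
    if v < 0.0 or u + v > 1.0:
        return None  # outside the triangle (second barycentric coordinate)
    t = dot(e2, q) * inv
    return t if t > eps else None  # ignore hits behind the ray origin


def closest_hit(origin: Vec3, direction: Vec3,
                triangles: Sequence[Triangle]) -> Optional[Tuple[int, float]]:
    """Brute force: test every triangle and keep the nearest (index, t)."""
    best = None
    for i, tri in enumerate(triangles):
        t = ray_triangle(origin, direction, tri)
        if t is not None and (best is None or t < best[1]):
            best = (i, t)
    return best
```

Calling `closest_hit` once per pixel with a camera ray gives visibility; extra rays from the hit point (shadows, reflections, GI) are what "more rays for more complex rendering" refers to.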

Vulkan Ray Query

Vulkan Ray Query
●Ray queries can be used to perform ray traversal and get a result back in any shader stage
●Other than requiring acceleration structures, ray queries are performed using only a set of new shader instructions

Acceleration Structure
●Optimised data structure
  ●Minimises intersection tests
  ●Quickly find what a ray has hit
  ●User can control the topology
●Bottom Level (BLAS)
  ●Contain index and vertex data
  ●Hierarchical bounding volumes
●Top Level (TLAS)
  ●BLAS grouped in instances with
    ●Transform data (animations)
    ●Custom ID (materials)
[Diagram: one TLAS containing four instances that reference three BLASes]
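The two-level layout can be illustrated with a toy model. The sketch below is not the Vulkan API: `Blas`, `Instance`, and `trace_tlas` are hypothetical names, each BLAS is reduced to a single object-space bounding box, and the instance transform is reduced to a translation. It only shows why instances can share one BLAS and how box tests prune work before any triangle is touched.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Blas:
    """Bottom level: geometry in object space, summarised here by its bounding box."""
    box_min: Vec3
    box_max: Vec3


@dataclass
class Instance:
    """TLAS entry: a (possibly shared) BLAS plus per-instance data."""
    blas: Blas
    translation: Vec3  # stand-in for a full transform matrix
    custom_id: int     # e.g. a material index


def ray_box(origin: Vec3, direction: Vec3, lo: Vec3, hi: Vec3) -> Optional[float]:
    """Slab test: entry distance into the box, or None if the ray misses it."""
    t_near, t_far = 0.0, float("inf")
    for o, d, a, b in zip(origin, direction, lo, hi):
        if d == 0.0:
            if o < a or o > b:
                return None  # parallel to this slab and outside it
            continue
        t0, t1 = (a - o) / d, (b - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return None
    return t_near


def trace_tlas(origin: Vec3, direction: Vec3,
               tlas: List[Instance]) -> List[Tuple[float, int]]:
    """Return (entry distance, custom_id) for each instance the ray enters, nearest first."""
    hits = []
    for inst in tlas:
        # Move the ray into object space instead of moving the geometry.
        local_origin = tuple(o - t for o, t in zip(origin, inst.translation))
        t = ray_box(local_origin, direction, inst.blas.box_min, inst.blas.box_max)
        if t is not None:
            hits.append((t, inst.custom_id))
    return sorted(hits)
```

Because the ray is transformed rather than the mesh, moving an instance only changes its TLAS entry; the shared BLAS stays untouched.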

GLSL Sample
Ray queries are initialized with an acceleration structure to query against, ray flags determining properties of the traversal, a cull mask, and a geometric description of the ray being traced.
    rayQueryEXT rq;
    rayQueryInitializeEXT(rq, accStruct,
                          gl_RayFlagsTerminateOnFirstHitEXT | gl_RayFlagsOpaqueEXT,
                          cullMask,
                          origin, tMin, direction, tMax);
    // Traverse the acceleration structure
    rayQueryProceedEXT(rq);
    // Check intersections (if any)
    if (rayQueryGetIntersectionTypeEXT(rq, true) !=
        gl_RayQueryCommittedIntersectionNoneEXT)
    {
        // In shadow
    }

Global Illumination

Global Illumination
●Take whole scene into consideration
●Light can bounce between surfaces
●Direct lighting
●Indirect lighting
●Indirect shadow
●Color bleeding
[Figure: Unreal Lumen generated Cornell Box, annotated with direct lighting, indirect lighting, indirect shadow, and color bleeding]

Lumen Lighting Pipeline

Lumen Overview
●Lumen is a dynamic global illumination system
●Emissive objects can act as light sources
●Direct lighting + Indirect lighting + Reflection
●Pipeline overview
  ●Generate the Lumen scene to represent the scene coarsely
  ●Light the Lumen scene
  ●Use a feedback buffer to simulate multi-bounce
  ●Generate probes to dynamically gather indirect lighting of the scene
  ●Compute indirect lighting of every pixel from nearby probes
  ●Compute reflections of every pixel from the Lumen scene
  ●Compute direct lighting of every pixel using regular shading
  ●Combine all together

Lumen Lighting Pipeline
●Update surface cache
●Lumen scene lighting
  ●Direct lighting
  ●Indirect lighting trace
  ●Generate final lighting
●Lumen screen probe gather
  ●Place screen space probes
  ●Probe trace
    ●Screen trace
    ●Near lighting trace
    ●Distant lighting trace
●Lumen reflection trace
●Direct lighting
[Diagram stages: Update Surface Cache, Lumen Scene Lighting, Last Frame Radiance Cache, Screen Probe Gather, Reflection Trace]

Lumen Scene
●Lumen scene is a simplified scene description
●Two kinds of data describe the scene
  ●Signed Distance Field
    ●Only used for software ray tracing
    ●Hardware ray tracing uses the acceleration structure instead
  ●Surface Cache
    ●Used to cache material data
    ●Quickly sample the material data when a ray hits
    ●Used by both software and hardware ray tracing
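To show why a signed distance field suits software ray tracing, here is a minimal sphere-tracing sketch. It assumes an analytic SDF for a single sphere and a normalised ray direction; Lumen's own distance-field tracing is considerably more involved, and all names here are illustrative.

```python
import math
from typing import Callable, Optional, Tuple

Vec3 = Tuple[float, float, float]


def sphere_sdf(center: Vec3, radius: float) -> Callable[[Vec3], float]:
    """Signed distance to a sphere: negative inside, positive outside."""
    def sdf(p: Vec3) -> float:
        return math.dist(p, center) - radius
    return sdf


def sphere_trace(sdf: Callable[[Vec3], float], origin: Vec3, direction: Vec3,
                 max_distance: float, hit_epsilon: float = 1e-3,
                 max_steps: int = 128) -> Optional[float]:
    """March along a normalised ray, stepping by the distance the SDF says is empty."""
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = sdf(p)
        if dist < hit_epsilon:
            return t      # close enough to the surface: report a hit
        t += dist         # safe step: no surface can be nearer than `dist`
        if t > max_distance:
            break         # gave up: beyond the allowed trace distance
    return None
```

No triangles are visited during the march, which is why the hit only yields a position; material data at that position then has to come from somewhere else, the role the surface cache plays on the slide.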

[Figure: Normal Scene vs. Lumen Scene]

Ray tracing in the pipeline
●Lumen scene indirect lighting (No screen trace)
●Screen probe gather
●Lumen reflection
[Diagram labels: Screen Tracing, Software Ray Tracing, Hardware Ray Tracing, Skylight]

[Figure: Screen Trace OFF vs. Screen Trace ON]

Lumen Scene Lighting
●Direct Lighting
  ●Tiled deferred shading in surface space
  ●The maximum number of light sources per tile can be limited
  ●The lighting update rate can be controlled
●Indirect Lighting from radiosity
  ●Use last frame's cache data as the radiosity source
  ●Place hemispherical probes on top of the surface to gather radiosity
  ●The number of probes and gather rays can be controlled
●Finally, store direct and indirect lighting in the final lighting atlas

Lumen Screen Probe Gather
●Place probes on pixels using the GBuffer
●Adaptive downsampling
●Trace from probes and sample the radiance cache atlas to generate the screen space radiance cache
●Screen probes only trace up to 2.0 meters
●Place world probes around screen probes to gather distant lighting
Source: SIGGRAPH 2022 - Lumen: Real-time Global Illumination in Unreal Engine 5

Lumen Screen Probe Gather
Screen Trace + Near Lighting Trace + Distant Lighting Trace = Screen Space Radiance

Lumen Reflection Trace
When roughness is higher than MaxRoughnessToTrace, reuse the screen space radiance cache
When roughness is lower than MaxRoughnessToTrace, do the extra ray tracing
Same tracing pipeline as screen probe tracing
MaxRoughnessToTrace can be customized
Decrease MaxRoughnessToTrace to reduce reflection ray tracing cost
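The roughness rule can be written as a small decision function. This is a sketch of the rule as stated on the slide, not engine code; the function names and return strings are made up, and treating roughness exactly equal to the threshold as "trace" is an assumption, since the slide does not cover that case.

```python
from typing import Sequence


def reflection_source(roughness: float, max_roughness_to_trace: float) -> str:
    """Pick where a pixel's reflection comes from, following the rule on the slide."""
    if roughness > max_roughness_to_trace:
        return "reuse_screen_space_radiance_cache"
    # Assumption: the boundary value itself is traced.
    return "trace_dedicated_reflection_rays"


def traced_fraction(roughness_values: Sequence[float],
                    max_roughness_to_trace: float) -> float:
    """Fraction of pixels that would need dedicated reflection rays."""
    traced = sum(
        1 for r in roughness_values
        if reflection_source(r, max_roughness_to_trace) == "trace_dedicated_reflection_rays"
    )
    return traced / len(roughness_values)
```

Lowering the threshold shrinks the traced fraction, which is the cost reduction the last line of the slide describes.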

Lumen Reflection Trace
Screen Trace + Near Lighting Trace + Distant Lighting Trace = Reflection Radiance

Lumen Final Lighting
Indirect Diffuse + Indirect Specular + Reflection = Indirect Lighting, which is then combined with Direct Lighting

Lumen Best Practices

How To Enable Hardware Ray Tracing on Mobile
●Enable SM5 shader format
●r.Android.DisableVulkanSM5Support=0
●Enable deferred shading mode
●Enable Support Hardware Ray Tracing
●Enable Use Hardware Ray Tracing when available
●r.RayTracing.AllowInline=1

BVH optimization
●Exclude objects that do not contribute to lighting from ray tracing
●Reduce the overlap of meshes
●Use instanced static meshes to reduce the memory usage of BLAS
●Skinned meshes need their BLAS updated at run time
  ●Use a higher LOD level of the skinned mesh for ray tracing
  ●May cause artifacts when using hardware ray tracing shadows

[Figure: BVH Overlapping vs. Optimized BVH]

[Figure: Ray Tracing Min LOD and Ray Tracing Shadow Artifact]

Ray Query Shader Optimization
    FLumenMinimalRayResult TraceLumenMinimalRay(
        in RaytracingAccelerationStructure TLAS,
        FRayDesc Ray,
        inout FRayTracedLightingContext Context)
    {
        FLumenMinimalPayload Payload = (FLumenMinimalPayload)0;
        FLumenMinimalRayResult MinimalRayResult = InitLumenMinimalRayResult();
        //uint RayFlags = RAY_FLAG_FORCE_NON_OPAQUE; // Run any-hit shader
        uint RayFlags = 0;
        // (the slide shows only the start of the function)
In shader LumenHardwareRayTracingCommon.ush, the ray query flag is set to RAY_FLAG_FORCE_NON_OPAQUE, which uses the slow path of ray traversal on mobile. Changing it to 0 can speed up ray traversal performance by up to 32% on Immortalis-G720.
From 30 fps to 40 fps in the Steel Arms case.
*This is for Unreal Engine 5.3
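For readers who think in frame budgets rather than frame rates, the Steel Arms figures quoted above convert as follows. This is plain arithmetic on the 30 fps and 40 fps numbers from the slide, nothing measured.

```python
def frame_time_ms(fps: float) -> float:
    """Average time budget of one frame in milliseconds."""
    return 1000.0 / fps


before_ms = frame_time_ms(30.0)   # about 33.3 ms per frame
after_ms = frame_time_ms(40.0)    # 25.0 ms per frame
saved_ms = before_ms - after_ms   # about 8.3 ms saved per frame
fps_gain = 40.0 / 30.0 - 1.0      # about 0.33, i.e. roughly a third more frames
```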

Lumen General Setting Optimization
●Lumen Scene Detail
  ●A higher value lets smaller objects contribute to Lumen lighting, but also increases GPU cost
●Final Gather Quality
  ●Controls the density of the screen probes; a higher value increases GPU cost
  ●1.0 should reach a good balance between performance and quality for mobile games
●Max Trace Distance
  ●Controls how far rays are traced; keeping it small decreases GPU cost
  ●Don’t set it bigger than the size of the scene
38© 2024 Arm
Lumen General Setting Optimization
●Scene Capture Cache Resolution Scale
●Control the surface cache resolution, smaller value can save memory
●Lumen Scene Lighting Update Speed
●Can keep it low if the lighting changes are slow to save GPU cost
●0.5 ~ 1.0should reach a good balance between performance and quality for mobile
game
●Final Gather Lighting Update Speed
●Can keep it low if slow lighting propagation is acceptable
●0.5 ~ 1.0should reach a good balance between performance and quality for mobile
game

Lumen General Setting Optimization
●Reflection Quality
  ●Controls the reflection tracing quality
●Ray Lighting Mode
  ●Hit Lighting is available when using hardware ray tracing; it evaluates direct lighting instead of using the surface cache
  ●Hit Lighting mode has higher quality at a higher GPU cost
  ●Hit Lighting mode can reflect direct lighting of skinned meshes
  ●Unfortunately, Hit Lighting mode is not supported on mobile yet
●Max Reflection Bounces
  ●Controls the number of reflection bounces; a higher value has a higher GPU cost

Lumen Scene Lighting Optimization
●r.LumenScene.DirectLighting.MaxLightsPerTile
  ●Controls the maximum number of lights per tile for direct lighting evaluation
●r.LumenScene.DirectLighting.UpdateFactor
  ●Controls the area of direct lighting updated per frame; a higher value improves performance
●r.LumenScene.Radiosity.UpdateFactor
  ●Controls the area of indirect lighting updated per frame; a higher value improves performance
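A rough way to reason about the two update factors: assuming the per-frame budget is the total number of surface cache texels divided by the factor (an assumption about the engine's behaviour, not something stated on the slide), a higher factor means fewer texels relit each frame. The atlas size below is a hypothetical example value.

```python
def texels_updated_per_frame(surface_cache_texels: int, update_factor: float) -> int:
    """Per-frame texel budget, assuming budget = total texels / update factor."""
    return int(surface_cache_texels / update_factor)


# Hypothetical 4096 x 4096 surface cache atlas.
atlas_texels = 4096 * 4096
budget_at_32 = texels_updated_per_frame(atlas_texels, 32)   # 524288 texels per frame
budget_at_64 = texels_updated_per_frame(atlas_texels, 64)   # 262144 texels per frame
```

Under the same assumption, doubling the factor halves the work per frame but also doubles the number of frames a lighting change needs to reach the whole cache.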

Lumen Scene Lighting Optimization
●r.LumenScene.Radiosity.ProbeSpacing
  ●Controls the density of probes; a higher value improves performance by placing fewer probes
●r.LumenScene.Radiosity.HemisphereProbeResolution
  ●The resolution of each probe; a lower value saves memory
●r.LumenScene.FarField
  ●Set it to 0 if you don’t need far-field hardware ray tracing
●r.DistanceFields.SupportEvenIfHardwareRayTracingSupported
  ●Set it to 0 if you don’t need software Lumen support; this saves memory and scene update cost
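The effect of the two radiosity probe settings can be estimated with a back-of-the-envelope model. It assumes probe spacing is measured in surface cache texels and that the hemisphere probe resolution is the number of traces along one side of the probe, so rays per probe is the resolution squared; both are assumptions for illustration, and the 128 x 128 region is a hypothetical example.

```python
import math


def radiosity_probe_rays(region_width: int, region_height: int,
                         probe_spacing: int,
                         hemisphere_probe_resolution: int) -> int:
    """Estimated gather rays for a surface cache region under the stated assumptions."""
    probes_x = math.ceil(region_width / probe_spacing)
    probes_y = math.ceil(region_height / probe_spacing)
    rays_per_probe = hemisphere_probe_resolution ** 2
    return probes_x * probes_y * rays_per_probe


rays_spacing_4 = radiosity_probe_rays(128, 128, probe_spacing=4, hemisphere_probe_resolution=4)
rays_spacing_8 = radiosity_probe_rays(128, 128, probe_spacing=8, hemisphere_probe_resolution=4)
```

In this model, doubling the spacing cuts the ray count to a quarter, while halving the probe resolution does the same for rays per probe.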

Lumen Screen Probe Gather Optimization
●r.Lumen.ScreenProbeGather.RadianceCache.ProbeResolution
  ●Controls the probe atlas texture size; a lower value saves memory
●r.Lumen.ScreenProbeGather.RadianceCache.NumProbesToTraceBudget
  ●Controls the number of probes updated per frame; a lower value improves performance
●r.Lumen.ScreenProbeGather.DownsampleFactor
  ●Factor to downsample the GI resolution; a higher value improves performance

Lumen Screen Probe Gather Optimization
●r.Lumen.ScreenProbeGather.TracingOctahedronResolution
  ●Controls the number of rays per screen probe; a lower value improves performance
●r.Lumen.ScreenProbeGather.ScreenTraces
  ●Whether to use screen traces
  ●Setting it to True is recommended
●r.Lumen.ScreenProbeGather.ScreenTraces.HZBTraversal.FullResDepth
  ●Whether to use full-resolution depth for screen traces; set it to 0 to improve performance
●r.Lumen.ScreenProbeGather.ShortRangeAO
  ●Whether to enable short range ambient occlusion
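To get a feel for how the screen probe downsample factor and the tracing octahedron resolution interact, the sketch below estimates the ray count for uniformly placed probes. It assumes one probe per DownsampleFactor x DownsampleFactor pixel tile and resolution-squared rays per probe, and it ignores the extra probes that adaptive placement adds; the resolution and values are example inputs, not recommendations from the slides.

```python
import math
from typing import Tuple


def screen_probe_rays(width: int, height: int, downsample_factor: int,
                      tracing_octahedron_resolution: int) -> Tuple[int, int]:
    """Return (uniform probe count, rays per frame) under the stated assumptions."""
    probes_x = math.ceil(width / downsample_factor)
    probes_y = math.ceil(height / downsample_factor)
    probes = probes_x * probes_y
    rays = probes * tracing_octahedron_resolution ** 2
    return probes, rays


at_16 = screen_probe_rays(1920, 1080, downsample_factor=16, tracing_octahedron_resolution=8)
at_32 = screen_probe_rays(1920, 1080, downsample_factor=32, tracing_octahedron_resolution=8)
```

With these inputs, moving the factor from 16 to 32 drops the uniform probe count from 8160 to 2040, and the ray count falls by the same ratio.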

[Figure: Short Range AO OFF vs. Short Range AO ON]

Lumen Reflection Optimization
●r.Lumen.Reflections.RadianceCache
  ●Whether to reuse the radiance cache for reflections
  ●Set it to 1 to speed up ray tracing
●r.Lumen.Reflections.DownsampleFactor
  ●Downsample factor for reflections; a higher value improves performance
●r.Lumen.Reflections.MaxRoughnessToTrace
  ●Sets the max roughness value for which dedicated reflection rays should be traced
  ●Otherwise the reflection will reuse the screen space radiance cache
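The console variables for which the slides give an explicit value can be gathered in one place. The helper below renders them as Unreal device-profile style `+CVars=` lines; the profile name is hypothetical, and whether each variable takes effect from a device profile or has to be set elsewhere in the project configuration is not covered by the slides, so treat the output as a checklist rather than a drop-in config.

```python
from typing import Dict

# Values as given in the slides; the two "if you don't need it" entries are conditional.
MOBILE_LUMEN_CVARS: Dict[str, int] = {
    "r.Android.DisableVulkanSM5Support": 0,
    "r.RayTracing.AllowInline": 1,
    "r.LumenScene.FarField": 0,  # only if far-field tracing is not needed
    "r.DistanceFields.SupportEvenIfHardwareRayTracingSupported": 0,  # only if software Lumen is not needed
    "r.Lumen.ScreenProbeGather.ScreenTraces": 1,
    "r.Lumen.ScreenProbeGather.ScreenTraces.HZBTraversal.FullResDepth": 0,
    "r.Lumen.Reflections.RadianceCache": 1,
}


def device_profile_section(profile_name: str, cvars: Dict[str, int]) -> str:
    """Render cvars as a device-profile style section (profile name is hypothetical)."""
    lines = [f"[{profile_name} DeviceProfile]"]
    lines += [f"+CVars={name}={value}" for name, value in cvars.items()]
    return "\n".join(lines)


section_text = device_profile_section("MyImmortalisProfile", MOBILE_LUMEN_CVARS)
```

Tunable settings without a fixed recommendation in the slides (update factors, probe spacing, downsample factors) are left out on purpose; they need per-project profiling.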

A Community to Build the Future on Arm
Join the Arm Developer Program
arm.com/developerprogram

Arm Developer Hub
developer.arm.com

Thank You
Danke
Gracias
Grazie
谢谢
ありがとう
Asante
Merci
감사합니다
धन्यवाद
Kiitos
شكراً
ধন্যবাদ
תודה
ధన్య వాదములు

The Arm trademarks featured in this presentation are registered
trademarks or trademarks of Arm Limited (or its subsidiaries) in
the US and/or elsewhere. All rights reserved. All other marks
featured may be trademarks of their respective owners.
www.arm.com/company/policies/trademarks