Ray tracing and RT cores
When the GeForce 20 series was announced, almost the entire presentation was dedicated to ray tracing. How do the new GPUs do what even the GTX 1080 Ti cannot?
The secret lies with the new RT (ray tracing) cores. As mentioned, each SM features one RT core – RTX 2080 Ti has 68, and RTX 2080 has 46. These cores work together with denoising techniques, Bounding Volume Hierarchy (BVH) and compatible APIs (DXR and even Vulkan) to achieve real-time ray tracing on each Turing GPU.
As Nvidia puts it, ‘RT Cores traverse the BVH autonomously, and by accelerating traversal and ray/triangle intersection tests, they offload the SM, allowing it to handle other vertex, pixel, and compute shading work. Functions such as BVH building and refitting are handled by the driver, and ray generation and shading is managed by the application through new types of shaders.’
Without dedicated ray tracing hardware (i.e. on pre-Turing GPUs), each ray has to be traced in software, spending thousands of instruction slots testing bounding boxes within the BVH structure until a triangle is hit, and even then a hit is not guaranteed. That effort is so demanding that it couldn’t be done in real time. On Turing, the RT cores take over this work, sparing the SM those thousands of instruction slots per ray and significantly lessening its workload.
Each RT core is also made up of two units. One performs the bounding box tests, while the other handles ray-triangle intersection tests. All the SM has to do is launch the ray probe, and the RT core handles the rest of the work of testing whether or not there has been a hit – data which is then returned to the SM.
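To make the two kinds of test more concrete, here is a minimal software sketch of what a single ray has to go through: a bounding box test at each BVH node, then a ray-triangle test at the leaves. It assumes a hand-built BVH and uses the textbook slab test and Möller–Trumbore intersection. The `Node`, `hits_box`, `hit_triangle` and `trace` names are our own illustrative choices and say nothing about how the RT cores are actually implemented in hardware.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]
Triangle = Tuple[Vec3, Vec3, Vec3]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


@dataclass
class Node:
    """Hypothetical BVH node: inner nodes have children, leaves have triangles."""
    lo: Vec3
    hi: Vec3
    children: Optional[List["Node"]] = None
    triangles: Optional[List[Triangle]] = None


def hits_box(origin: Vec3, direction: Vec3, lo: Vec3, hi: Vec3) -> bool:
    """Bounding box test (slab method): does the ray pass through the box?"""
    t_near, t_far = 0.0, float("inf")
    for axis in range(3):
        if abs(direction[axis]) < 1e-12:
            # Ray is parallel to this pair of faces: it must start between them.
            if origin[axis] < lo[axis] or origin[axis] > hi[axis]:
                return False
            continue
        t0 = (lo[axis] - origin[axis]) / direction[axis]
        t1 = (hi[axis] - origin[axis]) / direction[axis]
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True


def hit_triangle(origin: Vec3, direction: Vec3, tri: Triangle) -> Optional[float]:
    """Ray-triangle test (Moller-Trumbore): distance along the ray, or None."""
    v0, v1, v2 = tri
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < 1e-12:  # ray is parallel to the triangle's plane
        return None
    inv = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    return t if t > 1e-9 else None


def trace(origin: Vec3, direction: Vec3, root: Node) -> Optional[float]:
    """Walk the BVH, skipping every subtree whose box the ray misses."""
    closest = None
    stack = [root]
    while stack:
        node = stack.pop()
        if not hits_box(origin, direction, node.lo, node.hi):
            continue
        if node.triangles is not None:
            for tri in node.triangles:
                t = hit_triangle(origin, direction, tri)
                if t is not None and (closest is None or t < closest):
                    closest = t
        else:
            stack.extend(node.children or [])
    return closest


# Two triangles five units away, each in its own leaf under one root box.
near = Node((-1, -1, 5), (1, 1, 5), triangles=[((-1, -1, 5), (1, -1, 5), (0, 1, 5))])
far = Node((9, -1, 5), (11, 1, 5), triangles=[((9, -1, 5), (11, -1, 5), (10, 1, 5))])
scene = Node((-1, -1, 5), (11, 1, 5), children=[near, far])

print(trace((0, 0, 0), (0, 0, 1), scene))  # hit distance for a ray aimed at the first triangle
print(trace((0, 0, 0), (0, 1, 0), scene))  # None: this ray misses the root box entirely
```

Even in this tiny scene, every ray pays for a loop of box tests before it reaches a single triangle. Multiply that by millions of rays and a BVH many levels deep, and it becomes clear why moving the loop into fixed-function hardware matters.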
The end result is a GPU that can run real-time ray tracing roughly 10 times faster than Pascal, going by the number of Giga Rays per second each card can calculate: 1.1 Giga Rays per second for the GTX 1080 Ti, but 10 Giga Rays per second for the RTX 2080 Ti.
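As a rough sanity check on those figures, the short sketch below works out the ratio between the two cards and what the quoted rates would mean per pixel. The resolutions and the 60fps target are our own assumptions, and the Giga Rays numbers are Nvidia's quoted figures rather than anything we measured, so treat the output as a back-of-the-envelope estimate only.

```python
GIGA = 1_000_000_000

# Nvidia's quoted figures, taken from the paragraph above.
GTX_1080_TI_RAYS = 1.1 * GIGA
RTX_2080_TI_RAYS = 10 * GIGA


def speedup(new_rate: float, old_rate: float) -> float:
    """How many times more rays per second the newer card can cast."""
    return new_rate / old_rate


def rays_per_pixel(rays_per_second: float, width: int, height: int, fps: int) -> float:
    """Ray budget for each pixel of each frame at a given resolution and frame rate."""
    return rays_per_second / (width * height * fps)


# Assumed targets: 1080p and 4K, both at 60 frames per second.
print(f"Speedup: {speedup(RTX_2080_TI_RAYS, GTX_1080_TI_RAYS):.1f}x")
print(f"RTX 2080 Ti, 1080p60: {rays_per_pixel(RTX_2080_TI_RAYS, 1920, 1080, 60):.0f} rays per pixel")
print(f"RTX 2080 Ti, 4K60:    {rays_per_pixel(RTX_2080_TI_RAYS, 3840, 2160, 60):.0f} rays per pixel")
print(f"GTX 1080 Ti, 1080p60: {rays_per_pixel(GTX_1080_TI_RAYS, 1920, 1080, 60):.0f} rays per pixel")
```

On these assumptions the ratio is a little over 9x, and even the RTX 2080 Ti would have only around 20 rays per pixel to spend on each frame at 4K60.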
Ray tracing and gaming
Turning to gaming, the RT cores should allow real-time ray tracing to be built into games. Nvidia isn’t yet pushing fully ray traced games (where all lighting is the result of ray tracing). Instead, the company is pushing hybrid rendering – which combines ray tracing with rasterisation, the latter being the rendering technique games currently use.
This can be demonstrated with Shadow of the Tomb Raider, a game that was demoed at the RTX launch event in Germany. This game will use the hybrid method, utilising ray tracing for the portrayal of shadows in-game. While we can’t test any fully ray traced games at this point – because they don’t exist, and we still need DXR – using ray tracing exclusively is almost certainly going to be too demanding, even for Turing GPUs.
So we expect to see developers pick and choose how ray tracing is used within games. Nvidia says it should be used where there is ‘most visual benefit’, like when ‘rendering reflections, refractions, and shadows’.
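To illustrate the hybrid idea, here is a toy sketch in which the base colour and world position of each pixel are assumed to come from a conventional rasterisation pass, and ray tracing is used only to decide whether that pixel sits in shadow. The scene (one light, one sphere as the occluder), the `gbuffer` layout and the `shadow_strength` value are all invented for illustration and are not taken from any game or from Nvidia's material.

```python
import math


def shadow_ray_blocked(point, light, centre, radius):
    """Cast one ray from a surface point towards the light and report
    whether the sphere sits in the way (point assumed outside the sphere)."""
    to_light = tuple(l - p for l, p in zip(light, point))
    distance = math.sqrt(sum(c * c for c in to_light))
    direction = tuple(c / distance for c in to_light)

    # Solve |oc + t * direction|^2 = radius^2 for the nearest t.
    oc = tuple(p - c for p, c in zip(point, centre))
    b = sum(o * d for o, d in zip(oc, direction))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return False  # the ray misses the sphere altogether
    t = -b - math.sqrt(disc)
    return 1e-6 < t < distance  # the hit must lie between the point and the light


def shade(gbuffer, light, centre, radius, shadow_strength=0.3):
    """Hybrid step: keep the rasterised colour, darken it where the shadow ray is blocked."""
    image = []
    for base_colour, position in gbuffer:
        blocked = shadow_ray_blocked(position, light, centre, radius)
        factor = shadow_strength if blocked else 1.0
        image.append(tuple(channel * factor for channel in base_colour))
    return image


# Stand-in for rasteriser output: (base colour, world position) for two ground pixels.
gbuffer = [
    ((1.0, 1.0, 1.0), (0.0, 0.0, 0.0)),  # directly beneath the sphere
    ((1.0, 1.0, 1.0), (5.0, 0.0, 0.0)),  # well off to the side
]
image = shade(gbuffer, light=(0.0, 10.0, 0.0), centre=(0.0, 5.0, 0.0), radius=1.0)
print(image)
```

The pixel under the sphere comes back darkened while the one to the side keeps its rasterised colour. The appeal of the hybrid approach is that rays are only spent on the effect that needs them, and everything else stays on the cheap rasterised path.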
At the moment, a list of games that will support ray tracing includes:
- Assetto Corsa Competizione from Kunos Simulazioni / 505 Games
- Atomic Heart from Mundfish
- Battlefield V from EA / DICE
- Control from Remedy Entertainment / 505 Games
- Enlisted from Gaijin Entertainment / Darkflow Software
- MechWarrior 5: Mercenaries from Piranha Games
- Metro Exodus from 4A Games
- Shadow of the Tomb Raider from Square Enix / Eidos-Montréal / Crystal Dynamics / Nixxes
- Justice (Ni Shui Han) from NetEase
- JX3 from Kingsoft
Tensor Cores and DLSS
The Turing architecture also houses what are known as Tensor Cores – another feature first introduced with the Volta GV100. Turing’s Tensor Cores add ‘INT8 and INT4 precision modes for inferencing workloads that can tolerate quantization.’
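For anyone unfamiliar with quantization, the sketch below shows the general idea using a simple symmetric INT8 scheme: floating point values are mapped onto whole numbers between -127 and 127, and a little precision is lost in the process. This is a generic textbook scheme with made-up weights, not a description of how the Tensor Cores or Nvidia's software handle it.

```python
def quantize_int8(values):
    """Map floats onto integers in [-127, 127] using one shared scale factor."""
    scale = max(abs(v) for v in values) / 127.0
    if scale == 0.0:
        scale = 1.0  # all-zero input: any scale works, so avoid dividing by zero
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the integer codes."""
    return [q * scale for q in quantized]


# Made-up example weights, purely for illustration.
weights = [0.02, -1.27, 0.5, 0.731]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

print(q)
for original, approx in zip(weights, restored):
    print(f"{original:+.4f} -> {approx:+.4f} (error {abs(original - approx):.4f})")
```

The rounding error on each value is at most half of the scale factor. A workload that ‘can tolerate quantization’ is one where errors of that size don't noticeably change the result, and an INT4 mode would make the same trade with far fewer levels to choose from.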
What the Tensor Cores do, though, is harness the power of deep learning for the purposes of gaming. They do this by accelerating certain aspects of Nvidia’s NGX Neural Services in order to improve graphics, visual fidelity and rendering. For gaming this is primarily achieved via Deep Learning Super Sampling, or DLSS.
This is a new method of anti-aliasing that aims to provide similar visual fidelity to TAA (temporal anti-aliasing) but at a significantly lower performance cost. This is because, where TAA renders at your set resolution, DLSS can render faster with a lower input sample count, but then infers (upscales) the result at your set resolution – which Nvidia claims results in similar visual fidelity, but with half of the shading work.
Obviously that sounds great, but how is it achieved? That’s where we circle back to the deep learning element of the Tensor Cores. Nvidia says it is the ‘training’ element of the neural network which is key, where the DLSS network is asked to match thousands of high quality images (rendered with 64x supersampling). Through a back-and-forth process named ‘back propagation’ this network eventually learns to produce results which resemble the quality of the 64x supersampled images while getting rid of any blurring that may have been introduced from TAA.
In a nutshell – Turing GPUs only need half the number of samples (compared to TAA, for instance) for rendering and instead use AI and their Tensor Cores to provide the missing information and create the final image.
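As a very loose illustration of the training idea, the toy below shows a model that only ever sees half of the samples and learns, by repeatedly nudging its weights against the error, to fill in the missing one. It is a single linear layer with two weights, so ‘back propagation’ here collapses to one gradient calculation per step, and the training snippets are invented. It bears no relation to the size or structure of Nvidia's actual DLSS network.

```python
# Each snippet is (left, middle, right) from a fully sampled reference signal.
# The model is only shown left and right, and must learn to predict the middle.
REFERENCE_SNIPPETS = [
    (0.0, 0.5, 1.0),
    (1.0, 0.5, 0.0),
    (0.2, 0.5, 0.8),
    (0.5, 0.5, 0.5),
    (0.9, 0.6, 0.3),
    (0.1, 0.25, 0.4),
]


def train(snippets, steps=200, learning_rate=0.5):
    """Gradient descent on mean squared error for a two-weight linear model."""
    w_left, w_right = 0.0, 0.0
    n = len(snippets)
    for _ in range(steps):
        grad_left = grad_right = 0.0
        for left, middle, right in snippets:
            error = (w_left * left + w_right * right) - middle
            # Derivative of the squared error with respect to each weight.
            grad_left += 2.0 * error * left / n
            grad_right += 2.0 * error * right / n
        w_left -= learning_rate * grad_left
        w_right -= learning_rate * grad_right
    return w_left, w_right


def reconstruct(left, right, weights):
    """Infer the sample that was never rendered."""
    w_left, w_right = weights
    return w_left * left + w_right * right


weights = train(REFERENCE_SNIPPETS)
print(weights)
print(reconstruct(0.4, 0.6, weights))
```

Nobody tells the model to average its two inputs. It arrives at weights of roughly 0.5 each simply because that is what best matches the reference data, which is the same principle, at a vastly smaller scale, as a network learning to match 64x supersampled images.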
Now, because the network needs to be trained on different scenes, games do need to specifically support DLSS, although Nvidia claims it is ‘an easy integration’ for game devs. We currently have a list of 25 games that will support DLSS upon release:
- Ark: Survival Evolved from Studio Wildcard
- Atomic Heart from Mundfish
- Dauntless from Phoenix Labs
- Final Fantasy XV from Square Enix
- Fractured Lands from Unbroken Studios
- Hitman 2 from IO Interactive/Warner Bros.
- Islands of Nyne: Battle Royale from Define Human Studios
- Justice (Ni Shui Han) from NetEase
- JX3 from Kingsoft
- Mechwarrior 5: Mercenaries from Piranha Games
- PlayerUnknown’s Battlegrounds from PUBG Corp.
- Remnant: From the Ashes from Gunfire Games/Perfect World Entertainment
- Serious Sam 4: Planet Badass from Croteam/Devolver Digital
- Shadow of the Tomb Raider from Square Enix/Eidos-Montréal/Crystal Dynamics/Nixxes
- The Forge Arena from Freezing Raccoon Studios
- We Happy Few from Compulsion Games / Gearbox
- Darksiders 3 from Gunfire Games/THQ Nordic
- Deliver Us The Moon: Fortuna from KeokeN Interactive
- Fear the Wolves from Vostok Games / Focus Home Interactive
- Hellblade: Senua’s Sacrifice from Ninja Theory
- KINETIK from Hero Machine Studios
- Outpost Zero from Symmetric Games / tinyBuild Games
- Overkill’s The Walking Dead from Overkill Software / Starbreeze Studios
- SCUM from Gamepires / Devolver Digital
- Stormdivers from Housemarque
That does mean we can’t test DLSS today with any current games, but we have been able to test a demo of FFXV provided early to press. More on that later in the review.