Nvidia unveiled a significant evolution in its DLSS upscaling technology at CES 2025, introducing a new model as part of the DLSS 4 upgrade. This advancement, which also includes enhancements to DLSS Ray Reconstruction, Super Resolution, and DLAA, marks Nvidia's first implementation of a transformer-based AI model in a real-time application.
After almost six months in beta, the transformer-based DLSS model has now left that phase with version 310.3.0. This release lays the groundwork for Nvidia's future upscaling and image quality models. Given that CNN-based DLSS was in use for over six years, it is plausible that the transformer-based technique will similarly evolve and improve over an extended period.
Previously, DLSS relied on Convolutional Neural Networks (CNNs) to generate pixels by focusing on localised areas within current and preceding frames. However, Nvidia recognised that CNNs had reached their performance ceiling, and simply releasing new upscaling profiles was no longer sufficient to drive further image quality improvements.
The new transformer model represents a more sophisticated approach. It has twice as many parameters as prior DLSS upscaling models and no longer restricts its attention to localised content. Instead, this vision transformer analyses the entire frame and evaluates the importance of each pixel, even tracking that importance across multiple frames.
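To make the local-versus-global distinction concrete, here is a minimal toy sketch in Python. It is not Nvidia's model or any part of DLSS: the one-dimensional "frame", the function names and the similarity score are all assumptions chosen purely for illustration. It contrasts a CNN-style filter that only sees neighbouring pixels with a softmax attention step that weighs every pixel in the frame.

```python
import math


def local_filter(pixels, index, radius=1):
    """CNN-style toy: average only the pixels within `radius` of `index`."""
    lo = max(0, index - radius)
    hi = min(len(pixels), index + radius + 1)
    window = pixels[lo:hi]
    return sum(window) / len(window)


def attention_weights(pixels, index):
    """Softmax weights over every pixel, scored by similarity to the query pixel.

    The negative squared difference is a hypothetical stand-in for the learned
    query/key scores a real transformer would compute.
    """
    query = pixels[index]
    scores = [-(query - p) ** 2 for p in pixels]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def global_attention(pixels, index):
    """Transformer-style toy: weighted sum over the whole frame."""
    weights = attention_weights(pixels, index)
    return sum(w * p for w, p in zip(weights, pixels))


# A tiny made-up frame, and a copy where only the last (distant) pixel changes.
frame = [0.1, 0.2, 0.9, 0.2, 0.1, 0.9]
changed = frame[:-1] + [0.0]
```

In this sketch, altering the distant last pixel leaves `local_filter(frame, 1)` untouched, because that pixel lies outside the three-pixel window, whereas `global_attention(frame, 1)` shifts, because every pixel contributes a weight. That is the rough intuition behind a model that considers the whole frame rather than a local patch.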
Nvidia asserts that its transformer model allows a more comprehensive understanding of scenes, bringing numerous benefits. These include more stable pixels, with less shimmering and fewer visual artifacts, reduced ghosting, and better preservation of fine details, especially during motion. In addition, edges appear smoother, and the overall image quality of ray reconstruction is substantially improved, particularly in scenes with challenging lighting conditions.
KitGuru says: Have you already tried the DLSS transformer model? Did you notice any difference compared to the older CNN-based models?