NVIDIA yesterday unveiled its new family of graphics cards based on the Ada Lovelace architecture. The GeForce RTX 4090 (16384 CUDA cores, 24GB GDDR6X), the flagship of the RTX 40 Series, is priced at USD $1,599 and will be available on October 12, 2022. The GeForce RTX 4080 comes in two configurations (different memory sizes): the RTX 4080 with 16GB GDDR6X and 9728 CUDA cores will start at USD $1,199, while the RTX 4080 with 12GB GDDR6X and 7680 CUDA cores will cost USD $899. Both RTX 4080 models will be available in November 2022.
Some details about the specifications are available HERE.
The new GeForce RTX logo:
The complete press release:
NVIDIA today unveiled the GeForce RTX® 40 Series of GPUs, designed to deliver revolutionary performance for gamers and creators, led by its new flagship, the RTX 4090 GPU, with up to 4x the performance of its predecessor.
The world’s first GPUs based on the new NVIDIA® Ada Lovelace architecture, the RTX 40 Series delivers massive generational leaps in performance and efficiency, and represents a new era of real-time ray tracing and neural rendering, which uses AI to generate pixels.
“The age of RTX ray tracing and neural rendering is in full steam, and our new Ada Lovelace architecture takes it to the next level,” said Jensen Huang, NVIDIA’s founder and CEO, at the GeForce® Beyond: Special Broadcast at GTC.
“Ada provides a quantum leap for gamers and paves the way for creators of fully simulated worlds. With up to 4x the performance of the previous generation, Ada is setting a new standard for the industry,” he said.
DLSS 3 Generates Entire Frames for Faster Game Play
Huang also announced NVIDIA DLSS 3 — the next revolution in the company’s Deep Learning Super Sampling neural-graphics technology for games and creative apps. The AI-powered technology can generate entire frames for massively faster game play. It can overcome CPU performance limitations in games by allowing the GPU to generate entire frames independently.
The technology is coming to the world’s most popular game engines, such as Unity and Unreal Engine, and has received support from many of the world’s leading game developers, with more than 35 games and apps coming soon.
Additionally, the RTX 40 Series GPUs feature a range of new technological innovations, including:
– Streaming multiprocessors with up to 83 teraflops of shader power — 2x over the previous generation.
– Third-generation RT Cores with up to 191 effective ray-tracing teraflops — 2.8x over the previous generation.
– Fourth-generation Tensor Cores with up to 1.32 Tensor petaflops — 5x over the previous generation using FP8 acceleration.
– Shader Execution Reordering (SER) that improves execution efficiency by rescheduling shading workloads on the fly to better utilize the GPU’s resources. As significant an innovation as out-of-order execution was for CPUs, SER improves ray-tracing performance up to 3x and in-game frame rates by up to 25%.
– Ada Optical Flow Accelerator with 2x faster performance allows DLSS 3 to predict movement in a scene, enabling the neural network to boost frame rates while maintaining image quality.
– Architectural improvements tightly coupled with custom TSMC 4N process technology results in an up to 2x leap in power efficiency.
– Dual NVIDIA Encoders (NVENC) cut export times by up to half and feature AV1 support. The NVENC AV1 encode is being adopted by OBS, Blackmagic Design DaVinci Resolve, Discord and more.
New Ray-Tracing Tech for Even More Immersive Games
For decades, rendering ray-traced scenes with physically correct lighting in real time has been considered the holy grail of graphics. At the same time, geometric complexity of environments and objects has continued to increase as 3D games and graphics strive to provide the most accurate representations of the real world.
Achieving physically accurate graphics requires tremendous computational horsepower. Modern ray-traced games like Cyberpunk 2077 run over 600 ray-tracing calculations for each pixel just to determine lighting — a 16x increase from the first ray-traced games introduced four years ago.
The new third-generation RT Cores have been enhanced to deliver 2x faster ray-triangle intersection testing and include two important new hardware units. An Opacity Micromap Engine speeds up ray tracing of alpha-test geometry by a factor of 2x, and a Micro-Mesh Engine generates micro-meshes on the fly to generate additional geometry. The Micro-Mesh Engine provides the benefits of increased geometric complexity without the traditional performance and storage costs of complex geometries.
Creativity Redefined With RTX Remix, New AV1 Encoders
The RTX 40 Series GPUs and DLSS 3 deliver advancements for NVIDIA Studio creators. 3D artists can render fully ray-traced environments with accurate physics and realistic materials, and view the changes in real time, without proxies.
Video editing and live streaming also get a boost from improved GPU performance and the inclusion of new dual, eighth-generation AV1 encoders. The NVIDIA Broadcast software development kit has three updates, now available for partners, including Face Expression Estimation, Eye Contact and quality improvements to Virtual Background.
NVIDIA Omniverse™ — included in the NVIDIA Studio suite of software — will soon add NVIDIA RTX Remix, a modding platform to create stunning RTX remasters of classic games. RTX Remix allows modders to easily capture game assets, automatically enhance materials with powerful AI tools, and quickly enable RTX with ray tracing and DLSS.
Portal Is RTX ON!
RTX Remix has been used by NVIDIA Lightspeed Studios to reimagine Valve’s iconic video game Portal, regarded as one of the best video games of all time. Advanced graphics features such as full ray tracing and DLSS 3 give the game a striking new look and feel. Portal with RTX will be released as free, official downloadable content for the classic platformer with RTX graphics in November, just in time for Portal’s 15th anniversary.
The GeForce RTX 4090 and 4080: The New Ultimate GPUs
The RTX 4090 is the world’s fastest gaming GPU with astonishing power, acoustics and temperature characteristics. In full ray-traced games, the RTX 4090 with DLSS 3 is up to 4x faster compared to last generation’s RTX 3090 Ti with DLSS 2. It is also up to 2x faster in today’s games while maintaining the same 450W power consumption. It features 76 billion transistors, 16,384 CUDA® cores and 24GB of high-speed Micron GDDR6X memory, and consistently delivers over 100 frames per second at 4K-resolution gaming. The RTX 4090 will be available on Wednesday, Oct. 12, starting at $1,599.
The company also announced the RTX 4080, launching in two configurations. The RTX 4080 16GB has 9,728 CUDA cores and 16GB of high-speed Micron GDDR6X memory, and with DLSS 3 is 2x as fast in today’s games as the GeForce RTX 3080 Ti and more powerful than the GeForce RTX 3090 Ti at lower power. The RTX 4080 12GB has 7,680 CUDA cores and 12GB of Micron GDDR6X memory, and with DLSS 3 is faster than the RTX 3090 Ti, the previous-generation flagship GPU.
Both RTX 4080 configurations will be available in November, with prices starting at $1,199 and $899, respectively.
Where to Buy
The GeForce RTX 4090 and 4080 GPUs will be available as custom boards, including stock-clocked and factory-overclocked models, from top add-in card providers such as ASUS, Colorful, Gainward, Galaxy, GIGABYTE, Inno3D, MSI, Palit, PNY and Zotac.
The RTX 4090 and RTX 4080 (16GB) are also produced directly by NVIDIA in limited Founders Editions for fans wanting the NVIDIA in-house design.
Look for the GeForce RTX 40 Series GPUs in gaming systems built by Acer, Alienware, ASUS, Dell, HP, Lenovo and MSI, leading system builders worldwide, and many more.
More details are available on THIS PAGE:
Ada is incredibly efficient, with over twice the performance at the same power compared to Ampere, and excellent scalability and overclockability as power is increased.
Shader Execution Reordering
GPU architecture is highly parallelized and at its most efficient when executing similar workloads at the same time. However, advanced ray tracing requires computing the impact of millions of rays striking numerous different material types throughout a scene, creating a sequence of divergent, inefficient workloads for shaders (shaders calculate the appropriate levels of light, darkness, and color during the rendering of a 3D scene, and are used in every modern game).
Our new Shader Execution Reordering (SER) technology dynamically reorganizes these previously-inefficient workloads into considerably more efficient ones, improving shader performance by up to 2X, and in-game frame rates by up to 25%!
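The idea of regrouping divergent work can be illustrated with a toy model. This is only a sketch and not NVIDIA's implementation: the group size, the material names and the cost model (one shader pass per distinct material in a group) are assumptions made for illustration.

```python
GROUP_SIZE = 4  # toy value (assumption); real GPU thread groups are wider


def groups(rays, size=GROUP_SIZE):
    """Split a list of per-ray material IDs into fixed-size groups."""
    return [rays[i:i + size] for i in range(0, len(rays), size)]


def shader_passes(rays):
    """Toy cost model: a group needs one pass per distinct material it contains."""
    return sum(len(set(group)) for group in groups(rays))


# Hypothetical hit results: each entry is the material a ray struck.
hits = ["metal", "glass", "skin", "metal",
        "glass", "skin", "metal", "glass",
        "skin", "metal", "glass", "skin"]

# Reordering step: put rays that need the same shader next to each other.
reordered = sorted(hits)

print(shader_passes(hits))       # divergent order
print(shader_passes(reordered))  # coherent order
```

In this toy model the reordered list needs fewer passes because every group ends up with a single material, which is the kind of coherence the paragraph above describes.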
GeForce RTX 4090
JUSTICE – Fuyun Court – Path Tracing Showcase Premiere
Marvel at the beautiful world of Justice enhanced with a new path-traced upgrade with the launch of Fuyun Court, a new location that all players can access from October 12th when the next Justice game update is released.
NVIDIA Racer RTX – The future of graphics powered by GeForce RTX 40 Series
Enjoy this cinematic teaser for Racer RTX, built in NVIDIA Omniverse and powered by the #BeyondFast GeForce RTX 40 Series.
Racer RTX showcases the latest NVIDIA technologies including real time ray tracing, DLSS 3, and PhysX. Available as a playable tech demo for GeForce RTX 40 Series GPUs this November, Racer RTX is an interactive physics-accurate simulation featuring the most realistically rendered RC cars ever.
Portal with RTX
More information: Portal with RTX Reimagines Valve’s Classic with Full Ray Tracing, NVIDIA DLSS & NVIDIA Reflex
Gigabyte unveiled the physical specs of its AORUS RTX 4090 Master 24GB. The graphics card comes with a colossal VGA cooler. The dimensions of the card (length x width x height): 358.5mm x 162.8mm x 75.1mm.
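More detailed specifications are available. Note that the unit counts below describe the full GPU dies; the RTX 4090 and RTX 4080 16GB cards ship with fewer units enabled (16384 and 9728 CUDA cores respectively):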
GeForce RTX 4090
– GPU: AD102, TSMC 4N (5nm), 76.3B transistors, 608mm2
– CUDA cores: 18432
– SMs (Streaming Multiprocessors): 144
– Tensor cores: 576
– Ray Tracing cores: 144
– ROPs: 192
– L2 cache: 96MB
GeForce RTX 4080 16GB
– GPU: AD103, TSMC 4N (5nm), 45.9B transistors, 378.6mm2
– CUDA cores: 10240
– SMs (Streaming Multiprocessors): 80
– Tensor cores: 320
– Ray Tracing cores: 80
– ROPs: 112
– L2 cache: 64MB
GeForce RTX 4080 12GB
– GPU: AD104, TSMC 4N (5nm), 35.9B transistors, 294.5mm2
– CUDA cores: 7680
– SMs (Streaming Multiprocessors): 60
– Tensor cores: 240
– Ray Tracing cores: 60
– ROPs: 80
– L2 cache: 48MB
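A quick way to read the three lists above is to divide each unit count by the number of SMs. The short sketch below does that with the figures copied from the lists; the dictionary layout and the helper name per_sm are only illustrative.

```python
# Figures copied from the three specification lists above.
specs = {
    "GeForce RTX 4090":      {"cuda": 18432, "sms": 144, "tensor": 576, "rt": 144},
    "GeForce RTX 4080 16GB": {"cuda": 10240, "sms": 80,  "tensor": 320, "rt": 80},
    "GeForce RTX 4080 12GB": {"cuda": 7680,  "sms": 60,  "tensor": 240, "rt": 60},
}


def per_sm(entry):
    """Return (CUDA cores, Tensor cores, RT cores) per SM for one entry."""
    sms = entry["sms"]
    return (entry["cuda"] // sms, entry["tensor"] // sms, entry["rt"] // sms)


ratios = {name: per_sm(entry) for name, entry in specs.items()}
for name, ratio in ratios.items():
    print(name, ratio)
```

All three entries give the same per-SM layout, so the lists differ only in how many SMs each GPU carries.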
The difference in specifications between the RTX 4080 16GB and the RTX 4080 12GB is significant. The RTX 4080 16GB has more transistors, more cores, more ROPs and more memory. In the end it is a different graphics card, so why do both cards have the same name? The 16GB version should have been called RTX 4080 Ti or something like that. Or a correct name for the RTX 4080 12GB would have been RTX 4070. Incoherent!
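To put numbers on that gap, here is a small sketch that compares the two cards using the launch figures quoted at the top of this article (CUDA cores, memory size, starting price). The helper name percent_more is just for illustration.

```python
# Launch figures for the two RTX 4080 configurations, as quoted in this article.
rtx_4080_16gb = {"cuda_cores": 9728, "memory_gb": 16, "price_usd": 1199}
rtx_4080_12gb = {"cuda_cores": 7680, "memory_gb": 12, "price_usd": 899}


def percent_more(a, b):
    """How much larger a is than b, in percent, rounded to one decimal."""
    return round((a / b - 1) * 100, 1)


gaps = {key: percent_more(rtx_4080_16gb[key], rtx_4080_12gb[key])
        for key in rtx_4080_16gb}

for key, gap in gaps.items():
    print(f"{key}: 16GB model has {gap}% more")
```

The 16GB model comes out roughly a quarter ahead in CUDA cores and about a third ahead in both memory and price.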
In this article, NVIDIA explains the name of the RTX 4080:
The GeForce RTX 4080 16GB and 12GB naming is similar to the naming of two versions of RTX 3080 that we had last generation, and others before that. There is an RTX 4080 configuration with a 16GB frame buffer, and a different configuration with a 12GB frame buffer. One product name, two configurations.
The 4080 12GB is an incredible GPU, with performance exceeding our previous generation flagship, the RTX 3090 Ti and 3x the performance of RTX 3080 Ti with support for DLSS 3, so we believe it’s a great 80-class GPU. We know many gamers may want a premium option so the RTX 4080 16GB comes with more memory and even more performance. The two versions will be clearly identified on packaging, product details, and retail so gamers and creators can easily choose the best GPU for themselves.
The diagram of an SM:
NVIDIA has published a whitepaper that describes in depth the architecture of the AD102 GPU:
NVIDIA ADA GPU ARCHITECTURE (40-page PDF)
The GeForce RTX 4090 is about to be launched and some stress tests are popping up here and there. The latest stress test, done with our venerable MSI Kombustor, shows a power consumption that reaches 425W on the MSI-01 graphics test:
This test is less violent than FurMark and tries to reach the graphics workload of a real game.
In a second stress test, the RTX 4090 has been able to draw up to 616W:
This second stress test is a variant of FurMark that uses larger textures and a lot of VRAM.
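For context, both readings can be expressed relative to the 450W power figure NVIDIA quotes for the RTX 4090. The sketch below is plain arithmetic on the numbers in this article; the labels are only illustrative.

```python
TBP_WATTS = 450  # power figure quoted by NVIDIA for the RTX 4090

# Peak power readings reported in the two stress tests above.
readings = {"MSI-01 graphics test": 425, "FurMark variant": 616}

# Each reading as a percentage of the 450W figure.
relative = {name: round(watts / TBP_WATTS * 100, 1)
            for name, watts in readings.items()}

for name, pct in relative.items():
    print(f"{name}: {pct}% of {TBP_WATTS}W")
```

The first test stays just under the 450W figure, while the FurMark variant goes well beyond it.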
MSI Kombustor is powered by GeeXLab and it’s funny to see where the GeeXLab engine is used…
Reviews of the GeForce RTX 4090 Founders Edition are available.
GeForce RTX 4090 24GB Specs
- GPU: AD102, TSMC 4N (5nm), 76B transistors, 608mm2, base clock: 2230 MHz, boost clock: 2520 MHz
- CUDA cores: 16384
- SMs (Streaming Multiprocessors): 128
- TMUs: 512
- ROPs: 176
- Tensor cores: 512
- Ray tracing cores: 128
- Memory: 24GB GDDR6X, 21 Gbps, 384-bit
- TBP: 450 W
- FP32 compute: 83 TFLOPS
- Price: USD $1599
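The FP32 figure in this list can be cross-checked from the core count and the boost clock, and the memory bandwidth follows from the data rate and the bus width. The sketch below assumes the usual convention of two floating-point operations per CUDA core per clock (one fused multiply-add); the function names are only illustrative.

```python
def fp32_tflops(cuda_cores, boost_mhz, flops_per_clock=2):
    """Peak FP32 throughput, assuming one fused multiply-add (2 FLOPs) per core per clock."""
    return cuda_cores * flops_per_clock * boost_mhz * 1e6 / 1e12


def bandwidth_gb_s(data_rate_gbps, bus_width_bits):
    """Peak memory bandwidth: per-pin data rate times bus width, converted from bits to bytes."""
    return data_rate_gbps * bus_width_bits / 8


# Values taken from the RTX 4090 list above.
tflops = fp32_tflops(cuda_cores=16384, boost_mhz=2520)
bandwidth = bandwidth_gb_s(data_rate_gbps=21, bus_width_bits=384)

print(f"FP32: {tflops:.1f} TFLOPS")
print(f"Memory bandwidth: {bandwidth:.0f} GB/s")
```

The computed FP32 value rounds to the 83 TFLOPS listed above.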
Power consumption of the RTX 4090 FE:
MSI Kombustor used to torture the RTX 4090:
NVIDIA GeForce RTX 4090 Founders Edition Review & Benchmarks: Gaming, Power, & Thermals
Is the fastest GPU ALWAYS the best?
Is the $1599 RTX 4090 Worth it?? 4090 Benchmarked!
Reviews of the GeForce RTX 4080 are available.
GeForce RTX 4080 16GB Specs
- GPU: AD103, TSMC 4N (5nm), 45.9B transistors, 378.6mm2, base clock: 2210 MHz, boost clock: 2510 MHz
- CUDA cores: 9728
- SMs (Streaming Multiprocessors): 76
- TMUs: 304
- ROPs: 112
- Tensor cores: 304
- Ray tracing cores: 76
- Memory: 16GB GDDR6X, 22.4 Gbps, 256-bit
- TBP: 320 W
- FP32 compute: 49 TFLOPS
- Price: USD $1200
- ASUS RTX 4080 Strix – techpowerup.com
- RTX 4080 FE – techpowerup.com
- RTX 4080 FE – overclock3d.net
- RTX 4080 FE – igorslab.de
- RTX 4080 – comptoir-hardware.com
- ROG Strix RTX 4080 – coolaler.com
Power consumption of the RTX 4080 FE:
NVIDIA’s Lost It: RTX 4080 16GB GPU Review & Benchmarks
We Found One Good Thing About the GeForce RTX 4080…