AMD’s new High-End Desktop (HEDT) flagships have launched with an updated socket and a brand-new chipset. The new 32-core Ryzen Threadripper 3970X leads the stack and the 24-core 3960X backs it up. Based on the new Zen 2 architecture and TSMC’s 7nm process technology, can these $1399 and $1999 processors prove their worth in demanding workloads? And how will they perform against Intel’s price-reduced HEDT competition from Cascade Lake-X?
The $1999 3970X will ship in 32-core, 64-thread form via its 8 CCX and 4 CCD design. The lower cost 3960X is priced at $1399 and its 24 cores are built on the same four CCD design but with two cores per CCD disabled. The pricing for Threadripper clearly spells a hefty price premium for these parts versus AM4 Ryzen 3000. However, these are now the highest core count prosumer/non-server CPUs on the market (excluding AMD’s old Zen+ products) … until the 64-core Threadripper 3990X in 2020 – yes, it has been confirmed!
Base frequency for the 32-core part is 3.7GHz and the 3960X is 100MHz higher at 3.8GHz. Both chips feature a maximum Precision Boost 2 frequency of 4.5GHz. That is pretty solid for such high core count chips and clearly shows the value of TSMC’s efficient 7nm process, especially when the reasonable 280W TDP is factored in. The frequencies represent an increase versus the 12nm Global Foundries-based Threadripper 2000 WX processors that these chips now sit alongside, not replace, in AMD’s product stack.
You get 128MB of L3 cache on both CPUs by virtue of 8 CCXs being deployed. And the same 512KB of L2 per core as for other Zen 2 parts is used, taking the 3970X’s total L2+L3 cache to 144MB and the 3960X’s to 140MB. Each CCX is connected to the dedicated IO die via Infinity Fabric links which have their frequency tied to the DRAM clock from your memory modules.
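The cache totals above are simple arithmetic, and a quick sketch makes the working explicit. The constant and function names below are our own, chosen for illustration; the figures are the ones quoted in the paragraph above.

```python
# Cache totals from the figures quoted above (names are illustrative only).
L3_TOTAL_MB = 128       # L3 is the same on both CPUs, per the text
L2_PER_CORE_KB = 512    # per-core L2 on Zen 2, per the text


def total_cache_mb(cores):
    """Return combined L2 + L3 capacity in MB for a given core count."""
    l2_total_mb = cores * L2_PER_CORE_KB / 1024
    return L3_TOTAL_MB + l2_total_mb


for name, cores in (("3970X", 32), ("3960X", 24)):
    print(f"{name}: {total_cache_mb(cores):.0f}MB L2+L3")
```

The 4MB gap between the two chips comes entirely from the eight disabled cores and their L2; the L3 is untouched.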
Quad-channel memory is supported at up to 256GB, a ceiling set by the 32GB capacity limit of currently available DIMMs in an eight-slot board (8 x 32GB). ECC support is available, but this is down to the motherboard vendor to implement. In terms of frequency, the situation is slightly complex. Fundamentally, memory support is the same as with Ryzen 3000 as this is still using the Zen 2 memory controller. However, there are more DIMMs to power and manage for Threadripper’s quad-channel configuration.
For 4-DIMM configurations, whether they’re single rank or dual rank, you get official support for 3200MHz. So, 4x8GB, 4x16GB, and 4x32GB will run at 3200MHz. 8x8GB is officially supported at 2933MHz, 8x16GB at 2667MHz, and 8x32GB at 2667MHz. With that said, motherboard support for speeds of DDR4-3600MHz and above will be widespread, as we have seen with Ryzen 3000.
As is the case with Zen 2 Ryzen 3000, passing DDR4-3600MHz memory speeds will trigger a 2:1 divider between the memory clock and the memory controller clock, which imparts a latency penalty. It is best to either manually overclock or lock the memory controller clock if you’re using higher-than-3600MHz DDR4. DDR4-3600MHz C16 is still a good sweet spot according to AMD, and we still like and continue to use DDR4-3200MHz C14.
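To illustrate the divider behaviour described above, here is a minimal sketch. It assumes the switch from 1:1 to 2:1 happens once the DDR4-3600 mark is passed, as stated in the text, and that the board is left on its default ratio handling. The function and constant names are hypothetical and do not correspond to any AMD tool or UEFI setting.

```python
# Assumption: the 1:1 -> 2:1 switch point is DDR4-3600, as stated in the text.
ONE_TO_ONE_LIMIT = 3600  # DDR4 transfer rate in MT/s


def memory_clocks(ddr4_rate):
    """Return (memory clock, memory controller clock) in MHz at default ratios."""
    mem_clk = ddr4_rate / 2  # DDR transfers data twice per clock cycle
    divider = 1 if ddr4_rate <= ONE_TO_ONE_LIMIT else 2
    return mem_clk, mem_clk / divider


for rate in (3200, 3600, 4000):
    mem, ctrl = memory_clocks(rate)
    print(f"DDR4-{rate}: memory {mem:.0f}MHz, controller {ctrl:.0f}MHz")
```

With these assumptions, DDR4-4000 leaves the memory controller at 1000MHz, well below the 1800MHz it runs at with DDR4-3600, which is where the latency penalty comes from.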
Due to improved connectivity between the CPU and chipset, in addition to the ‘future scalability’ of the TRX40 platform, Ryzen Threadripper 3000 sTRX4 socket CPUs are not backwards compatible with the socket TR4/X399 platform and vice versa. This is annoying to users who have invested heavily in their X399 motherboard or those who were hoping to jump on a cheap Threadripper 2920X before upgrading to the new 24-core or 32-core (or 64-core) model in the future.
However, this doesn’t seem to be a change of socket just for the sake of changing socket. There’s a fatter 8-lane PCIe Gen 4 pipe between the CPU and chipset, which is certainly a feature worthy of a motherboard upgrade. Additionally, the higher TDP for the processors and the promise of a 64-core chip in 2020 seem like valid reasons for AMD’s decision to go with a new socket.
Thankfully, at least your TR4 cooler will continue to work as the mounting points have been maintained.
The CPU features a total of 64 PCIe 4.0 lanes, and motherboard vendors are given a strong amount of flexibility as to their deployment. Directly from the CPU, two sets of four-lane links can be deployed as PCIe 4.0 x4 slots, four SATA connections, or an x4 PCIe 4.0 NVMe mount. This is defined by the motherboard vendor and will vary between models.
That leaves 56 lanes on the CPU but ‘only’ 48 of these are available to the expansion slots or other devices. The remaining 8 PCIe 4.0 lanes are used to connect the Ryzen Threadripper processor with the TRX40 chipset.
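The CPU lane budget can be tallied in a few lines. This is only a sketch of the arithmetic described above; the names are ours and the split simply follows the figures in the text.

```python
# CPU-side PCIe 4.0 lane budget, using the figures quoted above.
CPU_LANES_TOTAL = 64   # total PCIe 4.0 lanes on the CPU
FLEX_LINKS = 2 * 4     # two x4 links: PCIe slots, SATA, or NVMe (vendor's choice)
CHIPSET_UPLINK = 8     # x8 link reserved for the TRX40 chipset


def lane_budget():
    """Return (lanes left after the flexible links, lanes free for slots/devices)."""
    after_flex = CPU_LANES_TOTAL - FLEX_LINKS
    general_purpose = after_flex - CHIPSET_UPLINK
    return after_flex, general_purpose


after_flex, general_purpose = lane_budget()
print(f"After flexible x4 links: {after_flex} lanes")
print(f"Free for slots and devices: {general_purpose} lanes")
```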
That 8-lane Gen 4 connection between the CPU and chipset provides massive bandwidth. And it may be of importance to users shifting a lot of data between the CPU and chipset, such as through hefty RAID arrays, high-speed networking cards, and maybe video capture devices.
Clearly, both in terms of price and core counts, Threadripper 3000 doesn’t have a logical desktop (not Xeon) competitor from Intel given the new Cascade Lake-X HEDT pricing. Intel’s LGA 2066-based HEDT parts are competitors by way of being HEDT platforms, supporting quad-channel memory, and featuring a large number of on-chip PCIe lanes.
That’s the main comparison we will be focussing on in today’s review, despite the significant cost differences between the X299 platform with Intel’s revised Cascade Lake-X pricing and the new TRX40 platform. Also worth highlighting are the performance improvements versus previous-generation Threadripper 2000 processors, especially the 24-core, now-£900-ish Threadripper 2970WX and the 32-core, currently £1700 Threadripper 2990WX.
We expect both of these parts to drop in price to plug the gap between £750-1400 in AMD’s current CPU line up.
Before we dive into the performance for Threadripper 3000, let’s take a closer look at the new TRX40 platform and AMD’s recipe for fixing last year’s problematic Threadripper 2970WX and 2990WX – the Zen 2 IO die.
The 8-lane PCIe Gen 4 link offers roughly 4x the bandwidth of the CPU-to-chipset link seen on previous Threadripper X399 platforms and 4x that of Intel’s X299 platform, which speaks volumes about the TRX40 platform’s heavy IO capacity. That’s important, as even a single PCIe 3.0 x4 SSD can more-or-less saturate the Intel X299 DMI or AMD X399 4-lane chipset link on its own.
Also worth noting is that a Gen 4 x8 link is double the bandwidth of the X570 platform as that also uses PCIe Gen 4.0 but only at 4 lanes width.
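For a rough sense of scale, the sketch below estimates one-direction link bandwidth from the standard per-lane rates (8GT/s for PCIe 3.0, 16GT/s for PCIe 4.0) and 128b/130b encoding. It treats the X299 DMI link as equivalent to PCIe 3.0 x4 and ignores protocol overhead beyond line encoding, so real-world throughput will be lower. The function name is our own.

```python
# Approximate one-direction PCIe link bandwidth (line encoding only).
GT_PER_S = {3: 8.0, 4: 16.0}   # per-lane transfer rate by PCIe generation
ENCODING = 128 / 130           # 128b/130b encoding used by Gen 3 and Gen 4


def link_gbytes_per_s(gen, lanes):
    """Return approximate usable bandwidth in GB/s for a PCIe link."""
    return GT_PER_S[gen] * ENCODING / 8 * lanes


trx40 = link_gbytes_per_s(4, 8)   # TRX40 CPU-to-chipset link
older = link_gbytes_per_s(3, 4)   # X399 link (X299 DMI assumed equivalent)
x570 = link_gbytes_per_s(4, 4)    # X570 CPU-to-chipset link

print(f"TRX40: {trx40:.2f}GB/s, X399/X299: {older:.2f}GB/s, X570: {x570:.2f}GB/s")
print(f"TRX40 vs X399/X299: {trx40 / older:.0f}x, TRX40 vs X570: {trx40 / x570:.0f}x")
```

A single PCIe 3.0 x4 SSD shares the same ceiling as the older 4-lane chipset links, which is why one fast drive can fill them.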
The new connection is a long overdue move, especially for the HEDT platform. It now means that high bandwidth devices and heavy RAID arrays can reasonably be hung off the chipset without being subject to a severe bottleneck when communicating with the CPU. A really smart move, in our opinion. Kudos, AMD.
We already mentioned the high-bandwidth link between the CPU and chipset, but the chipset itself is also different compared to X399. Built on the Global Foundries 14nm process, what we see from TRX40 is a set of features that is very similar to X570.
You get 8 dedicated USB 3.2 Gen 2 10Gbps ports, 4 dedicated USB 2.0 ports for legacy purposes, and 4 dedicated SATA 6Gbps ports. In addition, you get access to 8 general purpose PCIe Gen 4 lanes, and two sets of PCIe 4.0 4-lane links (and the associated bifurcations) or four SATA 6Gbps ports.
Add in the eight-lane PCIe Gen 4 uplink to the Ryzen Threadripper CPU, in addition to the quartet of on-chip USB 3.2 Gen 2 10Gbps ports, and it is clear to see why TRX40 is such a connectivity-heavy platform.
Worth pointing out is that the TRX40 chipset can be power hungry, just like X570. Built on Global Foundries 14nm technology and with a 15W peak TDP according to AMD, we will see most motherboard vendors using active fan cooling for the chipset, as we see with X570.
This is perhaps more important for TRX40 than X570, given the prosumer/workstation platform’s tendency to run IO heavy, sustained workloads for extended periods of time.
Clearly, the Zen 2 architecture and 7nm TSMC technology for the cores are what has given AMD such a strong market potential with Ryzen 3000 and EPYC Rome up to this point. We already know most of the details there, though, so I want to spend a little more time discussing the IO die for Threadripper and its significance.
This is likely to be the single component that holds the key to unlocking the performance of these high-core-count Threadripper chips, given that we saw last year’s Threadripper 2000 WX processors seriously struggle in many use cases.
Ryzen Threadripper in previous generations was, more or less, a cut down EPYC server CPU with certain features disabled, such as additional memory controllers or PCIe lane links. This created NUMA and latency headaches for cores trying to access memory that was off die.
This time round, however, the design for Zen 2-based processors is different due to the use of a central IO die that is physically separate from the core chiplets and connects to them via Infinity Fabric. Being segregated, NUMA and PCIe access penalties should not enter the equation anywhere near as significantly as they did with previous generation Threadripper WX parts.
The IO die for Ryzen Threadripper’s 3960X and 3970X comes in at 416mm² with around 8.34 billion transistors built on Global Foundries’ 12nm process. That’s massive. Absolutely massive, at approximately 5.6 times the size of the 74mm² 7nm TSMC core chiplets. For perspective, a GTX 1080 is 314mm² built on TSMC’s 16nm process, while the newer RTX 2060 is 445mm² with TSMC 12nm lithography.
Despite the physical dimensions looking identical, Threadripper does not deploy the same set of die features that is used for EPYC. The key difference is the reduction in memory channels and PCIe lanes. Ryzen Threadripper’s IO die uses a pair of two-channel DDR4 memory blocks to provide quad-channel capability. This is a reduction versus EPYC Rome’s eight-channel DDR4 support.
Equally, the 64-lane PCIe Gen 4 capacity is provided by a pair of 32-lane blocks – 64 lanes fewer than EPYC Rome gets in its primary blocks (excluding EPYC’s ‘bonus’ links). With Threadripper 3000 in its 4-CCD 3960X and 3970X form, four Infinity Fabric interconnect blocks link each individual core chiplet with the central IO die.
The IO die for Ryzen Threadripper 3000 is clearly a large, complex, and not inexpensive piece of silicon. However, it may also prove to be the stroke of genius for these higher core count parts that enables them to avoid NUMA and accessibility latency penalties that crippled performance, especially in certain Windows workloads, on previous Threadripper 2000 WX offerings.
We also get an interesting insight into the future flagship for the TRX40 platform – the Threadripper 3990X 64-core part that will require eight fully populated CCDs.
Our objective with CPU overclocking is to hit frequencies that we think will be achievable for daily use by the platform’s buyers. As such, we test with sensible cooling hardware in the Cooler Master Wraith Ripper CPU cooler. We also aim to use sensible voltages that lead to manageable thermal results.
Stability is confirmed by running multiple Cinebench tests, Handbrake video conversion, and extended Blender rendering runs. We do not use Prime95 as we have found it to be overly demanding as a stress test application with the more recent AVX-capable versions.
The partnering hardware of choice is the ASUS ROG Zenith Extreme II motherboard, 32GB of 3200MHz CL14 DDR4, and a Seasonic Prime 1000W Titanium PSU.
AMD Ryzen Threadripper 3960X & 3970X Stock Frequencies:
On the 3970X, all-core frequencies hovered around the 3850-3900MHz mark. For our extended Blender Classroom rendering run, the average speed across all cores was 3850MHz. Long back-to-back Cinebench runs also held stable at 3850MHz.
For the 24-core 3960X, all-core frequencies were higher by around 175MHz. The lower core count part was able to use its power headroom to hover at around 4025MHz for our longer Blender Classroom and Cinebench R20 all-core runs.
Maximum boosts on the 3970X regularly hit 4500MHz in our Cinebench R20 1T testing. We saw that the processor was generally happier operating at 4467MHz than 4492MHz, though. The 3960X showed the same boost behaviour with 4492MHz being hit several times but 4467MHz being held far more consistently.
AMD Ryzen Threadripper 3960X & 3970X Precision Boost Overdrive + 200MHz Auto OC Frequencies:
All-core loading under PBO+200MHz Auto OC for the 3970X starts at 4025MHz and settles at 4000MHz average across all cores for our extended Blender Classroom test. Similar behaviour was observed in Cinebench R20, though clock speeds tapered off a little more down to 3950MHz all-core after multiple runs.
Importantly, Cinebench R20 1T clock speeds improved, with a vast majority of time spent at 4492MHz or 4517MHz.
All-core loading under PBO+200MHz Auto OC for the 3960X starts at 4167MHz and settles at 4110MHz average across all cores in our extended Blender Classroom test. Cinebench R20 operates in the same way, with the all-core clock speed tapering down to 4100MHz average after multiple runs.
Cinebench R20 1T clock speeds increased to over 4500MHz average, sitting at 4542MHz for many of our data points. This clearly highlights the benefit of PBO overclocking compared to manual static overclocks.
AMD Ryzen Threadripper 3960X & 3970X Manual Overclocking Frequencies:
Manual overclocking was challenging due to the thermal headaches presented by such high core count processors.
For the 3970X, we were limited to 1.275V in the UEFI using Level 7 loadline calibration and this resulted in 1.248V under load according to CPU-Z. 4.1GHz was the highest frequency we could hold stable at this voltage and temperatures were pushing up against the 100°C mark.
With better cooling, this should be a reasonable frequency to achieve, but it takes significant cooling power to gain this 250MHz versus the stock Precision Boost 2 clocks, plus you lose the loftier low-core-count boost benefits.
This required over 500W wall power, so the PSU quality is important here too.
For the 3960X, we found that 1.3V with Level 7 loadline calibration resulted in 1.28-1.288V under load and manageable temperatures. This voltage got us a stable 4.25GHz, which is an improvement of around 200MHz versus stock Precision Boost 2 but a 250MHz reduction in the maximum boost clock.
We will be outlining the AMD Ryzen Threadripper 3960X and 3970X CPUs’ performance while using the ASUS ROG Zenith Extreme II motherboard. A 32GB (4x8GB) kit of 3200MHz CL14 DDR4 memory serves our test system.
Given the likely usage scenarios for such high core count HEDT processors, where a few percent additional frequency is not worth the risk of instability that could cost real projects real money, we feel that Precision Boost Overdrive is the best way to push these CPUs. Precision Boost Overdrive also maintains preferential maximum boost clocks whilst eliminating instability and overheating concerns.
As such, our performance testing will focus on both Threadripper 3000 CPUs running at their stock Precision Boost 2 settings with the Cooler Master Wraith Ripper CPU cooler and while overclocked using Precision Boost Overdrive and the +200MHz setting.
Today’s comparison processors come in the form of:
- Coffee Lake Core i9-9900K (8C16T) and Core i9-9900KS (8C16T).
- Cascade Lake-X Core i9-10980XE (18C36T).
- Skylake-X Core i9-9960X (16C32T) and Core i9-9980XE (18C36T).
- Matisse ‘Zen 2’ Ryzen 9 3900X (12C24T) and Ryzen 9 3950X (16C32T).
- Zen+ Ryzen Threadripper 2950X (16C32T), Ryzen Threadripper 2970WX (24C48T), and Ryzen Threadripper 2990WX (32C64T).
Each processor is tested at its default out-of-the-box settings. We also include reasonable overclocking performance data where relevant.
For the Intel CPUs, forced turbo is enabled by default when XMP is enabled and, in most scenarios, cannot be disabled. As such, we test using the forced turbo frequencies with the Intel processors.
All-core load frequencies for the tested chips are as follows:
- Core i9-9900KS = 5.0GHz.
- Core i9-9900K = 4.7GHz.
- Core i9-9980XE = 3.8GHz (without AVX2/AVX-512 reductions).
- Core i9-9960X = 4.0GHz (without AVX2/AVX-512 reductions).
- Core i9-10980XE = 3.8GHz (without AVX2/AVX-512 reductions).
- Ryzen Threadripper 2950X = Around 3550-3600MHz.
- Ryzen Threadripper 2970WX = Around 3350-3400MHz.
- Ryzen Threadripper 2990WX = Around 2975-3100MHz.
- Ryzen Threadripper 3960X = Around 4025MHz.
- Ryzen Threadripper 3970X = Around 3850-3900MHz.
- Ryzen 9 3900X = Around 4050MHz (AGESA 1.0.0.3ABBA).
- Ryzen 9 3950X = Around 3800-3950MHz (AGESA 184.108.40.206).
CPU Test System Common Components:
- Graphics Card: Gigabyte Aorus RTX 2080 Ti XTREME (custom fan curve to minimise thermal throttling).
- Memory: 16GB (2x8GB) G.Skill 3200MHz 14-14-14-34 DDR4 @ 1.35V (4x8GB for quad-channel systems).
- CPU Cooler: Corsair H100X with 2435 RPM SP120L fans (Cooler Master Wraith Ripper for Threadripper platforms).
- Games SSD: Aorus 2TB PCIe Gen 4 M.2 SSD & Crucial MX300 750GB SATA SSD.
- Power Supply: Seasonic Prime Titanium 1000W.
- Operating System: Windows 10 Pro 64-bit 1903 Update.
sTRX4 System (Ryzen Threadripper 3960X, Ryzen Threadripper 3970X):
- Ryzen Threadripper 3970X CPU: AMD Ryzen Threadripper 3970X ‘Castle Peak’ 32 cores, 64 threads (PBO+200MHz overclocked).
- Ryzen Threadripper 3960X CPU: AMD Ryzen Threadripper 3960X ‘Castle Peak’ 24 cores, 48 threads (PBO+200MHz overclocked).
- Motherboard: ASUS ROG Zenith Extreme II (sTRX4, TRX40, 0601 BIOS, AMD AGESA CastlePeakPI-SP3r3-220.127.116.11).
- System Drive: ADATA SX8200 480GB PCIe NVMe SSD.
- CPU Cooler: Cooler Master Wraith Ripper.
LGA 1151 Rev. 2 System (i9-9900K, i9-9900KS):
- Core i9-9900K CPU: Intel Core i9-9900K ‘Coffee Lake’ 8 cores, 16 threads (4.9GHz @ 1.3V overclocked).
- Core i9-9900KS CPU: Intel Core i9-9900KS ‘Coffee Lake’ 8 cores, 16 threads (5.2GHz @ 1.375V overclocked).
- Motherboard: Gigabyte Z390 Aorus XTREME (LGA 1151 rev. 2, Z390, F7 BIOS).
- System Drive: Plextor M9Pe 512GB PCIe NVMe SSD.
AM4 System (Ryzen 9 3900X, Ryzen 9 3950X):
- Ryzen 9 3900X CPU: AMD Ryzen 9 3900X ‘Matisse’ 12 cores, 24 threads (4.25GHz @ 1.35-1.4V overclocked).
- Ryzen 9 3950X CPU: AMD Ryzen 9 3950X ‘Matisse’ 16 cores, 32 threads (PBO+200MHz overclocked).
- Motherboard: Gigabyte X570 Aorus Master (AM4, X570, F10a BIOS, AMD AGESA Combo-AM4 18.104.22.168).
- System Drive: WD Black SN750 500GB PCIe NVMe SSD.
TR4 System (Ryzen Threadripper 2950X, Ryzen Threadripper 2970WX, Ryzen Threadripper 2990WX):
- Ryzen Threadripper 2990WX CPU: AMD Ryzen Threadripper 2990WX ‘Colfax’ 32 cores, 64 threads.
- Ryzen Threadripper 2970WX CPU: AMD Ryzen Threadripper 2970WX ‘Colfax’ 24 cores, 48 threads.
- Ryzen Threadripper 2950X CPU: AMD Ryzen Threadripper 2950X ‘Colfax’ 16 cores, 32 threads (PBO overclocked).
- Motherboard: Gigabyte X399 Gaming 7 (TR4, X399, F12h BIOS, AMD AGESA SummitPI-SP3r2-22.214.171.124).
- System Drive: ADATA XPG SX950 240GB SATA SSD.
- CPU Cooler: Cooler Master Wraith Ripper.
LGA 2066 System (i9-9960X, i9-9980XE, i9-10980XE):
- Core i9-10980XE CPU: Intel Core i9-10980XE ‘Cascade Lake-X’ 18 cores, 36 threads (4.6GHz @ 1.165V overclocked).
- Core i9-9960X CPU: Intel Core i9-9960X ‘Skylake-X’ 16 cores, 32 threads (4.5GHz @ 1.2V overclocked).
- Core i9-9980XE CPU: Intel Core i9-9980XE ‘Skylake-X’ 18 cores, 36 threads (4.5GHz @ 1.175V overclocked).
- Motherboard: Gigabyte X299X Designare 10G (LGA 2066, X299, F3a BIOS).
- System Drive: Corsair Neutron XT 480GB SATA SSD.
- GeForce 440.97 VGA drivers.
- Cinebench R15 – All-core & single-core CPU benchmark (CPU)
- Cinebench R20 – All-core & single-core CPU benchmark (CPU)
- Blender 2.79b – All-core rendering of the BMW benchmark (CPU)
- Adobe Media Encoder 2020 – Export a 25 minute and 7 second Premiere Pro 2020 project to the YouTube 4K H.264 40Mbps preset. The Premiere Pro project features a mix of colour correction and motion adjustment, PNG or JPEG static images, and 100Mbps 4K30 H.264 A-roll and B-roll. The video is our ‘ASRock X570 Taichi Motherboard Review’ on YouTube (CPU & Memory).
- HandBrake x264 – Convert 1440p60 H264 video to 1080p60 H264 using the YouTube HQ 1080p60 preset (CPU & Memory)
- HandBrake x265 – Convert 4K30 100Mbps H264 video to 1080p30 40Mbps H265 using the H.265 MKV 1080p30 preset (CPU & Memory)
- 7-Zip v19.00 – Built-in 7-Zip benchmark test (CPU & Memory)
- SiSoft Sandra – Memory bandwidth and Cache & Memory Latency Test (Memory)
- AIDA64 – Memory bandwidth, memory latency, memory & cache latency (Memory)
- 3DMark Time Spy – Time Spy (DX12) test (Gaming)
- Ashes Escalation – Built-in benchmark tool, 1920 x 1080, Crazy quality preset, CPU-Focused Test, DX12 (Gaming)
- Deus Ex: Mankind Divided – Built-in benchmark tool, 1920 x 1080, Ultra quality preset, no AA, DX12 version (Gaming)
- Far Cry 5 – Built-in benchmark tool, 1920 x 1080, Ultra quality preset, DX12 (Gaming)
- Ghost Recon: Wildlands – Built-in benchmark tool, 1920 x 1080, Ultra quality preset, DX12 (Gaming)
- Grand Theft Auto V – Built-in benchmark tool, 1920 x 1080, Maximum quality settings, Maximum Advanced Graphics, DX11 (Gaming)
- Hitman 2 – Built-in benchmark tool – Mumbai scene, 1920 x 1080, Ultra quality preset, DX12 (Gaming)
- Shadow of the Tomb Raider – Built-in benchmark tool, 1920 x 1080, Highest quality preset, no AA, DX12 version (Gaming)
- The Division 2 – Built-in benchmark tool, 1920 x 1080, Ultra quality preset, no AA, DX12 version (Gaming)
Starting off with Cinebench R15, we see AMD continue its highly competitive performance in well multi-threaded tile-based rendering software. Both the 3960X and 3970X set new levels of performance in this application, comfortably outperforming the best Intel and previous-gen AMD HEDT competitors.
Stock versus stock, the 32-core Zen 2 part outperforms its 32-core Zen+ predecessor by an impressive 48%. 24-core versus 24-core, the 3960X wins by 39%. These victories are driven by architectural improvements for Zen 2 and, more importantly, higher operating frequencies that the efficient 7nm process enables.
Versus Intel’s $1000 flagship Core i9-10980XE, the stock versus stock performance lift is 60% for the 3960X and 98% for the 3970X. OC versus OC, AMD’s new 24-core is now 37% quicker than Intel’s i9 and the 32-core flagship is 69% better.
Single-threaded performance in Cinebench R15 is strong, just as we have seen with other Zen 2-based processors. AMD’s HEDT chips are able to boost to 4.5GHz (or slightly higher when overclocked with PBO), allowing them to throw a reasonable clock frequency at the workload. The higher clocked Intel Coffee Lake and Ryzen 3000 mainstream processors still rule the 1T chart thanks to higher operating speeds.
Compared to previous generation Threadripper, the performance increase is substantial at 22-23%. The Cascade Lake-X is beaten but its stock operating mode puts up a decent fight thanks to the lofty 4.8GHz maximum boost frequency for this single-thread test.
We see another performance demolition of the alternative processors by the new Threadripper parts in Cinebench R20. The 32-core 3970X hits almost 17,000 points out of the box, making it 52% faster than the 2990WX it replaces. The 3960X’s lead over its 24-core predecessor is 42%.
Compared to the $1000 Intel i9-10980XE, the 40% pricier AMD 3960X is 57% faster stock versus stock and 35% better OC versus OC. AMD’s 32-core flagship offers almost double the leading Intel Core i9 chip’s performance out of the box. Worth noting is the solid fight that AMD’s Ryzen 9 3950X mainstream chip puts up. Yes, the entry-level Threadripper 3000 chip is 50% faster but it also costs 87% more.
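One way to weigh those performance and price gaps against each other is a simple performance-per-dollar ratio. The sketch below uses only the percentages quoted above; the function name is ours, and a result above 1.0 means the 3960X delivers more Cinebench R20 performance per dollar than the chip it is compared against.

```python
def relative_value(perf_gain_pct, price_gain_pct):
    """Return performance-per-dollar relative to the baseline chip (1.0 = parity)."""
    perf_ratio = 1 + perf_gain_pct / 100
    price_ratio = 1 + price_gain_pct / 100
    return perf_ratio / price_ratio


# 3960X vs Core i9-10980XE: 57% faster at stock, 40% pricier (figures from the text).
vs_intel = relative_value(57, 40)
# 3960X vs Ryzen 9 3950X: 50% faster, 87% pricier (figures from the text).
vs_3950x = relative_value(50, 87)

print(f"3960X vs i9-10980XE: {vs_intel:.2f}x performance per dollar")
print(f"3960X vs 3950X: {vs_3950x:.2f}x performance per dollar")
```

This covers CPU price only. Platform costs, which differ significantly between TRX40, X299 and X570, would shift the picture.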
As a side note, our manual overclock to 4.25GHz with the 3960X netted a multi-core score of 14294 points, making it slightly faster than the PBO mode at the expense of power efficiency, stability concerns, and higher operating temperatures. That’s not worth it, in our opinion, but you may disagree.
As we saw with R15, Cinebench R20 single-thread performance is strong with Threadripper 3000 thanks to the Zen 2 architecture and respectable boost frequencies. To beat out either of AMD’s new HEDT flagships, you’ll be looking at a highly clocked Ryzen 3000 AM4 processor or a 5GHz+ Coffee Lake chip. Even the Core i9-10980XE and its 4.8GHz maximum single-core clock speed is handily beaten by the new Threadripper chips. And the performance gap to last year’s Threadripper Zen+ offering is significant.
As a side note, the manual 4.25GHz all-core overclock for the 3960X delivered 489 points, which is a significant decrease versus the stock and PBO results.
It is impressive to see these high core count parts offer market-leading performance in multi-threaded rendering workloads that is backed up by highly competitive single-threaded performance in Cinebench. We’re so accustomed to choosing one or the other that it’s almost bizarre to see the 24- and 32-core Threadripper processors so high in 1T testing charts. Precision Boost 2, combined with Zen 2, is clearly paying dividends.
Blender BMW Benchmark
The 3970X takes top spot, beating its next closest competitor – the 24-core 3960X – by 19% stock versus stock (which converts to 23% higher performance). Generationally, the render time decrease from the 32-core 2990WX to the 32-core 3970X is a hefty 30 seconds or 30%. That converts into 43% higher performance for the 3970X. The 3960X’s generational render time reduction versus its Zen+ predecessor is 29 seconds or 25%, which converts into 34% higher performance.
Compared to Intel’s Core i9-10980XE, even the heavily overclocked Intel part cannot match AMD’s Threadripper 3000 offerings, though AMD’s chips are more expensive. The $1400 Threadripper 3960X is 39% faster than the Core i9-10980XE at stock, which spells 65% higher performance. That gap reduces to 19% when both chips are overclocked (which means AMD’s 24-core is 24% higher performance). The flagship Threadripper 3970X delivers a render time just under half that of Intel’s Cascade Lake-X flagship. Translate that into a numerical value and AMD’s leader is delivering 103% more performance for its roughly 100% price increase.
If you do a lot of Blender rendering and make money from that work, AMD’s Ryzen Threadripper 3000 processors may justify their hefty price tags by the unparalleled performance they offer.
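The Blender figures above quote both a render time reduction and the equivalent performance increase, and the two are not the same number. A minimal sketch of the conversion follows; the function name is ours. Our chart percentages are derived from unrounded render times, so results computed from the rounded reductions can differ by a percentage point.

```python
def performance_uplift_pct(time_reduction_pct):
    """Convert a render time reduction (%) into the equivalent performance gain (%)."""
    remaining_time = 1 - time_reduction_pct / 100
    return (1 / remaining_time - 1) * 100


# Render time reductions quoted in the Blender results above.
for reduction in (19, 25, 30, 39):
    uplift = performance_uplift_pct(reduction)
    print(f"{reduction}% less render time = {uplift:.1f}% more performance")
```

The relationship is non-linear. Halving the render time doubles performance, which is why the 3970X’s just-under-half render time versus the Core i9-10980XE translates to roughly 100% more performance.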
AMD’s Zen architecture has always been strong when it comes to file decompression, so it comes as no surprise to see the Threadripper 3000 chips delivering crushing victories. The performance improvement versus Zen+ predecessors is 80% for the 32-core 3970X and 56% for the 24-core 3960X.
Both Threadripper 3000 CPUs also comfortably outperform the Intel HEDT flagship, with AMD’s 32-core part boosted by PBO actually offering more than double the performance of Intel’s heavily overclocked 18-core.
Plentiful bandwidth and a high core count combine to deliver strong 7-Zip compression performance for Threadripper 3000, especially when compared to the 2990WX and 2970WX with their off-die memory accessibility challenges.
Versus the roughly $1000 Intel HEDT flagship, AMD’s $1400 Threadripper 3960X is 39% quicker stock versus stock and 41% OC versus OC. This is a rare situation where preferential boost clocks delivered via PBO overclocking help AMD’s 24-core chip stretch its lead versus the Core i9-10980XE and its static 4.6GHz overclock.
The 32-core 3970X is faster again, though you’d need some pretty hefty or simultaneous 7-Zip compression workloads to take advantage of its higher core count.
Adobe Media Encoder 2020
Despite Adobe Premiere Pro and Media Encoder commonly getting criticism for being poorly optimised to take advantage of high-end hardware, the Threadripper 3000 processors were actually able to offer the highest performance in our testing. We didn’t have access to the Core i9-10980XE for this test; however, Threadripper 3000’s performance improvement over the Skylake-X 18-core, even when that’s overclocked to a lofty 4.5GHz, is clear.
Our H.264 YouTube 4K export isn’t demanding enough to saturate the 24-core or 32-core chips, as proven by the minor performance differences between the 3960X and 3970X. That leaves CPU cycles spare to work on other tasks, such as Lightroom photo editing or Handbrake video conversion, while also exporting the Adobe Premiere project.
We also see a solid performance improvement from the 16-core Ryzen 9 3950X to the 24-core Threadripper 3960X, and that may be important if you do a lot of high-value video editing in Premiere.
No memory or latency issues with these Zen 2 Threadripper processors were observed; the performance jumps over the 2970WX and 2990WX are colossal.
Even with Handbrake’s inability to use all the CPU resources in this H.264 conversion test, both Threadripper 3000 chips lay their claim to top spot in our chart.
Stock versus stock, the 3970X beats its 2990WX predecessor by 92% and Intel’s i9-10980XE by 54%. The lower performance for Threadripper 2000 WX parts is tied to their unique memory controller topology that penalises some cores with latency penalties for accessing resources that are off-die. The central IO chiplet design for Zen 2 and Threadripper 3000 clearly alleviates all such issues for Handbrake.
Intel’s HEDT parts come back with a strong fight when their sizeable frequency headroom is leveraged. With the i9-10980XE overclocked to 4.6GHz, the Threadripper 3970X running PBO has its performance lead cut to 9%, while the overclocked 3960X only maintains a gap of +8%.
Intel’s Skylake-based Core architecture and robust AVX implementation clearly play a large role in offsetting significant core count deficits in Handbrake. That’s especially true when the frequency shackles and AVX-based clock speed reductions seen at stock conditions are removed. AMD still maintains a performance lead in this test, but Handbrake cannot use all 24 or 32 cores, meaning that the Threadripper chips struggle to fully flex their muscle versus more affordable Intel parts.
And when we see overall resource utilisation drop even further on high core count chips in our Handbrake x265 workload, Intel’s higher frequency headroom and robust AVX performance deliver the win.
Granted, with all chips at stock, the 3970X and 3960X are faster than Intel’s 18-core flagship. Remove the AVX-induced speed reductions by overclocking the Intel HEDT part and it enjoys a performance lead of 25% over AMD’s $2000 flagship.
One key difference to point out here is overall resource utilisation, with the higher core count Threadripper 3000 parts understandably having more free resources to do other tasks, such as simultaneous video conversions… or gaming, if you’d prefer. However, if your workload only requires single-item video conversion and you cannot stack multiple tasks simultaneously, the 24 and 32 cores offered by Threadripper 3000 are clearly overkill. At least the performance increase over Threadripper 2000 WX parts is massive thanks to the improved IO topology.
Sandra Memory Bandwidth
Memory bandwidth registers at over 60GBps with our 3200MHz C14 DDR4 sticks in quad channel. One interesting trend that we observed for both Threadripper 2000 WX and Threadripper 3000 is that the 24-core part consistently scores higher in the SiSoft test than its 32-core sibling. This is an odd, and repeatable, observation and could be related to the way in which the benchmark interacts with the memory on a per-core basis.
Either way, you get a healthy serving of memory bandwidth from Threadripper 3000 without any clear headaches caused by the centralised IO die and Infinity Fabric interconnects.
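For context on that 60GBps-plus reading, the theoretical peak for the test configuration can be worked out from the DDR4 transfer rate and channel count. The sketch below assumes a standard 64-bit (8-byte) DDR4 channel; the function name is ours, and the result is a ceiling rather than a figure any benchmark is expected to reach.

```python
BYTES_PER_TRANSFER = 8  # one 64-bit DDR4 channel moves 8 bytes per transfer


def peak_bandwidth_gbs(ddr4_rate, channels):
    """Return theoretical peak memory bandwidth in GB/s (decimal gigabytes)."""
    return ddr4_rate * BYTES_PER_TRANSFER * channels / 1000


quad = peak_bandwidth_gbs(3200, 4)   # Threadripper test configuration
dual = peak_bandwidth_gbs(3200, 2)   # dual-channel mainstream platforms

print(f"Quad-channel DDR4-3200 peak: {quad:.1f}GB/s")
print(f"Dual-channel DDR4-3200 peak: {dual:.1f}GB/s")
print(f"A 60GB/s result is {60 / quad:.0%} of the quad-channel peak")
```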
AIDA64 Memory Performance
Memory latency is, however, an area where the central IO die seems to provide a performance penalty versus Intel HEDT and Threadripper 2000 WX processors. If you have workloads that are particularly sensitive to memory latency, perhaps such as system transactions for high-frequency trading, Threadripper’s higher latency versus even the Intel HEDT Cascade Lake-X flagship is important to note.
Higher cache capacities may reduce the negativity of this in real workloads, but the latency increase to system memory still exists. We’ll have to see if this latency penalty has a significantly negative impact on gaming performance.
Ashes of the Singularity Escalation is a heavily multi-threaded DX12 title that features a CPU-focused benchmark test. We use the game’s built-in benchmark with quality set to Crazy.
Deus Ex: Mankind Divided
Despite its age, Deus Ex: Mankind Divided remains a demanding title even for modern hardware. We use the game’s built-in benchmark with quality set to Ultra, MSAA disabled, and DX12 mode.
Far Cry 5
We use the Far Cry 5 built-in benchmark with quality set to Ultra.
Grand Theft Auto V
Grand Theft Auto V remains an immensely popular game for PC gamers and as such retains its place in our test suite. The DX11-built game engine is capable of providing heavy stress to a number of system components, including the GPU, CPU, and memory.
We run the built-in benchmark using a 1080p resolution and generally Maximum quality settings (including Advanced Graphics).
It is clear to see that the Ryzen Threadripper 3960X and 3970X are both perfectly competent gaming processors. That comes as no real surprise given that the central IO die avoids the memory and PCIe accessibility headaches that plagued Threadripper 2970WX and 2990WX gaming numbers.
Overall, you’ll get performance that is typically a bit slower than the mainstream Ryzen 3000 processors and their generally higher operating frequencies. Of course, that’s when the gaming scenario is not GPU limited and the CPU performance differences are observable, such as high refresh rate 1080p gaming.
The Far Cry 5 result was lower than one would expect, with this title showing particularly sporadic performance from boot to boot. An easy way to fix this is to apply Game Mode within Ryzen Master, reboot the system as required, and drop down to 8-core, 16-thread operation. This boosted our Far Cry 5 FPS numbers on the 3970X to 113 average at 1080p.
Ghost Recon Wildlands
We run the built-in benchmark using a 1080p resolution and the Ultra quality preset.
Hitman 2
We run the built-in benchmark using the Mumbai scene with image quality set to Ultra and the DirectX 12 mode enabled.
Shadow of the Tomb Raider
We run the built-in benchmark using the DirectX 12 mode, anti-aliasing disabled, and the Highest quality preset.
The Division 2
We use the game’s built-in benchmark with quality set to Ultra, VSync disabled, and DX12 mode.
Compared to the Intel HEDT flagship Core i9-10980XE, Threadripper 3000 generally loses out by a few percentage points for average FPS. The margins are small, but Intel’s HEDT chip does beat out Threadripper more often than not. Game Mode also fixes Hitman 2 performance by increasing the average FPS to 110.
Clearly, Game Mode is no longer required to deliver solid gaming performance in most titles, as it was with Threadripper 2000 WX. It is now more of a troubleshooting feature that can be used when the odd game here and there decides it doesn’t want to play nice with these high core count chips.
Either way, there’s no reason why Threadripper can’t be used to deliver a smooth gaming experience that is comparable to the mainstream Ryzen processors. That is, unless you’re counting every last frame in your game, of course. We’d say that perfectly good gaming performance with very few real areas for concern is a nice bonus for $1400 and $2000 HEDT chips.
The market for people buying an expensive CPU and using it for gaming at 1080p is likely to be limited. What 1080p does is give a good indication of the CPU’s raw gaming performance as GPU power is sufficient to push frame rates to a level where the CPU and memory limitations can be observed.
We supplement the 1080p gaming results with three games tested at 2560×1440 resolution. We chose Deus Ex: Mankind Divided, Far Cry 5, and The Division 2. Deus Ex and The Division 2 are particularly GPU heavy at higher resolutions and Far Cry 5 is a (relatively) computationally-heavy, open-world game.
Deus Ex: Mankind Divided
We run the built-in benchmark using a 2560×1440 resolution and the same settings as the 1080p test (Ultra preset).
Far Cry 5
We run the built-in benchmark using a 2560×1440 resolution and the same settings as the 1080p test (Ultra preset).
The Division 2
We run the built-in benchmark using a 2560×1440 resolution and the same settings as the 1080p test (Ultra preset).
1440p performance shows much of the same trend, with AMD’s Game Mode in Ryzen Master being equally capable of boosting Far Cry 5 performance to 110 FPS average.
We leave the system to idle on the Windows 10 desktop for 10 minutes before taking a power draw reading. For CPU load results, we read the power draw while producing approximately 5 minutes’ worth of runs of the Cinebench R20 multi-threaded test. We also run the Blender Classroom Rendering stress test.
Both Cinebench and Blender are used instead of synthetic stress tests such as AIDA64. This is because some CPUs – most notably Intel’s HEDT Core processors when operating under default turbo conditions – will heavily reduce their clock speed with the AVX-based AIDA64 workload, thus giving an unrepresentative reading.
The power consumption of our entire test system (at the wall) is shown in the chart. The same test parameters were used for temperature readings.
Power draw readings are accurate to around +/-5W under heavy load due to instantaneous fluctuations in the value. We use a Titanium-rated Seasonic 1000W Prime PSU (with 8-pin plus 4-pin or 8-pin plus 8-pin power connectors where possible). We do not yet have full data for our new Blender Classroom reading for all comparison CPUs.
We know that the combination of AMD’s Zen 2 architecture and TSMC’s 7nm process technology can make for an efficient pairing, but Threadripper’s focus is performance across many cores.
The 280W TDP chips demand around 410W system-wide power from the wall when running Cinebench R20, and just under 400W for the Blender Classroom render. This puts their power consumption numbers higher than those of the 250W TDP 2970WX and 2990WX from the previous generation, albeit with significantly higher operating clocks.
Compared to Intel’s 36-thread Core i9-10980XE operating in its 3.8GHz multi-core turbo mode that is applied when XMP is enabled, both Threadripper parts command around 80W more system-wide power. That’s quite a sizeable chunk of additional energy, though it must be viewed with reference to AMD’s higher Cinebench performance, also.
Overclocking the HEDT chips forces Intel’s 18-core parts to the bottom of the chart, with our 4.6GHz Core i9-10980XE sample registering over 500W system-wide power draw running Blender. The Threadripper 3000 chips weren’t far behind, with their Cinebench numbers hitting around 480W system-wide and software reporting 345-365W CPU package power draw.
Put simply, AMD’s Threadripper 3960X and 3970X demand a lot of power, which puts emphasis on a quality motherboard VRM, CPU cooler, and power supply. And that’s without factoring in a chassis that can happily flush away more than 400W of heat.
Performance Per Watt – Cinebench R20 nT
Factoring in performance for those sizeable power requirements, both Threadripper chips show the superb efficiency that we have come to expect from Zen 2 and 7nm. The power efficiency improvements for Threadripper 3000 versus Intel’s cheaper Cascade Lake-X flagship are sizeable in Cinebench. Even the 24-core 3960X registers as higher efficiency than the Core i9-10980XE. And that AMD chip is likely to use lesser binned silicon and spends more of its power budget chasing frequency, rather than sitting within an operating window for peak efficiency.
While electricity prices generally aren’t significant enough to sway an individual’s purchase between CPUs, that thought process may change for HEDT chips such as these that will be put to use inside workstations. Perhaps a design consultancy has a team of 50 engineers, all of whom push their workstations hard on a daily basis. That logic starts to place more emphasis on the value of the compute efficiency offered by Threadripper 3000.
Getting more work done per unit power is rarely a bad thing.
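To illustrate how raw power draw and efficiency interact, here is a rough sketch using only the wall-power figures quoted earlier: around 410W for Threadripper in Cinebench R20, and around 80W less for the Core i9-10980XE with XMP enabled. The 330W value is our inference from those two numbers rather than a separately quoted reading, and the function name is hypothetical.

```python
def break_even_speedup(power_a_watts: float, power_b_watts: float) -> float:
    """Score ratio (A / B) at which system A equals system B on performance per watt."""
    return power_a_watts / power_b_watts


# Approximate system-wide wall power under Cinebench R20 (from the text)
threadripper_wall_w = 410.0
# Inferred: Threadripper draws "around 80W more" than the Intel system
intel_wall_w = threadripper_wall_w - 80.0

ratio = break_even_speedup(threadripper_wall_w, intel_wall_w)
print(f"Break-even Cinebench score ratio: {ratio:.2f}x")
```

At these power levels, any Cinebench advantage beyond roughly 24% tips performance per watt in favour of the AMD system. Bear in mind that these are whole-system figures measured at the wall, not CPU package power.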
Performance Per £ (price efficiency) – Cinebench R20 nT
Based on retail prices in the UK, as of week commencing 18th November 2019, we can see that the performance per £ for Threadripper 3000 is pretty low, given the high asking prices. This is in stark contrast to AM4 Ryzen 3000, which generally offers superb performance value using our Cinebench data.
Clearly, AMD is charging a price premium for such high core count processors. This chart also doesn’t factor additional value-added features, such as PCIe lanes and memory capacity, all of which command a price premium but do not contribute to Cinebench scoring.
Whichever way you look at it, AMD’s Ryzen Threadripper 3000 processors offer highly competitive performance, but AMD most certainly expects you to pay appropriately for that privilege. You’d have to be in a situation where increased compute performance brings more profit to your business in order to offset the reduced performance per £ of the Threadripper 3970X and 3960X.
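As a rough illustration of the price-efficiency arithmetic, the sketch below compares the two Threadripper chips against each other using the UK MSRPs that AMD has confirmed (£1349 and £1899), not the retail prices behind the chart. It makes the simplifying assumption that performance scales at best linearly with core count, and the variable names are ours.

```python
# UK MSRPs confirmed by AMD (see the pricing details in this review)
msrp_3960x_gbp = 1349.0
msrp_3970x_gbp = 1899.0

# Speedup the 3970X needs over the 3960X to match it on performance per £
required_speedup = msrp_3970x_gbp / msrp_3960x_gbp

# Assumed best case: throughput scales linearly with core count (32 vs 24)
core_ratio = 32 / 24

print(f"Required speedup for equal performance per £: {required_speedup:.2f}x")
print(f"Core count ratio: {core_ratio:.2f}x")
print("3970X can match on cores alone:", core_ratio >= required_speedup)
```

With a price ratio of about 1.41x against a core ratio of about 1.33x, the 32-core part would be expected to trail its 24-core sibling on performance per £ at MSRP, even with ideal scaling.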
Temperature recordings were taken using a Cooler Master Wraith Ripper TR4-specific air cooler. Ambient temperatures were around 24°C. We do not yet have full data for our new Blender Classroom reading for all comparison CPUs. Competing CPUs without the asterisk in the chart were tested with a Corsair H100X 240mm AIO and are therefore not directly comparable.
We use the Cooler Master Wraith Ripper air cooler as its base plate is designed specifically to provide full coverage on Threadripper. We also feel that air cooling makes a lot of sense for HEDT workstation parts such as these, where the added potential points of failure from liquid cooling perhaps do not justify any additional performance that may be obtainable.
Despite the toasty 280W TDP for both Threadripper 3000 CPUs, their operating temperatures at stock are very reasonable. With stock temperatures at around 80°C or below, there is clearly headroom to push the chips a little further by opening up the power budget.
Precision Boost Overdrive feeds extra power and elevates the clock speeds, but this starts to put us into territory where we’re paying closer attention to temperatures, especially with the 24-core 3960X’s 93°C result.
Staying below 95°C, even when overclocked, proves that you don’t need to break the bank when budgeting for cooling of these 24- and 32-core processors. Our £120 Wraith Ripper did a solid job, but we’d wager that Noctua’s £75 U14S TR4 would deliver equally stellar cooling.
With the Ryzen Threadripper 3970X, AMD has taken the performance crown in rather convincing fashion, beating out Intel’s 18-core flagship in heavily multi-threaded, and even many lightly threaded, workloads. And the performance issues that made many potential buyers hesitant with last year’s Threadripper 2990WX 32-core and 2970WX 24-core have been basically eliminated.
AMD’s core chiplet and central IO die design for Zen 2 really has been a stroke of genius that has allowed performance to scale across the Ryzen 3000 and Threadripper 3000 lines, without the performance penalties previously seen for Zen+ Threadripper flagships. Even the $1400 24-core Ryzen Threadripper 3960X serves up hefty beatings for Intel’s Core i9-10980XE in many workloads, albeit with a 40% price premium.
Put simply, Threadripper 3000 performs extremely well in tasks that can handle the large number of cores and threads. Tile-based rendering in Cinebench and Blender works superbly. Handbrake encoding for our x264 workload, where many cores can be leveraged, delivers strong performance. And the 7-Zip compression and decompression results are outstanding.
With that said, our Handbrake x265 and Adobe Media Encoder workloads show that there are still situations where it is difficult to leverage all 48 or 64 threads, even for what are generally perceived as demanding tasks. Situations such as this can be alleviated if you can stack work simultaneously, such as Handbrake converting your video to x265 whilst also exporting 4K YouTube content through Adobe Media Encoder. That’s a smart use case for these high core count chips that are difficult to saturate with conventional workloads.
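For readers who want to script this kind of job stacking, a minimal sketch follows. It launches several command lines at once as separate processes and waits for all of them. The two jobs shown are harmless placeholders rather than real Handbrake or Adobe Media Encoder invocations, and `run_stacked` is a hypothetical helper name; you would substitute your own encoder command lines.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_stacked(commands):
    """Run every command concurrently as its own process; return exit codes in order."""
    def run(cmd):
        return subprocess.run(cmd, check=False).returncode

    # One worker thread per job, so all processes are started side by side
    with ThreadPoolExecutor(max_workers=max(1, len(commands))) as pool:
        return list(pool.map(run, commands))


# Placeholder jobs standing in for two independent encode tasks
placeholder_jobs = [
    [sys.executable, "-c", "print('placeholder encode job A finished')"],
    [sys.executable, "-c", "print('placeholder encode job B finished')"],
]

exit_codes = run_stacked(placeholder_jobs)
print("Exit codes:", exit_codes)
```

The operating system scheduler spreads the processes across the available cores, which is what allows otherwise idle threads on a 24- or 32-core chip to be put to work.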
Precision Boost 2 means that both Ryzen Threadripper 3000 chips deliver most of their performance out of the box, provided you have a good motherboard and CPU cooler. Overclocking headroom is limited unless you have extremely high-end cooling, though Precision Boost Overdrive does deliver minor gains with very little downside.
Intel, however, still maintains an advantage when it comes to maximum frequency for all-core loads, and that’s thanks to its refined 14nm process technology. Pushing the Core i9-10980XE to 4.6GHz allows it to offset some of its thread count deficit, and the high clock speeds prove beneficial in situations such as Handbrake video conversion, where not all of the cores can be completely utilised on Threadripper parts.
With that said, even at a lofty 4.6GHz clock speed, the Intel Cascade Lake-X flagship is realistically in a lower performance tier for workloads that utilise many threads efficiently.
AMD’s transition to a new socket will, no doubt, annoy users who have invested heavily in an X399/TR4 motherboard. However, the sTRX4 socket is brought online to allow for the platform to scale to 64 cores in the near future and to provide elevated TDPs that minimise the constriction of Threadripper 3000 boost clocks. You also get the improved PCIe Gen 4 8-lane pipe to the feature-heavy TRX40 chipset, and I would expect the backwards compatibility headaches seen with Ryzen 3000 on AM4 to be avoided by launching new motherboards for Threadripper.
Yes, it’s highly annoying that buyers will need to lay out more cash on an expensive TRX40 motherboard, especially when Intel’s X299 HEDT approach is a stark contrast. But I feel that the aforementioned reasons are worthy of the compromise to one’s wallet. This isn’t a change of platform for the sake of it – you’re getting more performance and functionality from a new motherboard and the new platform.
Threadripper and the TRX40 platform are expensive and AMD knows it. AMD is offering extreme compute performance via ludicrously high core count parts that Intel’s HEDT platform cannot match, so AMD prices the platform accordingly. TRX40 with Threadripper 3000 is clearly the most premium HEDT/workstation platform. It’s not cheap, but if you need or can justify the compute performance, memory support, and superb IO capability, Intel’s HEDT competition cannot really compete even with more aggressive pricing for X299.
If time really is money for your demanding, heavily multi-threaded workloads, AMD’s new Threadripper 3000 processors are the best options on the market.
The MSRP for the 24-core Ryzen Threadripper 3960X is set at $1399 USD, which AMD has confirmed should be £1349 in the UK. The MSRP for the 32-core Ryzen Threadripper 3970X is set at $1999 USD, which AMD has confirmed should be £1899 in the UK.
Pros
- Extreme multi-threaded performance.
- High single-core performance.
- Lofty all-core frequencies out of the box, thanks to Precision Boost 2.
- Plentiful PCIe Gen 4 connectivity.
- High-bandwidth CPU-to-TRX40 chipset link.
- Reasonable power consumption and operating temperatures.
- Strong power efficiency.
- A whole new tier of HEDT/workstation performance to users who require it.
Cons
- Expensive, even in light of the performance on offer.
- Some people will be disappointed by no TR4/X399 backwards-compatibility.
KitGuru says: AMD’s Ryzen Threadripper 3970X and 3960X are clear market leaders and are basically unmatched in terms of heavy multi-threaded compute performance. If you have demanding multi-threaded workloads where higher performance will bring you better outcomes, whether that’s more free time or increased profits, AMD’s Ryzen Threadripper 3000 CPUs are exactly where your purchasing radar should be focussed.