AMD was out of the DX11 gate more than six months before nVidia, and it has now shipped and sold more than 25 million DirectX 11 graphics solutions. Being first to DX11 is one selling point; Eyefinity is another key focal point, as the cards on test today can support between 3 and 6 displays via a single card. Later on we will test some current games over a 3 screen setup. We’d love to have had a single card head-to-head across multiple screens, but nVidia are still trying to catch AMD in this regard and can’t yet support more than a pair of monitors on a single card. We expect nVidia to rectify this in its next generation design, the GTX600 series.
KitGuru has seen documentation that suggests the Radeon HD 6950 will come in around the £220 mark and the full-blown Radeon HD 6970 will be around £280 inc VAT at today’s rate.
That sets up a straight match-up between the GTX570 and the Radeon HD 6970 (more on that in a future review). At the same time, the Radeon HD 6950 will cost slightly more than an overclocked GTX460.
There is a way to line the GTX580 up against the Radeon HD 6970 – and that’s in a multi card configuration. Buying a pair of GTX580 cards will cost you about the same as three Radeon HD 6970 cards. So if you have around £840 to spend on graphics, both companies have a solution for you.
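For readers who want to check the sums, here is a minimal sketch of the multi card arithmetic. It uses the two Radeon prices quoted above; the per-card GTX580 figure is not a quoted price, only what the "pair costs about the same as three" comparison implies, and the function name is our own.

```python
# Prices quoted in the article (GBP, inc VAT).
HD6970_PRICE = 280
HD6950_PRICE = 220


def multi_card_cost(unit_price, cards):
    """Total cost of buying `cards` identical graphics cards."""
    return unit_price * cards


# Three Radeon HD 6970 cards.
triple_6970 = multi_card_cost(HD6970_PRICE, 3)

# Implied (not quoted) GTX580 price, assuming a pair costs the same
# as three HD 6970 cards.
implied_gtx580 = triple_6970 / 2

print(f"3 x HD 6970: £{triple_6970}")
print(f"Implied GTX580 price per card: £{implied_gtx580:.0f}")
```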
Now that we have the pricing and market positioning out of the way, let’s have a look at the architecture.
When the HD6900 series was in development, AMD’s goal was to enhance compute architecture and geometry performance while maintaining high levels of power efficiency. AMD have moved to a refined VLIW4 architecture with a dual graphics engine design and asynchronous dispatch.
This new configuration allows for up to 24 SIMD engines and 96 texture units with rendering improvements, including enhanced anti aliasing performance. This is all aided by a 256 bit GDDR5 memory interface.
All stream processing units now have an equal share as the T unit design has been made redundant. Special functions (transcendentals) now occupy 3 of the 4 issue slots instead of a dedicated unit. Greater utilisation than the previous VLIW5 design offers a 10% improvement in performance per mm², along with simplified scheduling and register management.
The upgraded rendering back end allows for coalescing of write ops, and the 16 bit integer (unorm/snorm) ops are twice as fast as before. The 32 bit FP (single/double component) ops are between twice and four times as fast.
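To put numbers on the VLIW4 layout, the sketch below works out unit counts from the figures above. The 24 SIMD engines, 96 texture units and 4 issue slots come from AMD's description; the 16 VLIW4 units per SIMD engine and the 4 texture units per SIMD engine are our assumptions (the latter is simply 96 divided by 24), and the function names are hypothetical.

```python
SIMD_ENGINES = 24        # from the article
SLOTS_PER_UNIT = 4       # VLIW4: four equal stream processing units
UNITS_PER_SIMD = 16      # assumption, not stated in the article
TEX_PER_SIMD = 4         # assumption: 96 texture units / 24 SIMD engines


def stream_processors(simds, units_per_simd, slots):
    """Total stream processing units across all SIMD engines."""
    return simds * units_per_simd * slots


def free_slots_during_transcendental(slots=SLOTS_PER_UNIT, used=3):
    """Issue slots left over while a special function occupies `used` slots."""
    return slots - used


total_sp = stream_processors(SIMD_ENGINES, UNITS_PER_SIMD, SLOTS_PER_UNIT)
total_tex = SIMD_ENGINES * TEX_PER_SIMD

print("Stream processors:", total_sp)
print("Texture units:", total_tex)
print("Slots free during a transcendental:", free_slots_during_transcendental())
```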
The new asynchronous dispatch architecture design means that multiple compute kernels can be executed simultaneously. Each of these kernels has its own command queue and protected virtual address domain. DMA engines are dual bidirectional for faster system memory reads and writes. Flow control has been further improved with this design and there are now faster double precision operations.
The 6900 platform also brings AMD’s 8th generation tessellator to the fore – this is a dual rate geometry configuration with off chip geometry buffering.
The dual graphics engines can now process two primitives per clock with tile based load balancing and a 2x transform and backface cull rate. With dual rasterisers and up to 32 pixels per clock, this new tessellation performance is up to three times better than the HD5870 – the last class leader.
AMD are also offering new Anti Aliasing (EQAA) modes with their newest driver. These are new MSAA modes with up to 16 coverage samples per pixel. Additionally, the number of colour and coverage samples can be independently controlled, offering better quality with the same memory footprint. They are compatible with Adaptive AA, Super Sample AA and Morphological AA.
Morphological Anti Aliasing is a post process filtering technique which is accelerated by DirectCompute. It delivers full scene anti aliasing and is not limited to polygon edges or alpha tested surfaces. It brings speed benefits when compared against super sampling. The performance is similar to edge detect CFAA but it applies to all edges. The best thing is that it is compatible with any DirectX 9/10/11 application, including games with no AA support, and is simply enabled in Catalyst Control Center.
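As a rough illustration of why decoupling colour and coverage samples keeps the memory cost down, here is a toy model. It is not AMD's actual storage layout: we assume each colour sample costs 64 bits (32 bit colour plus 32 bit depth) and each coverage sample only stores a small index pointing at one of the colour samples. All names and sizes are assumptions.

```python
def index_bits(colour_samples):
    """Bits needed for a coverage sample to reference one colour sample."""
    return max(1, (colour_samples - 1).bit_length())


def footprint_bits(colour_samples, coverage_samples, bits_per_colour_sample=64):
    """Toy per-pixel footprint: full colour samples plus small coverage indices."""
    colour_cost = colour_samples * bits_per_colour_sample
    coverage_cost = coverage_samples * index_bits(colour_samples)
    return colour_cost + coverage_cost


msaa_4x = footprint_bits(4, 4)   # 4 colour, 4 coverage
eqaa_4_8 = footprint_bits(4, 8)  # 4 colour, 8 coverage
msaa_8x = footprint_bits(8, 8)   # 8 colour, 8 coverage

print("4 colour / 4 coverage:", msaa_4x, "bits per pixel")
print("4 colour / 8 coverage:", eqaa_4_8, "bits per pixel")
print("8 colour / 8 coverage:", msaa_8x, "bits per pixel")
```

Under these assumptions, doubling the coverage samples adds only a few bits per pixel, whereas doubling the colour samples roughly doubles the footprint.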
AMD have a sample of the modes in action (see above) and while it perhaps isn’t the best example, it gives a rough approximation of the edge improvements.
AMD have also introduced new PowerTune technology, which is able to cap the GPU’s power draw at a predetermined TDP level. An integrated control processor monitors GPU activity in real time and dynamically adjusts the clock to enforce the TDP.
This means that the user has a direct control setting over the GPU power draw, rather than just indirect clock/voltage tweaks. AMD state that their system provides an algorithmic approach to help ensure consistent performance across the product range.
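The control loop described above can be sketched in a few lines. This is only our illustration of the idea, not AMD's algorithm: the linear power model, the clock limits, the step size and every name below are assumptions chosen to keep the example simple.

```python
def estimate_power(activity, clock_mhz, watts_per_mhz=0.3):
    """Toy power model (assumption): power scales with activity and clock."""
    return activity * clock_mhz * watts_per_mhz


def powertune_step(clock_mhz, activity, tdp_watts,
                   max_clock=880, min_clock=500, step=10):
    """One control step: drop the clock if over the cap, else recover it."""
    if estimate_power(activity, clock_mhz) > tdp_watts:
        return max(min_clock, clock_mhz - step)
    return min(max_clock, clock_mhz + step)


def run(activity_trace, tdp_watts, start_clock=880):
    """Apply the control step to a sequence of activity samples (0.0 to 1.0)."""
    clock = start_clock
    history = []
    for activity in activity_trace:
        clock = powertune_step(clock, activity, tdp_watts)
        history.append(clock)
    return history


# A typical load stays at full clock; a sustained worst-case load is throttled.
print(run([0.7, 0.7, 0.7], tdp_watts=250))
print(run([1.0, 1.0, 1.0], tdp_watts=250))
```

Lowering `tdp_watts` in this sketch plays the role of the user facing power limit: the same workload settles at a lower clock.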
The settings are available in the AMD Overdrive tab in CCC and they tie in with overclocking tools, allowing users to increase board power limits and also (just as importantly) to decrease limits for improved power draw and thermals in applications that demand high performance.