Over the years, there have been many famous ‘app detection’ revelations. Probably the most famous was revealed by Beyond3D when genius uber-geeks discovered that changing the name of the 3DMark 2003 executable file created a huge difference for the performance of nVidia cards. Looking at recent documents posted into the KitGuru forum, we have another cause for investigation on our hands. If it’s a lie, then it is a very clever one, and we will work hard to find out who perpetrated it. If it’s the truth, then it’s a very worrying development. KitGuru powers up a robotic Watson and takes it for a walk on the Image Quality moors to see if we can uncover any truth implicating a modern-day Moriarty (of either persuasion).
Quick backgrounder on image quality, so we all know what we’re talking about.
Journalists benchmark graphic cards so you, the public, can make a buying decision. So far, so simple.
The first rule is to use the right tests. There is no absolute right and wrong, but if you remember how far Bob Beamon jumped in the Mexico Olympics in 1968, you will see that his record lasted for 23 years simply because he set the record at high altitude. That’s a case where someone ended up with an advantage, without any malice whatsoever. All jumpers were at the same altitude and the jump was amazing anyway.
When Ben Johnson took gold for Canada at the 1988 Olympics in Seoul, he ran straight past favourite Carl Lewis and appeared to take the world record for 100 metres with a stunning 9.79 seconds. He had used steroids. No one else was on drugs. The gold medal and world record were stripped.
So, two key things to look for with benchmarking are:-
- Is the test itself fair for everyone ?
For example, does the test use a lot of calls/functions etc. that only one card has – and which are not found in the majority of games that a consumer is likely to play? Every year, 3DMark is late because one GPU vendor or the other is late, and Futuremark don’t want to go live unless they have given both sides a fair shout to deliver next-gen hardware.
- Is everyone doing the same amount of work ?
Are short-cuts being taken which mean that the test will automatically score higher on one card than another? If an improvement works in all games/apps, then that is obviously a benefit. However, detecting an application and identifying a way to increase your scores by dropping image quality (in such a way that gamers can see it) is downright sneaky. You could buy a card thinking it’s faster than it really is, because its driver has been tweaked for a benchmark.
The arguments around these two questions (call them ‘1’ and ‘2’) focus on things like ‘If nVidia has more tessellation capability than AMD, then what level of tessellation should be tested ?’ and ‘If you want the result for a light calculation, where one side stores all possible outcomes in a table and looks them up, while another GPU calculates the value as it goes – then which is correct ?’. The second was uncovered a few years back in Quake, where it became clear that nVidia’s lookup function was faster than its calculation ability – so both companies took different paths to the same result. The tessellation argument will probably be solved by Summer 2011 when a lot more DX11 games are out and using that function. We’ll then know what constitutes a ‘reasonable load’ in a game.
Anti-aliasing (AA) is a weird and wonderful technique that uses calculated blurring to make images seem smoother. Totally counter-intuitive, but it works a treat. Why counter-intuitive? Because, in the real world of limitless resolution/detail, when we want to draw the best line possible, we normally try to avoid blurring at all.
Proper AA can also use a ton of memory. At the simplest level, instead of looking at one dot and saying ‘is it black or is it white’, you sample dots in a region and set levels of grey. While at the lowest level, it can seem messy, the overall effect is that the lines themselves appear smoother. Which is a good thing. There are different ways to do the sampling. There are also different levels. For most modern gaming benchmarks, people tend to set good quality AA to 4x.
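To make the ‘sample dots in a region and set levels of grey’ idea concrete, here is a minimal Python sketch. It is our own illustration, not anything taken from a driver: it uses plain supersampling on a regular grid (real hardware 4x AA is usually a cleverer scheme), and the slanted edge y = 0.4x, the image size and the function names are all assumptions picked for the example.

```python
def coverage(px, py, samples_per_axis):
    """Fraction of sample points inside pixel (px, py) that fall below the edge y = 0.4 * x."""
    n = samples_per_axis
    hits = 0
    for i in range(n):
        for j in range(n):
            # Spread the sample positions evenly inside the pixel.
            x = px + (i + 0.5) / n
            y = py + (j + 0.5) / n
            if y < 0.4 * x:
                hits += 1
    return hits / (n * n)


def render(width, height, samples_per_axis):
    """Render the edge as rows of grey levels between 0.0 (background) and 1.0 (covered)."""
    return [[coverage(x, y, samples_per_axis) for x in range(width)]
            for y in range(height)]


def grey_levels(image):
    """Sorted list of the distinct grey levels used in the image."""
    return sorted({value for row in image for value in row})


no_aa = render(8, 4, 1)  # one sample per pixel: every dot is either 'black' or 'white'
aa_4x = render(8, 4, 2)  # 2 x 2 = four samples per pixel: in-between greys appear

print(grey_levels(no_aa))
print(grey_levels(aa_4x))
```

With one sample per pixel the edge can only be drawn in two levels, which is where the staircase comes from. With four samples, pixels along the edge pick up intermediate greys, and that is also why the extra samples cost memory and time.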
Telling the difference
It’s harder to see in a still image, but when a game is running, poorer quality AA will make it seem that the edges of things are ‘crawling’. It can have a negative effect on your ability to play. For example, if you’re running past trees, looking for enemies to shoot, then your eyes are really sensitive to small movements. When a branch twitches, is it an enemy preparing to shoot or is your graphic card failing to deliver decent AA ?
In a still, it is easiest to tell when you create a simple animation of one image on top of the other and just swap between them (see below).
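Flipping between two stills relies on your eyes. If you want a number to go with it, a rough sketch like the one below counts how many pixels differ between two captures and by how much on average. This is a different technique from the flip-book comparison, added here only as an illustration: the frames are tiny made-up greyscale grids (0 to 255), and the function name and threshold parameter are our own assumptions, not part of any tool used for the posted images.

```python
def compare_frames(frame_a, frame_b, threshold=0):
    """Return (changed_pixel_count, mean_absolute_difference) for two equal-sized greyscale frames."""
    if len(frame_a) != len(frame_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(frame_a, frame_b)
    ):
        raise ValueError("frames must have the same dimensions")

    changed = 0
    total_diff = 0
    pixel_count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            diff = abs(a - b)
            total_diff += diff
            pixel_count += 1
            # Only count a pixel as changed if it differs by more than the threshold.
            if diff > threshold:
                changed += 1

    mean_diff = total_diff / pixel_count if pixel_count else 0.0
    return changed, mean_diff


# Made-up 3 x 2 frames: the second has softer greys along the edge.
frame_a = [[0, 0, 255], [0, 255, 255]]
frame_b = [[0, 64, 255], [0, 192, 255]]

changed, mean_diff = compare_frames(frame_a, frame_b)
print(changed, mean_diff)
```

A non-zero result only tells you the two captures differ. It cannot tell you which one is ‘right’, and it assumes both shots were taken at exactly the same moment and camera position.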
‘Evidence’ posted on the KitGuru forum
High end cards are often tested using 30″ panels with a 2560×1600 resolution. We have had a lot of ‘documents’ posted recently that appear to show a difference in AA quality on second generation nVidia cards like the GTX570 and GTX580. Lowering the image quality when running AA can give a card a boost. The stuff we’ve seen seems to show a boost of 8%. That’s significant. But is it real? We’re going to go deep into the validity of these claims and we invite nVidia to confirm categorically that press drivers for GTX570 (launching next week) will NOT include a lowering of AA quality to get a boost in games like HAWX.
Following the launch of the awesome GTX460, nVidia’s Fermi series really seems to have come into its own and it would be a real shame if something strange was being done in the driver. We’re firmly camped in the ‘let us hope not’ area. Plus, we’re happy to test this exhaustively and report the truth.
Is there a difference and is the difference real?
OK, so enough preamble, what is it we’re really talking about? Below is a simple animated GIF that (according to the posts we have received) seems to show that the anti-aliasing image quality (IQ) drops off for the GTX570/580 if the driver detects that HAWX (commonly used for benchmarking) is running. Looking along the plane’s edge, one of the images definitely appears to show more detail. How was this achieved? By renaming the application, we’re being told. It could all be an elaborate ruse, but if benchmarks are being detected to allow image quality to be dropped and benchmark scores to be raised, then that’s pretty serious stuff. The animated GIF will take a few seconds to load. We recommend that you let it roll through a few times and you’ll see that more detailed sampling seems to be done when the same program is called HACKS than when it is called HAWX.
If these shots tell the whole truth and you had to write this ‘logic’ into a sentence (that anyone could understand), it would say “If you’re being asked to run a game that’s commonly used as a benchmark, then do less work”. Have a look and tell us if you can see less sampling when the drivers detect HAWX and not HACKS.
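For the programmers among you, that sentence of ‘logic’ can be written as a toy Python sketch. To be clear, this is not driver code and we are not claiming any real driver works this way: the profile table, the file names and the sample counts are all invented for illustration. It simply shows why a rule keyed on the executable name would stop applying the moment the file is renamed.

```python
# Hypothetical profile table: executable name -> AA sample count actually applied.
# Every name and number here is invented for illustration only.
HYPOTHETICAL_PROFILES = {"hawx.exe": 2}


def effective_aa_samples(executable_path, requested_samples):
    """Return the AA sample count a name-keyed profile would apply to this executable."""
    # Keep only the file name, accepting both slash styles, and ignore case.
    name = executable_path.replace("\\", "/").split("/")[-1].lower()
    # Fall back to what the user asked for when no profile matches.
    return HYPOTHETICAL_PROFILES.get(name, requested_samples)


as_shipped = effective_aa_samples("C:\\Games\\HAWX.exe", 4)
renamed = effective_aa_samples("C:\\Games\\HACKS.exe", 4)
print(as_shipped, renamed)
```

In this sketch the same program asks for 4x in both cases, but only the renamed copy gets it. Whether anything like this is really going on is exactly what needs testing.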
KitGuru says: This is a very dramatic story. It is something that we have not seen for quite some time. If it’s a hoax, then we’re going to make sure everyone knows that this is a forgery. If it’s real, then it raises serious questions. Either way, we will know shortly. In the meantime, we invite Nick Stam or someone from the GeForce driver team to contact us with an explanation and we will make updates/evolve the story accordingly. Let’s all search for the truth. We’ve heard that it’s out there.
TRUTH UPDATE: Please click here for a reply from nVidia on this subject