NVIDIA To Bring DXR Ray Tracing Support to GeForce 10 & 16 Series In April
This week, both GDC (the Game Developers’ Conference) and GTC (the GPU Technology Conference) are happening in California, and NVIDIA is out in force. One of today's announcements concerns support for NVIDIA’s newest technologies on older graphics cards.
When Microsoft first announced the DirectX Raytracing API just over a year ago, they set out a plan for essentially two tiers of hardware. The forward-looking plan (and long-term goal) was to get manufacturers to implement hardware features to accelerate raytracing. However, in the shorter term, and owing to the fact that the API doesn't say how ray tracing should be implemented, DXR would also allow for software (compute shader) based solutions for GPUs that lack complete hardware acceleration. In fact, Microsoft had been using their own internally-developed fallback layer to develop the API and test against it, but past that they left it up to hardware vendors to decide whether to develop their own mechanisms for supporting DXR on pre-raytracing GPUs.
NVIDIA for their part has decided to go ahead with this, announcing that they will support DXR on many of their GeForce 10 (Pascal) and GeForce 16 (Turing GTX) series video cards. Specifically, support covers the GeForce GTX 1060 6GB and higher, as well as the new GTX 1660 series video cards.
Now, as you might expect, raytracing performance on these cards is going to be much (much) slower than it is on NVIDIA's RTX series cards, all of which have hardware raytracing support via NVIDIA's RTX backend. NVIDIA's official numbers are that the RTX cards are 2-3x faster than the GTX cards, though this is going to be workload-dependent. Ultimately it's the game and the settings used that will determine just how much of an additional workload raytracing will place on a GTX card.
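To see why the gap is workload-dependent, a rough back-of-the-envelope model helps. The sketch below assumes a frame splits into a non-raytracing part that costs the same on both cards and a raytracing part that is slower on the GTX card by a fixed factor. The function name, its parameters, and every number in it are hypothetical and chosen purely for illustration; none of them are NVIDIA's measurements.

```python
def frame_time_ratio(raster_ms, rt_ms_on_rtx, rt_slowdown_on_gtx):
    """Ratio of GTX frame time to RTX frame time under a simple additive model.

    Illustrative assumptions only: the non-raytracing part of the frame costs
    the same on both cards, and only the raytracing part runs slower on GTX.
    """
    rtx_frame = raster_ms + rt_ms_on_rtx
    gtx_frame = raster_ms + rt_ms_on_rtx * rt_slowdown_on_gtx
    return gtx_frame / rtx_frame


# Hypothetical "cheap effect" frame: 10 ms of other work, 2 ms of raytracing on RTX.
light = frame_time_ratio(raster_ms=10.0, rt_ms_on_rtx=2.0, rt_slowdown_on_gtx=5.0)

# Hypothetical "expensive effect" frame: same other work, 6 ms of raytracing on RTX.
heavy = frame_time_ratio(raster_ms=10.0, rt_ms_on_rtx=6.0, rt_slowdown_on_gtx=5.0)

print(f"light raytracing load: GTX frame takes {light:.2f}x as long")
print(f"heavy raytracing load: GTX frame takes {heavy:.2f}x as long")
```

In this toy model the same per-ray slowdown produces a much larger overall gap once raytracing takes up more of the frame, which is the sense in which the game and its settings decide the outcome.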
The inclusion of DXR support on these cards is functional – that is to say, its inclusion isn't merely for baseline compatibility, à la FP64 support on these same parts – but it's very much in a lower league in terms of performance. And given just how performance-intensive raytracing is on RTX cards, it remains to be seen just how useful the feature will be on cards lacking the RTX hardware. Scaling down image quality will help to stabilize performance, for example, but at that point will the image quality gains be worth it?
Under the hood, NVIDIA is implementing support for DXR via compute shaders run on the CUDA cores. In this area the recent GeForce GTX 16 series cards, which are based on the Turing architecture sans RTX hardware, have a small leg up. Turing includes separate INT32 cores (rather than tying them to the FP32 cores), so like other compute shader workloads on these cards, it's possible to pick up some performance by simultaneously executing FP32 and INT32 instructions. It won't make up for the lack of RTX hardware, but it at least gives the recent cards an extra push. Otherwise, Pascal cards will be the slowest in this respect, as their compute shader-based path has the highest overhead of all of these solutions.
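The benefit of separate INT32 cores can be illustrated with an idealized timing model. The sketch below is not how the hardware actually schedules work; it simply assumes that with shared cores the integer and floating-point instructions are serialized, while with separate cores the two streams overlap perfectly. The function name and the instruction mix are hypothetical.

```python
def shader_time(fp32_ops, int32_ops, concurrent_int32):
    """Relative execution time for a mix of FP32 and INT32 instructions.

    Idealized model: real GPUs also stall on memory, dependencies, and so on.
    """
    if concurrent_int32:
        # Turing-style: separate INT32 units run alongside the FP32 units,
        # so the longer of the two instruction streams bounds the time.
        return max(fp32_ops, int32_ops)
    # Pascal-style: integer work shares the FP32 cores,
    # so the two instruction streams are serialized.
    return fp32_ops + int32_ops


# Hypothetical instruction mix for a compute shader.
fp32_ops, int32_ops = 100, 36

pascal_like = shader_time(fp32_ops, int32_ops, concurrent_int32=False)
turing_like = shader_time(fp32_ops, int32_ops, concurrent_int32=True)
speedup = pascal_like / turing_like

print(f"serialized: {pascal_like}, concurrent: {turing_like}, speedup: {speedup:.2f}x")
```

Even in this best case the gain is bounded by how much integer work the shader contains, which fits the description of concurrent execution as an extra push rather than a substitute for dedicated raytracing hardware.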
One interesting side effect is that because DXR support is handled at the driver level, the addition of DXR support is supposed to be transparent to current DXR-enabled games. That means developers won't need to issue updates to get DXR working on these GTX cards. However it goes without saying that because of the performance differences, they likely will want to anyhow, if only to provide settings suitable for video cards lacking raytracing hardware.
NVIDIA's own guidance is that GTX cards should expect to run low-quality effects. Users and developers will also want to avoid the most costly effects, such as global illumination, and stick to "cheaper" effects like material-specific reflections.
Diving into the numbers a bit more, in one example NVIDIA showed a representative frame of Metro Exodus using DXR for global illumination. The top bar shows a Pascal GPU, with only FP32 compute, having a long render time in the middle for the effects. The middle bar, showing an RTX 2080 (though it could equally be a GTX 1660 Ti), shows FP32 and INT32 compute working together during the RT portion of the workload and speeding the process up. The final bar shows the effect of adding RT cores to the mix, and tensor cores at the end.
Ultimately the release of DXR support for NVIDIA's non-RTX cards shouldn't be taken as too surprising – on top of the fact that the API was initially demoed on pre-RTX Volta hardware, NVIDIA was strongly hinting as early as last year that this would happen. So by and large this is NVIDIA fulfilling earlier goals. Still, it will be interesting to see whether DXR actually sees any use on the company's GTX cards, or if the overall performance is simply too slow to bother. The performance hit in current games certainly favors the latter outcome, but it's going to be developers that make or break it in the long run.