The Evolution of 3D Graphics in Video Games (The Hunt for Photorealism)
A colleague shared this story about his first time playing a game on the PS5 – he had downloaded God of War and had just started to play the initial sequence when his father, who happened to be watching, exclaimed, “Why do they make movies any more when games look so realistic?”
God of War (2018) looks stunning, but it is not exactly photorealistic. However, the reaction of our colleague’s father suggests that today’s major games could, perhaps, pass for live action to an untrained eye – clearly, games have come a very long way from the first 3D titles of the late 1990s.
In this blog we will delve into the major milestones in the development of 3D game graphics, which have evolved from very basic and primitive effects to the near-photorealistic visuals we enjoy today. We will discuss the advent of true 3D games and then trace the evolution of how 3D games are rendered – or how developers kept pushing the envelope to make games look increasingly realistic, immersive and believable.
Before the advent of true 3D, games such as Doom (1993) and its numerous clones had faked the illusion of 3D using 2D game objects. The enormous success and technical innovations of Doom soon led to full 3D games – developers were keenly aware that gamers wanted true 3D titles and worked ceaselessly to create such experiences by designing and using new game engines.
The Advent and Rise of 3D Games
3D gaming rose to prominence not only because developers were striving to create 3D-capable engines – hardware manufacturers were coming up with the first true graphics cards and the developers of games for the Nintendo 64 (1996) and the Sony PlayStation (1994) were trying to make true 3D games that would reach mainstream audiences. All these factors conspired to make 3D gaming prominent by the late 1990s.
Hardware Acceleration Reaches Consumers
Hardware acceleration is a process by which certain workloads are offloaded to specialised hardware capable of parallel processing, which can execute these demanding tasks more efficiently than a software application running on the CPU.
Early graphics cards were designed to support hardware acceleration for video game rendering and one of the first successful cards of this type was the Voodoo 1, made by the company 3dfx and launched in late 1996. By the end of 1997 it was the most popular card among developers and consumers, though 3dfx soon declined with the ascent of Nvidia, which would buy 3dfx Interactive in 2000.
Tech Pioneers Start Making True 3D Games
The first 3D games were the result of unceasing innovation by a handful of brilliant programmers at id Software and Epic Games. At id, John Carmack spearheaded the creation of the Quake engine in 1996, which featured real-time 3D rendering and support for 3D hardware acceleration. The engine used static light maps for stationary objects and the environment, while moving bodies such as the player character and enemies were lit dynamically. Tim Sweeney of Epic Games introduced 3D graphic effects way ahead of their time with his Unreal Engine, which used clever tricks to simulate soft shadows, volumetric fog, dynamic lighting and more.
5th-Gen Consoles and Mainstream 3D Gaming
Advancements were not just restricted to PC hardware and games – consoles also gave a major push to the emergence of games with 3D graphics. The Nintendo 64’s hardware architecture powered true 3D games such as Super Mario 64 (1996) and The Legend of Zelda: Ocarina of Time (1998), and the PlayStation also had great-looking 3D games such as Gran Turismo (1997), a racing game that uses full 3D environments. Like the Nintendo 64, the PlayStation used custom hardware to make 3D graphics possible, and the enormous success of the PlayStation, the first console to sell more than 100 million units, propelled 3D games into the mainstream.
By the late 1990s, both PC hardware and consoles were capable of supporting 3D games and there had been a decisive shift toward 3D gaming. The next challenge was to make such games look as realistic as possible. Both Carmack and Sweeney had experimented with various rendering techniques to make their blocky 3D games look more realistic, but these titles are a far cry from what we see today. Since the late 1990s, developers have continued to push 3D game rendering toward photorealism, and this endeavour continues to this day. This blog is hence a history of advancements in 3D rendering – it ends with real-time ray-tracing, but only time will tell what new avenues game developers will explore.
How Game Graphics Evolved Towards Realism
When developers strive for photorealism, they use every tool at their disposal to achieve it. In the following sections, we discuss the key innovations in game graphics, and how each of them dramatically increased the realism of 3D game rendering.
Normal Mapping – Detailing Optimised Game Models
Every 3D model (or mesh) in a game is composed of triangles, and will generally have gone through several iterations of optimisation to reduce its ‘polycount’, i.e., the number of triangles (or polygons) it has. During the design stage, however, high-poly models, containing lots of details, are created using 3D design tools such as 3ds Max, Maya, ZBrush and others. Such high-res models can contain more than a million polygons and simply cannot be deployed in-game – the renderer would choke – but their details are required to make a scene believable. This is where normal mapping comes in – through a process known as baking, the detail of a high-res model is transferred to a ‘map’ (short for bitmap), or texture, which the game engine can use to give an optimised model the illusion of detail. Such normal maps can also convincingly mimic how the high-polygon model would respond to lighting, furthering the illusion that you are seeing a detailed in-game object, and not an optimised mesh linked to a normal map. The key benefit of a normal map is not just that it creates the impression of detail, but also that it does so on highly-optimised, game-ready geometry.
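To make the lighting trick a little more concrete, here is a minimal Python sketch of how a renderer might read one pixel (texel) of a normal map and use it when lighting a surface. It is an illustration under stated assumptions, not how any particular engine does it: the texel colours, the light direction and the helper names (`decode_normal`, `lambert`) are all hypothetical, and everything is kept in the texture's own coordinate space for simplicity.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def decode_normal(rgb):
    """Map an 8-bit RGB texel (0..255) to a unit normal with components in [-1, 1]."""
    return normalize(tuple(c / 255.0 * 2.0 - 1.0 for c in rgb))

def lambert(normal, light_dir):
    """Diffuse brightness: cosine of the angle between normal and light, clamped at 0."""
    n = normalize(normal)
    l = normalize(light_dir)
    return max(0.0, sum(a * b for a, b in zip(n, l)))

# Assumption: the light shines straight along the surface's +Z axis.
light_dir = (0.0, 0.0, 1.0)

flat_texel = (128, 128, 255)    # the familiar pale blue of a flat area in a normal map
tilted_texel = (200, 128, 220)  # hypothetical texel baked from a sloped detail

flat = lambert(decode_normal(flat_texel), light_dir)
tilted = lambert(decode_normal(tilted_texel), light_dir)
print(f"flat: {flat:.3f}, tilted: {tilted:.3f}")
```

Both texels sit on the same flat, low-poly triangle, yet the tilted one comes out darker because its stored normal leans away from the light. That per-pixel variation is what gives an optimised mesh the appearance of sculpted detail.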
Nvidia’s GeForce 3 was the first card to support textures such as normal maps and specular maps – the former gave models a detailed appearance, and the latter controlled how shiny or glossy the model would look. A custom-made version of this card was used in Microsoft’s first Xbox, which used normal mapping extensively. The PS2 didn’t include support for such texture-mapping, but by the seventh generation of consoles, normal mapping was the norm across PCs and consoles.
Implementing normal mapping and other textures in games was a major breakthrough, and normal maps are used to this day in games and other graphics pipelines. Today’s graphics processing units (GPUs) are capable of rendering far more polygons, but this capability is used in conjunction with normal mapping to make ultra-realistic scene assets.
The Transition to HD Gaming
HD-TV is an advancement in digital display technology that became available to consumers in the early 2000s and became widespread within a few years. The resolution of digital TV sets and monitors is measured by the number of pixels on the display, and in the early years of HD-TV, display resolutions ranged from 720p (921,600 pixels) to 1080p (over 2 million pixels).
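The pixel counts quoted above are simply width multiplied by height. The short Python sketch below reproduces them, assuming the standard 16:9 frame sizes of 1280×720 and 1920×1080 (the article gives only the totals).

```python
# Standard 16:9 frame sizes for the two HD resolutions (widths are assumed,
# since the text only quotes total pixel counts).
resolutions = {"720p": (1280, 720), "1080p": (1920, 1080)}

pixel_counts = {name: w * h for name, (w, h) in resolutions.items()}
for name, count in pixel_counts.items():
    print(f"{name}: {count:,} pixels")

# How much more work a 1080p frame is, measured purely in pixels.
ratio = pixel_counts["1080p"] / pixel_counts["720p"]
print(f"1080p has {ratio:.2f}x the pixels of 720p")
```

The ratio is a useful reminder that stepping up a resolution tier more than doubles the number of pixels the hardware has to fill every frame.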
Seventh-generation consoles such as Sony’s PS3 (2006) and Microsoft’s Xbox 360 (2005) supported both HD gaming and HD video playback. In 2005, Microsoft exec J Allard touted the Xbox 360 as the console that would usher in a new era of HD gaming, even though gaming in high resolutions had been possible on PCs for many years before these consoles, thanks to the power of dedicated PC graphics cards.
However, it was the consoles, which could support gaming and double as home entertainment systems, that made HD gaming mainstream. Sony remastered many of its PS2 games to run on HD screens with the PS3. Many of these remasters, which now played at higher resolutions, had much sharper image quality and better-looking character models. Higher resolutions also decrease aliasing – the jagged edges that appear on rendered game models. This type of visual artefact can be quite distracting, and an HD display can help make for a more immersive experience by minimising aliasing artefacts. HD resolutions do not automatically imply photorealistic renders, but they can help bring out the detail in renders by achieving high image quality.
Advancements in Graphics Shaders
A shader is essentially a piece of code that runs on the GPU and contains specific instructions on how to render a 3D scene in pixels, or how to manipulate a 2D image before it’s shown on-screen. Shaders can tell the renderer how a 3D object should be lit, how it should be coloured, what it reflects and much, much more. Early graphics cards had fixed-function rendering pipelines, limiting the sort of effects that could be applied while rendering a scene. But the advent of cards with programmable shaders – the first of which was the GeForce 3 – utterly transformed 3D rendering. In fact, the use of normal and specular maps to texture an object requires a programmable shader pipeline.
Within a few years, developers had written highly complex shaders. In 2007, a programmer working on Crytek’s CryEngine developed screen-space ambient occlusion, which darkens the creases, holes and dents of an object, and the areas where it is in contact with other objects, resulting in a more realistic scene that looks like it is responding to indirect or ‘ambient’ light. The shader was first used in Crysis (2007), a game now legendary for its demands on computer hardware. Crysis in fact contains more than 85,000 shaders, which all but melted the graphics hardware of the time – and contributed greatly to the realism of the game.
Crysis 2 (2011) used a screen-space reflection shader to render reflections on glossy or glass-like surfaces and objects. The SSR shader contributes a lot to a rendered scene, but has its limitations, and is used in conjunction with other techniques such as cube mapping (also implemented via a shader) to create realistic in-game reflections.
Deferred rendering was another major screen-space shading technique developed in the late 2000s. In essence, this technique allows a scene to be rendered with many more lights, by rendering the game geometry and the scene lighting in separate rendering passes. In traditional, ‘forward’ rendering, increasing the number of lights can rapidly increase rendering times but deferred rendering enables more lights and more realistic world lighting without impacting render times significantly. Games such as Dead Space (2008) and Killzone 2 (2009) were among the first to implement deferred rendering and it has now become an industry standard.
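The two-pass idea can be sketched in a few lines of Python. This is a toy model under loud assumptions, not production renderer code: the scene is a 2×2 'screen' covered by three overlapping layers of geometry, the lights are simple directional lights, the `Fragment` and `Light` types are hypothetical, and the forward renderer it is compared against is a naive one that lights every fragment it draws, including those later hidden behind nearer geometry.

```python
from dataclasses import dataclass

@dataclass
class Fragment:
    x: int
    y: int
    depth: float
    albedo: float   # greyscale surface colour, 0..1
    normal: tuple   # unit surface normal

@dataclass
class Light:
    direction: tuple  # unit vector pointing from the surface toward the light
    intensity: float

def shade(albedo, normal, light):
    """Simple diffuse lighting for one surface point and one light."""
    n_dot_l = sum(a * b for a, b in zip(normal, light.direction))
    return albedo * light.intensity * max(0.0, n_dot_l)

def geometry_pass(fragments, width, height):
    """Pass 1: keep only the nearest fragment per pixel; no lighting happens here."""
    gbuffer = [[None] * width for _ in range(height)]
    for f in fragments:
        stored = gbuffer[f.y][f.x]
        if stored is None or f.depth < stored.depth:
            gbuffer[f.y][f.x] = f
    return gbuffer

def lighting_pass(gbuffer, lights):
    """Pass 2: light each visible pixel once per light, using the stored attributes."""
    evaluations = 0
    image = []
    for row in gbuffer:
        out_row = []
        for texel in row:
            total = 0.0
            if texel is not None:
                for light in lights:
                    total += shade(texel.albedo, texel.normal, light)
                    evaluations += 1
            out_row.append(total)
        image.append(out_row)
    return image, evaluations

WIDTH, HEIGHT = 2, 2
up = (0.0, 0.0, 1.0)
# Three layers of geometry cover every pixel; only the nearest (depth 1.0) is visible.
fragments = [
    Fragment(x, y, depth, albedo, up)
    for depth, albedo in [(3.0, 0.2), (2.0, 0.5), (1.0, 0.8)]
    for y in range(HEIGHT)
    for x in range(WIDTH)
]
lights = [Light(up, 0.25) for _ in range(4)]

gbuffer = geometry_pass(fragments, WIDTH, HEIGHT)
image, deferred_evals = lighting_pass(gbuffer, lights)
forward_evals = len(fragments) * len(lights)  # naive forward: every fragment, every light
print(f"deferred: {deferred_evals} lighting evaluations, naive forward: {forward_evals}")
```

In this toy scene the lighting cost of the deferred path depends on the number of visible pixels rather than on how much geometry overlaps, which is the property that lets a scene carry many more lights.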
The shaders described above improve lighting, shadows and reflections and thereby help make a scene look more realistic and believable. However, shaders are capable of many more effects, such as anti-aliasing, and contrast-adaptive sharpening, which allows games to be run at lower resolutions by intelligently sharpening the upscaled render. Such shaders can even be injected into games through custom programs like ReShade, which boasts an ever-growing library of shaders, and supports a huge variety of games.
Physically-Based Rendering – a New Paradigm
The hunt for photorealism is punctuated with various paradigm shifts, and the advent of physically-based rendering (PBR) around the early 2010s is probably one of the most important. Graphics shaders had incrementally improved the look of games, striving for realism by improving how in-game scenes were rendered. Shaders could do everything from scene lighting to post-processing effects such as camera filters and depth of field. But these shaders were all working on models that were textured in what is now known as ‘traditional’ or ‘non-PBR’ workflows, where the diffuse and specular maps of game assets were painted by texture artists and did not generally reflect the real-world properties of such assets.
PBR is crucial to photorealistic renders, because the associated texture sets and shaders accurately model how light interacts with in-game objects. In real life, a shiny gold crown or copper bracelet will look gold or orange – not because these metals are in any way ‘pigmented’, but because they absorb certain wavelengths of light. PBR shaders model this accurately, giving a clear grey sheen to most metals, but giving coloured metals or alloys their characteristic tint based on what they absorb.
Even non-metallic objects are textured so that they will reflect a tiny bit of light, as they do in real life, and shiny non-metallics have a sheen based on their real-life properties. Reflective surfaces will accurately reflect in-game environments – physically-based textures even capture how reflective a body is based on the angle at which light hits it. PBR can hence improve the results from a screen-space reflection shader – if a smooth marble floor has a PBR-based texture, then it will be essentially opaque when you look directly down at it, but will reflect scene objects if you look at it from an angle.
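This angle-dependent reflectivity is known as the Fresnel effect, and many real-time shaders estimate it with Schlick's approximation. The Python sketch below is one way to express it; the base reflectance of 0.04 is a commonly used value for non-metals and is assumed here for the marble floor, not taken from any specific game.

```python
import math

def schlick_fresnel(cos_theta, f0):
    """Schlick's approximation of Fresnel reflectance.

    cos_theta: cosine of the angle between the view direction and the surface normal.
    f0: reflectance when looking straight down at the surface.
    """
    return f0 + (1.0 - f0) * (1.0 - cos_theta) ** 5

F0_DIELECTRIC = 0.04  # assumed base reflectance for a non-metal such as marble

for angle_deg in (0, 45, 80, 89):
    cos_theta = math.cos(math.radians(angle_deg))
    reflectance = schlick_fresnel(cos_theta, F0_DIELECTRIC)
    print(f"{angle_deg:2d} degrees from the normal: reflectance {reflectance:.3f}")
```

Looking straight down (0 degrees) the floor reflects only its small base amount, while at grazing angles the reflectance climbs toward 1, which matches the behaviour of the marble floor described above.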
In fact, texture artists can make use of detailed reference tables when they adopt the PBR workflow for the sake of physical accuracy – such charts provide base values for texturing various metals and non-metals. PBR texturing can make life easier for artists – earlier, they would expend a good deal of effort on making a golden crown look appropriately golden, and then add surface details like dirt, dull cavities, scratches and more – now, they can add such realistic details while letting the renderer take care of making the object look ‘golden’.
Remember Me, a 2013 title published by Capcom, is credited as the first game to use physically-based rendering. Many studios soon transitioned to PBR, and held in-depth workshops to help texture artists adopt the new texturing pipeline.
The Advent of High-Dynamic Range Game Content
Both graphics shaders and physically-based rendering can work in synergy to enhance a scene by improving its lighting, shadows and reflections. The advent of high-dynamic range (HDR) TV in the mid-2010s utterly transformed this process by allowing games to output scenes with a greater dynamic luminance range, a wider colour range (known as the gamut), and more colour tones within this gamut (using higher bit depth).
Skies and the sun could look tremendously bright. Shadows in dungeons could look very dark, and in horror games, these areas were that much scarier, because you could just make out a lurking shape in the shadow. Roses in a bouquet could each have a subtly different shade of red. The gradient of colours in the horizon during sunsets could look much smoother.
Standard dynamic range (SDR) displays have a maximum luminance value of around 100 nits. Expensive HDR displays can get as bright as 4000 nits. This means that both the brightest and darkest parts of the render are displayed without losing detail – on an SDR display, these parts would fall above or below the screen’s luminance range and would look either bright white or pitch black. SDR displays can only show 8-bit colour information, or 256 levels of luminance for each colour channel. HDR displays have a bit depth of 10 bits per channel, resulting in 1024 shades between the brightest white and the darkest black, and can support over a billion colour tones.
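The figures for 8-bit and 10-bit colour follow directly from powers of two. The Python sketch below works them out, assuming three colour channels (red, green and blue) per pixel; the function names are just for illustration.

```python
def levels_per_channel(bits):
    """Number of distinct values one colour channel can take."""
    return 2 ** bits

def total_colours(bits, channels=3):
    """Number of distinct colours across all channels (assumes RGB by default)."""
    return levels_per_channel(bits) ** channels

for label, bits in (("SDR, 8-bit", 8), ("HDR, 10-bit", 10)):
    print(f"{label}: {levels_per_channel(bits):,} levels per channel, "
          f"{total_colours(bits):,} colours")
```

Two extra bits per channel quadruple the number of steps in each channel, and that gain compounds across the three channels into the 'over a billion' figure.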
This is why such displays (when showing HDR content) can make colours really pop, enhance the overall contrast of the scene, and smooth the gradient between light and dark colours during a sunset, eliminating banding artefacts with a wider range of colour tones.
Ironically, game engines had become capable of high dynamic range rendering (HDRR) by the early 2000s, but had no displays capable of showing such renders. Half-Life 2: Lost Coast (2005) was one of the first games to use HDRR and many other games performed render calculations in high dynamic range, but then squeezed the result into standard dynamic range using a process called tone-mapping. Just as a normal map texture is used to capture the geometric details of a high-polygon model, tone-mapping is used to map an HDR rendered frame onto a lower dynamic range. The result is better than what would have been generated without using HDRR, but is not true HDR output.
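As a rough illustration of what 'squeezing' means, the Python sketch below compares a simple clamp with the Reinhard operator, one well-known tone-mapping curve. It is not necessarily the curve any of the games mentioned here used, and the sample luminance values are made up for the example, with 1.0 standing for the brightest value an SDR display can show.

```python
def reinhard(luminance):
    """Reinhard tone-mapping: compresses any non-negative luminance into [0, 1)."""
    return luminance / (1.0 + luminance)

def clamp(luminance):
    """No tone-mapping: everything brighter than the display limit becomes pure white."""
    return min(luminance, 1.0)

# Hypothetical scene luminances, from deep shadow to a very bright sky.
hdr_values = [0.05, 0.5, 1.0, 4.0, 16.0]

for value in hdr_values:
    print(f"scene {value:5.2f} -> clamped {clamp(value):.3f}, tone-mapped {reinhard(value):.3f}")
```

With clamping, the two brightest samples both end up as identical white and their difference is lost. The tone-mapped values stay distinct, so some highlight detail survives even though the output still fits within standard dynamic range.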
Horizon Zero Dawn (2017), Shadow of the Tomb Raider (2018) and Middle-earth: Shadow of War (2017) are among several games that were released soon after HDR displays became widespread, and such games support true HDR output, drastically improving image quality. Tone mapping still remains part of the workflow when creating HDR content, but the far wider luminance and colour range of an HDR display results in content whose dynamic range is not clamped, making for highly realistic lighting, colour and better overall image quality.
Real-Time Ray Tracing
In May 2020, the BBC published an article on real-time ray-tracing titled “Get ready for the ‘holy grail’ of computer graphics,” and there is probably no better indication of the importance and primacy of this revolutionary technique in present-day game graphics.
Ray-tracing had long been a part of CGI (computer generated imagery) pipelines in film and television, but was implemented via offline rendering, and was prohibitively expensive – Toy Story 3 (2010) took an average of seven hours per frame, and Monsters University (2013) is said to have taken 29 hours per frame.
As ray-tracing actually models the interaction of light with in-game objects, it works best with physically-based textures, which provide accurate data to the ray-tracing algorithm. In fact, the first major book-length publication on PBR refers to ray-tracing, and contextualises PBR as a new method to improve such ray-traced scenes with physically-accurate materials – ray-tracing in films and TV predates PBR.
The first challenge to implementing ray-tracing in games is that it has to be in real time and not offline, and Nvidia’s first range of RTX cards, released in 2018, managed this feat. Real-time ray-tracing greatly enhances shadows, lights and reflections dynamically and it works thus – the GPU shoots rays from the camera and then calculates how these rays bounce off in-game objects, scene lights and other scene elements (like water bodies), to determine how the scene should look.
A ray that bounces off an object and hits a scene light determines how that object is lit and where its shadow falls – if the object is close to another, then contact shadows are drawn on both. Objects that deflect rays onto glass-like surfaces will be reflected by such scene elements. Rays that move from a light source to coloured objects will take on the objects’ hue and bathe nearby geometry with coloured light. Since the camera and the player character move constantly in games, such calculations have to be performed countless times. Ray tracing is even capable of recursive reflections, like infinite mirrors, though such reflections may not be feasible for complex scenes.
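The core loop described above can be sketched in plain Python for a single camera ray. This is a deliberately tiny model under stated assumptions: the scene contains only spheres, there is one point light, the ray bounces just once (a 'shadow ray' toward the light), and all positions and helper names are hypothetical. Real-time ray-tracers run this kind of test on dedicated GPU hardware for millions of rays per frame.

```python
import math

def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def add(a, b): return tuple(x + y for x, y in zip(a, b))
def scale(v, s): return tuple(x * s for x in v)
def dot(a, b): return sum(x * y for x, y in zip(a, b))
def normalize(v): return scale(v, 1.0 / math.sqrt(dot(v, v)))

def hit_sphere(origin, direction, centre, radius):
    """Distance along a unit-length ray to the nearest hit in front of it, or None."""
    oc = sub(origin, centre)
    b = 2.0 * dot(oc, direction)
    c = dot(oc, oc) - radius * radius
    disc = b * b - 4.0 * c
    if disc < 0.0:
        return None
    t = (-b - math.sqrt(disc)) / 2.0
    return t if t > 1e-4 else None  # small offset avoids re-hitting the start surface

def trace(origin, direction, spheres, light_pos):
    """Diffuse brightness seen along one camera ray (0.0 for a miss or a shadowed point)."""
    nearest = None
    for centre, radius in spheres:
        t = hit_sphere(origin, direction, centre, radius)
        if t is not None and (nearest is None or t < nearest[0]):
            nearest = (t, centre)
    if nearest is None:
        return 0.0
    t, centre = nearest
    point = add(origin, scale(direction, t))
    normal = normalize(sub(point, centre))
    to_light = sub(light_pos, point)
    light_dist = math.sqrt(dot(to_light, to_light))
    light_dir = scale(to_light, 1.0 / light_dist)
    # Shadow ray: if anything sits between the hit point and the light, it is in shadow.
    for c, r in spheres:
        s = hit_sphere(point, light_dir, c, r)
        if s is not None and s < light_dist:
            return 0.0
    return max(0.0, dot(normal, light_dir))

camera = (0.0, 0.0, 0.0)
forward = (0.0, 0.0, -1.0)
light = (0.0, 5.0, 0.0)
target = ((0.0, 0.0, -5.0), 1.0)     # sphere the camera is looking at
occluder = ((0.0, 2.5, -2.0), 0.5)   # small sphere placed between the target and the light

lit = trace(camera, forward, [target], light)
shadowed = trace(camera, forward, [target, occluder], light)
print(f"without occluder: {lit:.3f}, with occluder: {shadowed:.3f}")
```

Adding the occluder changes the result for the very same camera ray, because the shadow ray now finds something between the surface and the light. Reflections and coloured bounce light extend the same idea by spawning further rays from each hit point.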
Ray-tracing greatly improves upon previous lighting solutions, like screen-space reflections, and a simple example can illustrate why. Imagine a third-person perspective scene in which the player character is facing a reflective glass shop front, near which are two barrels. Since the SSR shader can see the barrels (in the 2D render), it will paint their reflection on the glass. But the shader cannot see the front side of the player character, and thus cannot draw the appropriate reflection. Ray-tracing creates an accurate reflection of the whole scene by accounting for rays that hit the glass and then hit the player character (and vice versa), and also adds other off-screen objects to the reflection based on their position in the game world.
Nevertheless, real-time ray-tracing has high performance costs, and only one game – the indie title Stay in the Light (2020) – currently applies it across the board. Other games use it in specific contexts and combine existing methods along with ray-tracing to enhance a scene’s graphical fidelity. Control (2019) achieved widespread acclaim for its implementation of ray-tracing, and Metro Exodus: Enhanced Edition (2021) was hailed as ‘the first AAA ray-tracing game’.
It is no coincidence that Nvidia, AMD and Intel all released upscaling algorithms soon after the advent of real-time ray tracing. Even the beefiest graphics card will slow to a crawl if it tries to render a game in native 4K with high-quality ray-tracing settings, and that’s where upscaling comes in – the GPU renders the image at a significantly lower resolution, which is then scaled up to 4K with almost no visible loss in quality. While Nvidia’s Deep Learning Super Sampling (DLSS) and Intel’s Xe Super Sampling (XeSS) use machine learning, AMD’s FidelityFX Super Resolution (FSR) does not, though it provides comparable results. Support for DLSS is available for both Control and Metro Exodus: Enhanced Edition, and upscaling algorithms benefit games that lack ray-tracing as well, giving them a significant performance boost – God of War on PC supports both DLSS and FSR.
3D game graphics have evolved over the course of two decades to create stunning visuals in present-day games. Developers have strived for realism and the best implementations of 3D techniques work synergistically to create near-photorealistic, or even hyper-realistic renders. HDR, which enhances colour, contrast and image quality, works best when PBR textures and ray-tracing accurately model how a scene interacts with light. Screen-space shaders are deployed alongside ray-tracing for performance gains, and normal mapping – by now a very old technique – is still critical in optimising scene geometry without losing detail. Some of these techniques represent paradigm shifts – PBR totally replaced traditional texturing workflows and real-time ray-tracing may well replace screen-space effects completely as graphics cards add muscle to their ray-tracing capabilities.
So, have we reached photorealism in gaming yet? Not quite – even the latest games are near-photorealistic but are still not indistinguishable from a live-action video. Real time ray-tracing is itself a very clever approximation of real life, or ‘ground truth’, as technologists like to call scenes observable to the human eye.
However, contemporary CGI, such as the Jibaro episode from Netflix’s Love, Death and Robots, shows that modern offline renders can pass for real life – in fact, the best CGI in live-action content is often in places where you don’t even expect it. This suggests that as computing capability increases and advanced ray-tracing methods such as path-tracing become feasible, we may edge closer to ground truth and true photorealism, even in gaming, where a life-like, interactive experience needs to be rendered at 60 frames per second.
Gameopedia offers custom solutions depending on your specific data requirements. Reach out to us for actionable insights about video game graphics and technology.