How do graphics drivers work?
I’d like to give an overview of how graphics drivers work in general, and then write a little bit about the Linux graphics stack for AMD GPUs. The intention of this post is to clear up a bunch of misunderstandings that people have on the internet about open source graphics drivers.
What is a graphics driver?
A graphics driver is a piece of software that allows programs on your computer to access the features of your GPU. Every GPU is different and may have different capabilities or different ways of achieving things, so each needs a different driver, or at least a different code path in a driver that handles multiple GPUs from the same vendor and/or the same hardware generation.
The main motivation for graphics drivers is to allow applications to utilize your hardware efficiently. This enables games to render pretty pixels, scientific apps to calculate stuff, and video apps to encode and decode video streams.
Organization of graphics drivers
Compared to drivers for other hardware, graphics is very complicated because the functionality is very broad and the differences between each piece of hardware can also be vast.
Here is a simplified explanation of how a graphics driver stack usually works. Note that most of the time, these components (or some variation) are bundled together to make them easier to use.
- GPU firmware (FW) ― low-level code for power management, context switching, command processing, display engine, video encoding/decoding, etc.
- Kernel driver, aka. kernel-mode driver (KMD) ― makes it possible for multiple userspace applications to submit commands to the GPU, and is responsible for memory management and display functionality.
- Userspace driver, aka. user-mode driver (UMD) ― responsible for implementing an API, such as Vulkan, OpenGL, etc. For each piece of hardware, there may be multiple different UMDs implementing different APIs.
- Shader compiler ― a userspace library that compiles shader programs for your GPU from the HW-independent code that applications have. Can be possibly shared between UMDs, sometimes developed as a separate project.
I’ll give a brief overview of each component below.
GPU firmware
Most GPUs have additional processors (other than the shader cores) which run firmware that operates the low-level details of the hardware, usually stuff that is too low-level even for the kernel.
The firmware on those processors is responsible for power management, context switching, command processing, display, video encoding/decoding, etc. Among other things, it parses the commands we submit to it, launches shaders, distributes work between the shader cores, and so on.
Some GPU manufacturers are moving more and more functionality to firmware, which means that the GPU can operate more autonomously and less intervention is needed by the CPU. This tendency is generally positive for reducing CPU time spent on programming the GPU (as well as “CPU bubbles”), but at the same time it also means that the way the GPU actually works becomes less transparent.
Kernel driver
You might ask, why not implement all driver functionality in the kernel? Wouldn’t it be simpler to “just” have everything in the kernel? The answer is no, mainly because there is a LOT going on which nobody wants in the kernel.
- You don’t want to have your kernel crash when a game misbehaves. Sadly it can still happen, but it would happen a lot more if the kernel and userspace components weren’t separated.
- You definitely don’t want to run a fully-fledged compiler inside your kernel which takes arbitrary input from the user.
- You want to avoid having to upgrade your kernel to deploy most fixes and improvements to the graphics stack. (This is not always avoidable but can be minimized.)
So, usually, the KMD is only left with some low-level tasks that every user needs:
- Command submission userspace API: an interface that allows userspace processes to submit commands to the GPU, query information about the GPU, etc.
- Memory management: deciding which process gets to use how much VRAM, managing the GTT (graphics translation table, i.e. the part of system memory that the GPU can access), handling low-memory situations, etc.
- Display functionality: making display connectors work, by programming the registers of the display controller. There is also a separate uAPI (kernel mode setting, aka. KMS) for just this purpose.
- Power management: making sure the GPU doesn’t draw too much power when not needed, and also making sure applications can get the best clock speeds etc. when that is needed, in cooperation with the power management firmware.
- GPU recovery: when the GPU hangs or crashes for some reason, it’s the kernel’s responsibility to ensure that the GPU can be recovered and that the crash doesn’t affect other processes.
Userspace driver
Applications interact with userspace drivers instead of the kernel (or the hardware directly). Userspace drivers are compiled as shared libraries and are responsible for implementing one or more specific APIs for graphics, compute or video (for example, Vulkan, OpenGL or OpenCL) for a specific family of GPUs. Each graphics API has a loader or similar entry point which finds and loads the available driver(s) for the GPU(s) in the user’s system. The Vulkan loader is an example of this; other APIs have similar components for this purpose.
The main functionality of a userspace driver is to take the commands from the API (for example, draw calls or compute dispatches) and turn them into low-level commands in a binary format that the GPU can understand. In Vulkan, this is analogous to recording a command buffer. Additionally, they utilize a shader compiler to turn a higher-level shader language (eg. GLSL) or bytecode (eg. SPIR-V) into hardware instructions which the GPU’s shader cores can execute.
Furthermore, userspace drivers also take part in memory management: they essentially act as an interface between the memory model of the graphics API and the kernel’s memory manager.
The userspace driver calls the aforementioned kernel uAPI to submit the recorded commands to the kernel, which then schedules them and hands them to the firmware to be executed.
Shader compiler
If you’ve seen a loading screen in your favourite game which told you it was “compiling shaders…” you probably wondered what that’s about and why it’s necessary.
Unlike CPUs, which have converged on a few common instruction set architectures (ISAs), GPUs are a mess and don’t share the same ISA, not even between different GPU models from the same manufacturer. Although most modern GPUs have converged on SIMD-based architectures, the ISA is still very different between manufacturers and still changes from generation to generation (sometimes different chips of the same generation have slightly different ISAs). GPU makers keep adding new instructions when they identify new ways to implement some features more efficiently.
To deal with all that mess, graphics drivers have to do online compilation of shaders (as opposed to offline compilation which usually happens for apps running on your CPU).
This means that shaders have to be recompiled when the userspace graphics driver is updated either because new functionality is available or because bug fixes were added to the driver and/or compiler.
But I only downloaded one driver!
On some systems (especially proprietary operating systems like Windows), GPU manufacturers intend to make users’ lives easier by offering all of the above in a single installer package, which is just called “the driver”.
Typically such a package includes:
- Firmware files for all hardware that the package supports
- A kernel driver
- Several userspace drivers for various APIs
- A shader compiler (sometimes more) that is used by those userspace drivers
- A “user-friendly” application (ie. a control panel) to present all the functionality to the user
- Various other utilities and libraries (that you may or may not need).
But I didn’t download any drivers!
On some systems (typically open source systems like Linux distributions), a set of packages that handles most common hardware is usually already installed, so you can use most functionality out of the box without needing to install anything manually.
Neat, isn’t it?
However, on open source systems, the graphics stack is more transparent, which means that there are many parts that are scattered across different projects, and in some cases there is more than one driver available for the same HW. To end users, it can be very confusing.
However, this doesn’t mean that open source drivers are worse by design. It’s just that, due to their community-oriented nature, they are organized differently.
One of the main sources of confusion is that various Linux distributions mix and match different versions of the kernel with different versions of different UMDs which means that users of different distros can get a wildly different user experience based on the choices made for them by the developers of the distro.
Another source of confusion is that we driver developers are really, really bad at naming things, so sometimes different projects end up having the same name, or some projects have nonsensical or outdated names.
The Linux graphics stack
In the next post, I’ll continue this story and discuss how the above applies to the open source Linux graphics stack.
Comments
The blog doesn't have comments, but feel free to reach out to me on IRC (Venemo on OFTC) or Discord (sunrise_sky) to discuss.