Notes · Other Peoples' Talks · FOSDEM 2026 · Writing axle OS's desktop compositor
https://fosdem.org/2026/schedule/event/ZC7LBJ-writing_axle_oss_desktop_compositor/
- No GPU driver; all software-rendered
- Can be pretty difficult to render an interactive UI with a playable framerate on the CPU
- Talk is about the work required to make this actually interactive
- Axle OS as a whole, including the userspace, is about a hundred-thousand lines of code
- The window manager is 5-6k lines of code
- It's pretty simple, conceptually, to build a compositor:
- documents on a desk - a "desk top" - and each of these have a z-index
- The compositor has a big framebuffer, and each application has its own framebuffer
- When the application is ready to render, the window manager draws its decoration, then copies ("blits") the application framebuffer into the desktop
- The simple render strategy:
- Draw backround
- Loop over the windows and draw them
- Draw the cursor
- But of course, we don't clear the frames, so we get cursor and window trails
- Really cool live demo using the real compositor!
- Every time we draw the frame, we clear the canvas. This works pretty well, and we could stop here, but... it's pretty slow
- This approach has low effort and low bugs, but it's not at all fast
- Overdraw results in a lot of wasted work - when we draw over something that was already rendered, there was no point in rendering the content underneath
- To work around this, we can split up the visible regions for each application and figure out what we actually need to render
- It is beneficial to merge adjacent rectangles as extra rectangles are expensive to render
- This is, again, conceptually simple, but there are literal edge cases to consider
- Ended up writing lots of unit tests to verify the "business logic"
- One of the main operations we end up doing in a compositor is to query what a rectangle intersects with
- To help with this, we can pick a data structure for spatial indexing
- Rectangle list O(n^2) can be optimised to O(n log n) with a R-tree
- When applied, this increases FPS from 1 FPS to 11 FPS
- The next optimisation: only redraw what has actually changed, instead of redrawing everything all the time
- This requires a significant amount of re-engineering, as state management is required for each of these
- To begin with, we need to redraw the entire frame whenever the mouse is moved; the compositor will queue up full redraws
- However, you can optimise further by only redrawing the area in which the cursor has moved (unioning previous cursor and last cursor)
- Whenever a region is dirtied, we queue redraws for that region (background, windows, etc)
- Whenever we drag a window, we do a full redraw for the affected region
- This extends to other window manipulation: z-order switches, interacting with buttons, etc
- Also need to do similar things for shortcuts: interaction, on initial draw, etc
- To help debug, added a way to record actions and replay them with a visual recording, making it easy to do a visual-diff when writing optimisations
- Easy way to make glitch art, though - lots of fun demos of broken optimisations
- Question: Would it make sense to multithread the drawing process between regions?
- The slow part is still writing to the video memory
- axleOS is SMP-compatible, but doesn't support multithreading processes
- You could theoretically multithread, but it's probably not the best play - the real solution is just to write a GPU driver
- Target a really old GPU that's well-documented
- All of these cards aren't that interesting, because they don't support high-resolution desktops
- Target a modern GPU
- Apparently reasonable thing to do some day, but they're very complex today
- Target a paravirtualised GPU driver
- Don't want to commit to having a different rendering path for virtualised hardware
- Target a really old GPU that's well-documented
- Question: Does it make sense to switch between brute-forcing and compositing based on the workload?
- It is possible, but didn't try because bifurcating the code paths would introduce bugs, and that would suck for hobby OSes
- Question: You say it's a memcpy, but this is rectangular
- Yes, it's a memcpy per row
- Question: Isn't VGA dead? Can't you use VESA?
- This is theoretically possible, yeah, didn't look into it