Flexible I/O for Database Management Systems with xNVMe
One storage API to rule them all
Flexible I/O for Database Management Systems with xNVMe. Emil Houlborg, Simon A. F. Lund, Marcel Weisgut, Tilmann Rabl, Javier González, Vivek Shah, Pınar Tözün. CIDR’26.
This paper describes xNVMe, a storage library (developed by Samsung), and demonstrates how it can be integrated into DuckDB.
xNVMe
Section 2 contains the hard sell for xNVMe. The “x” prefix serves a similar role to the “X” in DirectX. The pitch is that xNVMe is fast while also being portable across operating systems and storage devices.
The C API will feel like home for folks who have experience with low-level graphics APIs (no shaders on the disk yet, sorry). There are APIs to open a handle to a device, allocate buffers, and submit NVMe commands (synchronously or asynchronously). Listing 3 in the paper has an example, which feels like “Mantle for NVMe”.
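To make the asynchronous flow concrete, here is a toy model in Python. It is not the xNVMe API: `ToyDevice`, `ToyQueue`, `submit_read`, and `reap` are hypothetical names, and the "device" is just a byte array in memory. The sketch only illustrates the shape described above: open a device, allocate buffers, submit commands into a bounded queue, then reap completions through callbacks.

```python
from collections import deque


class ToyDevice:
    """Hypothetical in-memory stand-in for an NVMe namespace (not xNVMe)."""

    def __init__(self, num_blocks, block_size=512):
        self.block_size = block_size
        self.data = bytearray(num_blocks * block_size)


class ToyQueue:
    """Hypothetical submission/completion queue with a fixed depth."""

    def __init__(self, dev, depth):
        self.dev = dev
        self.depth = depth
        self.inflight = deque()

    def submit_read(self, lba, buf, callback):
        # Submission only records the command; nothing is copied yet.
        if len(self.inflight) >= self.depth:
            raise BlockingIOError("queue full; reap completions first")
        self.inflight.append((lba, buf, callback))

    def reap(self):
        """Complete every in-flight command and fire its callback."""
        done = 0
        while self.inflight:
            lba, buf, callback = self.inflight.popleft()
            start = lba * self.dev.block_size
            buf[:] = self.dev.data[start:start + len(buf)]
            callback(lba, buf)
            done += 1
        return done


# Usage: two reads in flight at once, completed by a single reap() call.
dev = ToyDevice(num_blocks=8)
dev.data[512:516] = b"duck"  # pretend block 1 holds some data

results = {}


def on_done(lba, buf):
    results[lba] = bytes(buf[:4])


queue = ToyQueue(dev, depth=2)
queue.submit_read(0, bytearray(512), on_done)
queue.submit_read(1, bytearray(512), on_done)
completed = queue.reap()
```

The caller decides when to reap, which is what lets an application keep several commands in flight instead of blocking on each one.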
The xNVMe API works on Linux, FreeBSD, Windows, and macOS. Some operating systems have multiple backends available (e.g., libaio and io_uring on Linux).
DuckDB
The point of this paper is that it is easy to drop xNVMe into an existing application. The paper describes nvmefs, which is an implementation of the DuckDB FileSystem interface and uses xNVMe. nvmefs creates dedicated xNVMe queues for each DuckDB worker thread to avoid synchronization (similar tricks are used by applications calling graphics APIs in parallel).
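The queue-per-worker idea can be sketched in a few lines of Python. This is not nvmefs code: `PerThreadQueues` and `worker` are hypothetical names, and a plain list stands in for an xNVMe queue. The point is only that when every thread owns its queue, the hot path needs no lock.

```python
import threading


class PerThreadQueues:
    """Hands each calling thread its own private queue (hypothetical helper)."""

    def __init__(self, make_queue):
        self._make_queue = make_queue
        self._local = threading.local()

    def get(self):
        # The first call on a thread creates its queue; later calls reuse it.
        q = getattr(self._local, "queue", None)
        if q is None:
            q = self._make_queue()
            self._local.queue = q
        return q


def worker(queues, worker_id, n_requests, seen):
    q = queues.get()
    for i in range(n_requests):
        # No lock needed: only this thread ever touches q.
        q.append((worker_id, i))
    # Each worker writes only its own slot, so this needs no lock either.
    seen[worker_id] = q


NUM_WORKERS = 4
queues = PerThreadQueues(make_queue=list)
seen = [None] * NUM_WORKERS
threads = [
    threading.Thread(target=worker, args=(queues, w, 100, seen))
    for w in range(NUM_WORKERS)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After the threads join, `seen` holds four distinct queues, each containing only the requests of the worker that owns it.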
The paper also describes how xNVMe supports shiny new NVMe features like Flexible Data Placement (FDP). This allows DuckDB to pass hints to the SSD to colocate buffers with similar lifetimes (which improves garbage collection performance).
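A toy simulation shows why colocating data by lifetime helps garbage collection. Everything here is an assumption for illustration: the "temp" and "persistent" lifetime classes, the four-page reclaim unit, and the helper names are hypothetical, and a real SSD is far more complex. The model counts how many still-live pages must be copied elsewhere before units holding dead data can be erased.

```python
from collections import defaultdict

RU_SIZE = 4  # pages per reclaim unit (toy value)


def write_pages(pages, use_hints):
    """Place (page_id, lifetime) pairs into reclaim units.

    With hints, each lifetime class gets its own placement handle and thus
    its own open unit. Without hints, every write shares one handle.
    """
    open_units = defaultdict(list)  # placement handle -> currently open unit
    units = []
    for page_id, lifetime in pages:
        handle = lifetime if use_hints else 0
        unit = open_units[handle]
        unit.append((page_id, lifetime))
        if len(unit) == RU_SIZE:
            units.append(unit)
            open_units[handle] = []
    units.extend(u for u in open_units.values() if u)
    return units


def gc_copies(units, dead_lifetime):
    """Count live pages that must be relocated to reclaim units with dead data."""
    copies = 0
    for unit in units:
        live = [p for p in unit if p[1] != dead_lifetime]
        if len(live) < len(unit):  # unit holds dead pages, so reclaim it
            copies += len(live)
    return copies


# Interleaved writes of short-lived and long-lived pages.
pages = [(i, "temp" if i % 2 == 0 else "persistent") for i in range(16)]

mixed = gc_copies(write_pages(pages, use_hints=False), dead_lifetime="temp")
hinted = gc_copies(write_pages(pages, use_hints=True), dead_lifetime="temp")
```

In this model the unhinted layout forces eight page copies once the temporary data dies, while the hinted layout forces none, because every unit holding dead pages holds nothing else.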
Results
Most of the results in the paper show comparable performance for xNVMe vs the baseline DuckDB filesystem. Fig. 5 in the paper shows one benchmark where xNVMe yields a significant improvement.
Dangling Pointers
I think the long-term success of xNVMe will depend on governance. Potential members of the xNVMe ecosystem could be scared off by Samsung’s potential conflict of interest (i.e., will Samsung privilege Samsung SSDs in some way?). There is a delicate balancing act between an API driven by a sluggish bureaucratic committee and an API dominated by one vendor.



