Wine-NSPA – librtpi (PI mutex / condvar)

This page documents Wine-NSPA’s Wine-internal librtpi shim for PI-aware Unix-side mutexes and condition variables.

Table of Contents

  1. Overview
  2. Why librtpi, not glibc pthread
  3. Header-only shim, not vendored static lib
  4. pi_mutex_t – futex PI on a raw word
  5. NSPA_RTPI_MUTEX_RECURSIVE extension
  6. pi_cond_t – requeue-PI condvar
  7. The librtpi sweep tool
  8. Compile-line discipline (include/rtpi.h forwarder)
  9. Consumers in the Wine tree
  10. Landing history
  11. References

1. Overview

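librtpi is a small library that gives PE/Unix code two PI-aware synchronization primitives: pi_mutex_t, a mutex built on the futex PI operations, and pi_cond_t, a condition variable built on the requeue-PI futex operations.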

Upstream librtpi (gitlab.com/linux-rt/librtpi) is a Unix-native static library. Wine-NSPA does not vendor that source tree; it carries a header-only re-implementation of the same public API at libs/librtpi/rtpi.h, plus a Wine-internal extension – NSPA_RTPI_MUTEX_RECURSIVE – that the upstream library deliberately does not provide.

[Figure: librtpi shim: callers, primitives, kernel surface. Wine Unix-side callers (ntdll/unix, win32u, winex11.drv, winejack.drv, winealsa.drv, winegstreamer, file/cdrom/system) sit on libs/librtpi/rtpi.h (header-only, ~450 LoC, pi_mutex_t / pi_cond_t API, gettid cached in a __thread variable), which issues the Linux futex syscalls FUTEX_LOCK_PI / FUTEX_UNLOCK_PI / FUTEX_TRYLOCK_PI and FUTEX_WAIT_REQUEUE_PI / FUTEX_CMP_REQUEUE_PI against the kernel rt_mutex PI chain. Side panels summarize pi_mutex_t, pi_cond_t, and the NSPA_RTPI_MUTEX_RECURSIVE extension, all of which are described in Sections 4 to 6.]

2. Why librtpi, not glibc pthread

pthread_mutex_t from glibc is NPTL-backed and does not carry PI by default. glibc supports PTHREAD_PRIO_INHERIT as a mutex protocol attribute, but:

librtpi takes the inverse approach. The lock word is the futex operand. There is no opaque pthread bookkeeping. The wake path on the condvar side is an explicit FUTEX_CMP_REQUEUE_PI so the kernel hands the waiter to the mutex’s PI chain atomically. That is semantically different from glibc’s POSIX pthread implementation, and the difference is visible at the syscall trace level: librtpi calls are SYS_futex with the _PI operations, glibc calls are not.

For Wine-NSPA’s RT audio workload, that semantic difference is the whole point. The audio thread (SCHED_FIFO, prio 80) acquiring an internal Wine mutex held by a SCHED_OTHER worker has to inherit its priority onto the holder until release. With glibc pthread that does not happen for any of Wine’s internal mutexes. With librtpi it happens unconditionally for every converted call site, with no runtime gating.
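For contrast, the glibc route mentioned above is an opt-in that has to be repeated for each mutex. The following is a minimal sketch using only standard POSIX calls; the helper name sketch_init_pi_pthread_mutex is hypothetical and nothing like it is claimed to exist in the Wine tree.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>

/* Hypothetical helper: initialise one glibc mutex with the
 * priority-inheritance protocol. Returns 0 or an errno-style value. */
static int sketch_init_pi_pthread_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int ret = pthread_mutexattr_init(&attr);

    if (ret)
        return ret;
    /* PI is a per-mutex protocol attribute; the default is PTHREAD_PRIO_NONE. */
    ret = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (!ret)
        ret = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return ret;
}
```

A mutex created with a NULL attribute pointer or a static initializer never goes through this path, which is why a tree-wide conversion is simpler to reason about than per-site attribute setup.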


3. Header-only shim, not vendored static lib

Upstream librtpi is a small autotools project (Makefile.am, pi_mutex.c, pi_cond.c, pi_futex.h, ~600 LoC, last release 2024). The earliest Wine-NSPA approach (NSPA RT v2.0) tried to vendor the source tree under libs/librtpi/ and build it as a Wine-internal static library. That hit autotools obstacles repeatedly:

NSPA RT v2.0.1 pivoted to a header-only re-implementation. The libs/librtpi/Makefile.in declares no build target; it exists only to host rtpi.h. The header re-implements the public API as inline functions on top of the Linux futex syscalls Wine-NSPA already used for CS-PI.

The resulting layout matches upstream’s pi_mutex_t and pi_cond_t field union exactly (down to the 64-byte / 128-byte padding), so any code written against upstream librtpi compiles unchanged against the NSPA shim. The NSPA additions (nspa_recursion, wake_id, state) live inside the existing padding; upstream callers that do not know about those fields are unaffected.
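The layout claim can be checked at compile time. The sketch below is illustrative only: both unions are stand-ins with hypothetical names, and the "upstream" shape is an assumption reduced to the two fields this page attributes to it, not a copy of the upstream header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed upstream shape: two words plus padding to 64 bytes. */
typedef union sketch_upstream_mutex {
    struct {
        uint32_t futex;
        uint32_t flags;
    };
    uint8_t pad[64];
} sketch_upstream_mutex_t __attribute__((aligned(64)));

/* NSPA shape as documented here: one extra word inside the padding. */
typedef union sketch_nspa_mutex {
    struct {
        uint32_t futex;
        uint32_t flags;
        uint32_t nspa_recursion;
    };
    uint8_t pad[64];
} sketch_nspa_mutex_t __attribute__((aligned(64)));

/* Same size, same alignment, same offsets for the shared fields. */
_Static_assert(sizeof(sketch_nspa_mutex_t) == sizeof(sketch_upstream_mutex_t),
               "size must not change");
_Static_assert(_Alignof(sketch_nspa_mutex_t) == _Alignof(sketch_upstream_mutex_t),
               "alignment must not change");
_Static_assert(offsetof(sketch_nspa_mutex_t, futex) ==
               offsetof(sketch_upstream_mutex_t, futex), "futex offset");
_Static_assert(offsetof(sketch_nspa_mutex_t, flags) ==
               offsetof(sketch_upstream_mutex_t, flags), "flags offset");
/* The extension field has to fit in what used to be padding. */
_Static_assert(offsetof(sketch_nspa_mutex_t, nspa_recursion) + sizeof(uint32_t) <= 64,
               "extension must fit in the padding");
```

If any of these assertions failed, the build would stop rather than produce objects that disagree about the layout.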


4. pi_mutex_t – futex PI on a raw word

The struct:


typedef union pi_mutex {
    struct {
        uint32_t futex;          /* low 30 bits = owner TID,
                                  * bit 30 = FUTEX_OWNER_DIED,
                                  * bit 31 = FUTEX_WAITERS,
                                  * 0 when unowned */
        uint32_t flags;
        uint32_t nspa_recursion; /* NSPA extension; only touched
                                  * by the current owner */
    };
    uint8_t pad[64];
} pi_mutex_t __attribute__((aligned(64)));

The flags field carries RTPI_MUTEX_PSHARED to switch between FUTEX_*_PI (process-shared) and FUTEX_*_PI_PRIVATE (process-local).
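As a rough illustration of how the flags field and the futex word could be interpreted, the sketch below picks the futex operation from the PSHARED flag and decodes the owner and waiter bits. The constants come from <linux/futex.h>; the SKETCH_RTPI_MUTEX_PSHARED value and all helper names are assumptions made for this example, not taken from rtpi.h.

```c
#include <assert.h>
#include <stdint.h>
#include <linux/futex.h>   /* FUTEX_LOCK_PI, FUTEX_TID_MASK, FUTEX_WAITERS, ... */

/* Assumption: illustrative flag value, not the one defined by rtpi.h. */
#define SKETCH_RTPI_MUTEX_PSHARED 0x1u

/* Process-shared mutexes use the plain op, process-local ones the _PRIVATE op. */
static inline int sketch_lock_op(uint32_t flags)
{
    return (flags & SKETCH_RTPI_MUTEX_PSHARED) ? FUTEX_LOCK_PI
                                               : FUTEX_LOCK_PI_PRIVATE;
}

static inline int sketch_unlock_op(uint32_t flags)
{
    return (flags & SKETCH_RTPI_MUTEX_PSHARED) ? FUTEX_UNLOCK_PI
                                               : FUTEX_UNLOCK_PI_PRIVATE;
}

/* Low 30 bits of the futex word: owner TID, 0 when unowned. */
static inline uint32_t sketch_owner_tid(uint32_t word)
{
    return word & FUTEX_TID_MASK;
}

/* Bit 31: the kernel has queued waiters, so unlock must go through the kernel. */
static inline int sketch_has_waiters(uint32_t word)
{
    return (word & FUTEX_WAITERS) != 0;
}
```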

Lock fast path

User-space CAS for the contention-free case:


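uint32_t tid = nspa_rtpi_tid(), expected = 0;   /* 0 = unowned */
/* 0 -> TID; on failure, expected holds the observed futex word */
if (__atomic_compare_exchange_n(&mutex->futex, &expected, tid, 0,
                                __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
    return 0;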

If the CAS fails because the mutex is already owned by the current TID, the call returns EDEADLK (or bumps the recursion counter – see Section 5).

If the CAS fails because the mutex is owned by another TID, the slow path issues FUTEX_LOCK_PI (or its private variant), which blocks the caller and applies PI to the holder until the mutex is released.

Unlock fast path


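if ((mutex->futex & 0x3fffffffu) != tid) return EPERM;   /* not the owner */
expected = tid;   /* bare TID: matches only if FUTEX_WAITERS is clear */
if (__atomic_compare_exchange_n(&mutex->futex, &expected, 0, 0,
                                __ATOMIC_RELEASE, __ATOMIC_RELAXED))
    return 0;
/* slow path: kernel unlock, wakes the highest-priority waiter;
 * op is FUTEX_UNLOCK_PI or its private variant, chosen from flags */
return syscall(SYS_futex, &mutex->futex, op, 0, NULL, NULL, 0);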

If there are no waiters (FUTEX_WAITERS bit clear), the user-space CAS releases the mutex; otherwise the kernel resolves the waiter queue and wakes the highest-priority waiter.
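Putting the two fast paths together, a minimal sketch of a lock/unlock pair might look like the following. It assumes a process-private mutex and uses hypothetical names (sketch_mutex_t, sketch_lock, sketch_unlock); it is not the code in rtpi.h, only an illustration of the CAS-then-syscall shape described above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/futex.h>

typedef struct { uint32_t futex; } sketch_mutex_t;   /* hypothetical type */

static inline uint32_t sketch_tid(void)
{
    return (uint32_t)syscall(SYS_gettid);   /* uncached, for brevity */
}

static inline int sketch_lock(sketch_mutex_t *m)
{
    uint32_t tid = sketch_tid();
    uint32_t expected = 0;

    /* Fast path: unowned (0) -> our TID, acquire ordering. */
    if (__atomic_compare_exchange_n(&m->futex, &expected, tid, 0,
                                    __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
        return 0;
    /* expected now holds the futex word we observed. */
    if ((expected & FUTEX_TID_MASK) == tid)
        return EDEADLK;
    /* Slow path: block in the kernel, which boosts the current owner. */
    for (;;) {
        if (syscall(SYS_futex, &m->futex, FUTEX_LOCK_PI_PRIVATE,
                    0, NULL, NULL, 0) == 0)
            return 0;
        if (errno != EINTR && errno != EAGAIN)
            return errno;
    }
}

static inline int sketch_unlock(sketch_mutex_t *m)
{
    uint32_t tid = sketch_tid();
    uint32_t expected = tid;

    if ((__atomic_load_n(&m->futex, __ATOMIC_RELAXED) & FUTEX_TID_MASK) != tid)
        return EPERM;
    /* Fast path: succeeds only when no waiter bit is set. */
    if (__atomic_compare_exchange_n(&m->futex, &expected, 0, 0,
                                    __ATOMIC_RELEASE, __ATOMIC_RELAXED))
        return 0;
    /* Slow path: let the kernel pick and wake the top waiter. */
    if (syscall(SYS_futex, &m->futex, FUTEX_UNLOCK_PI_PRIVATE,
                0, NULL, NULL, 0) == 0)
        return 0;
    return errno;
}
```

In the uncontended case neither function enters the kernel for a futex operation; the only syscall is the uncached gettid, which the TID cache below removes.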

TID cache

Every pi_mutex_lock / pi_mutex_unlock needs gettid(). The header caches the TID in a __thread variable:


static __thread uint32_t nspa_rtpi_cached_tid;

static inline uint32_t nspa_rtpi_tid(void)
{
    uint32_t tid = nspa_rtpi_cached_tid;
    if (!tid) {
        tid = (uint32_t)syscall(SYS_gettid);
        nspa_rtpi_cached_tid = tid;
    }
    return tid;
}

First call per thread pays the syscall cost; every subsequent call is a single load. There is no atomic on the cache because each thread writes its own slot.


5. NSPA_RTPI_MUTEX_RECURSIVE extension

Upstream librtpi does not support recursive mutexes. That is a deliberate library-design choice, and the kernel offers no shortcut either: FUTEX_LOCK_PI itself does not allow re-entrance, so a recursive layer has to be implemented entirely in user space.

Wine-NSPA needs recursion for at least one mutex: virtual_mutex in dlls/ntdll/unix/virtual.c, which is genuinely re-entered from within signal handlers. The guard-page stack-growth path (virtual_setup_exception) re-enters the address-space lock from a faulting thread that already holds it. Wine’s pre-NSPA code achieved that with pthread_mutexattr_settype(..., PTHREAD_MUTEX_RECURSIVE); the librtpi sweep cannot simply convert that mutex to a plain pi_mutex_t, because a plain pi_mutex_t returns EDEADLK on self-re-lock.

The NSPA extension is minimal:


#define NSPA_RTPI_MUTEX_RECURSIVE 0x10

When the flag is set on pi_mutex_init, pi_mutex_lock / pi_mutex_trylock on a mutex already owned by the current thread bump nspa_recursion instead of returning EDEADLK; pi_mutex_unlock decrements it and only releases the futex word when it hits zero:


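/* pi_mutex_lock / pi_mutex_trylock: futex word already holds our TID */
if ((mutex->futex & 0x3fffffffu) == tid) {
    if (mutex->flags & NSPA_RTPI_MUTEX_RECURSIVE) {
        mutex->nspa_recursion++;
        return 0;
    }
    return EDEADLK;
}

/* pi_mutex_unlock: unwind one recursion level before releasing the futex word */
if (mutex->flags & NSPA_RTPI_MUTEX_RECURSIVE) {
    if (mutex->nspa_recursion > 0) {
        mutex->nspa_recursion--;
        return 0;
    }
}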

The recursion counter is only touched by the current owner (as determined by the futex word), so no atomics are needed on it. The extension is a strict superset: a mutex initialized without the flag behaves exactly like upstream librtpi.

The nspa_recursion field lives in the existing 64-byte padding of the union; binary layout is unchanged from upstream, and upstream code that does not know about the field is unaffected.
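To make the counter behaviour concrete, here is a small self-contained sketch of a recursive trylock/unlock pair. All names (sketch_rmutex_t, sketch_rtrylock, sketch_runlock) are hypothetical; the flag value mirrors the 0x10 shown above, and the contended lock path is deliberately left out, so this is an illustration under those assumptions rather than the rtpi.h implementation.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/futex.h>

#define SKETCH_MUTEX_RECURSIVE 0x10u   /* mirrors NSPA_RTPI_MUTEX_RECURSIVE */

typedef struct {
    uint32_t futex;      /* owner TID, 0 when unowned */
    uint32_t flags;
    uint32_t recursion;  /* extra lock depth; owner-only, no atomics */
} sketch_rmutex_t;

static inline int sketch_rtrylock(sketch_rmutex_t *m)
{
    uint32_t tid = (uint32_t)syscall(SYS_gettid);
    uint32_t expected = 0;

    if (__atomic_compare_exchange_n(&m->futex, &expected, tid, 0,
                                    __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
        return 0;                       /* first acquisition */
    if ((expected & FUTEX_TID_MASK) == tid) {
        if (m->flags & SKETCH_MUTEX_RECURSIVE) {
            m->recursion++;             /* re-entry by the owner */
            return 0;
        }
        return EDEADLK;
    }
    return EBUSY;   /* a blocking lock would enter FUTEX_LOCK_PI here */
}

static inline int sketch_runlock(sketch_rmutex_t *m)
{
    uint32_t tid = (uint32_t)syscall(SYS_gettid);
    uint32_t expected = tid;

    if ((__atomic_load_n(&m->futex, __ATOMIC_RELAXED) & FUTEX_TID_MASK) != tid)
        return EPERM;
    if ((m->flags & SKETCH_MUTEX_RECURSIVE) && m->recursion > 0) {
        m->recursion--;                 /* still held by us */
        return 0;
    }
    if (__atomic_compare_exchange_n(&m->futex, &expected, 0, 0,
                                    __ATOMIC_RELEASE, __ATOMIC_RELAXED))
        return 0;
    /* Waiters present: hand over through the kernel. */
    if (syscall(SYS_futex, &m->futex, FUTEX_UNLOCK_PI_PRIVATE,
                0, NULL, NULL, 0) == 0)
        return 0;
    return errno;
}
```

Two nested acquisitions leave the futex word untouched and the counter at 1; only the second unlock clears the word.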

Recursive-mutex use sites in the Wine tree

File                            Mutex           Why recursive
dlls/ntdll/unix/virtual.c:3941  virtual_mutex   signal-handler re-entrance via guard-page stack-growth path
dlls/win32u/sysparams.c:5888    user_mutex      nested win32u call paths
dlls/win32u/gdiobj.c:1040       gdi_lock        nested GDI object lookups
dlls/winex11.drv/init.c:58      (winex11 lock)  nested X11 driver locking

6. pi_cond_t – requeue-PI condvar


typedef union pi_cond {
    struct {
        uint32_t cond;       /* sequence counter, incremented per signal */
        uint32_t flags;
        uint32_t wake_id;    /* signal generation, EAGAIN retry logic */
        uint32_t state;      /* RTPI_COND_STATE_READY */
    };
    uint8_t pad[128];
} pi_cond_t __attribute__((aligned(64)));

The condvar uses FUTEX_WAIT_REQUEUE_PI / FUTEX_CMP_REQUEUE_PI so that waiters are atomically requeued from the condvar futex onto the PI mutex’s futex on signal. This closes the priority-inversion window that exists with plain FUTEX_WAIT, where a woken thread has to manually relock the mutex and there is a gap with no PI boost in effect.

Wait path


pi_cond_timedwait(cond, mutex, abstime):
  cond->cond++
  wake_id = cond->wake_id
again:
  futex_id = cond->cond
  pi_mutex_unlock(mutex)
  ret = syscall(SYS_futex, &cond->cond, FUTEX_WAIT_REQUEUE_PI,
                futex_id, abstime, &mutex->futex, 0)
  if ret == 0: return 0          /* kernel requeued us; we own the mutex */
  pi_mutex_lock(mutex)            /* error path: relock manually */
  if errno == EAGAIN and state == READY:
    if cond->wake_id != wake_id:  return 0  /* signal raced us; stay awake */
    cond->cond++
    goto again
  return errno

Signal path


pi_cond_signal(cond, mutex):
again:
  cond->cond++
  cond->wake_id = cond->cond
  ret = syscall(SYS_futex, &cond->cond, FUTEX_CMP_REQUEUE_PI, 1,
                /* requeue 0 */, &mutex->futex, cond->cond)
  if ret >= 0: return 0
  if errno == EAGAIN: goto again
  return errno

FUTEX_CMP_REQUEUE_PI wakes one waiter and requeues zero, so a signal wakes exactly one thread. pi_cond_broadcast has the same shape with requeue = INT_MAX: it wakes one waiter and requeues the rest directly onto the mutex’s PI chain. That replaces the thundering herd: instead of every waiter waking up and racing for the mutex, the requeued waiters stay blocked on the PI chain and the kernel hands the mutex to the highest-priority one on each unlock.
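The only difference between the two wake flavours is the requeue count passed to the kernel. A minimal sketch of that call, assuming a process-private condvar and using hypothetical helper names (sketch_requeue_count, sketch_cond_wake), could look like this; the retry-on-EAGAIN loop from the pseudocode above is omitted for brevity.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/futex.h>

/* signal: requeue nobody; broadcast: requeue every remaining waiter. */
static inline int sketch_requeue_count(int broadcast)
{
    return broadcast ? INT_MAX : 0;
}

/* One wake attempt. Returns 0 on success or an errno value. */
static inline int sketch_cond_wake(uint32_t *cond, uint32_t *mutex_futex,
                                   int broadcast)
{
    /* The kernel compares *cond against this value and fails with
     * EAGAIN if the word changed in the meantime. */
    uint32_t seq = __atomic_load_n(cond, __ATOMIC_RELAXED);
    long ret = syscall(SYS_futex, cond, FUTEX_CMP_REQUEUE_PI_PRIVATE,
                       1,                        /* nr_wake: must be 1 */
                       (unsigned long)sketch_requeue_count(broadcast),
                                                 /* nr_requeue, passed in the timeout slot */
                       mutex_futex,              /* requeue target: the PI mutex word */
                       seq);                     /* expected value of *cond */
    return ret >= 0 ? 0 : errno;
}
```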

Pre-requeue history

The first version of the librtpi shim (NSPA RT v2.0) used plain FUTEX_WAIT / FUTEX_WAKE with a sequence counter – functionally correct but not requeue-PI. The 2026-04-15 follow-on upgraded it to the requeue-PI variant. After that the wait/wake gap closed: every Wine-side condvar in the converted set retains PI across the wake.


7. The librtpi sweep tool

nspa/librtpi_sweep.py is the automated rewriter that converts Wine Unix-side pthread_mutex_* and pthread_cond_* use sites to the librtpi equivalents. It is re-run against the Wine tree whenever the conversion has to be re-applied (after a Wine version sync, after pulling upstream patches that touched converted files, and so on).

Two-phase design

  1. Pairing discovery: scan every target .c file, extract every pthread_cond_wait(C, M) / pthread_cond_timedwait(C, M, T) call, and build a multimap cond_expr -> { mutex_expr, ... } keyed on the textual expression of the cond argument.
  2. Rewrite: walk each file’s libclang AST and emit in-place rewrites for pthread_* -> pi_*. For pthread_cond_signal / pthread_cond_broadcast, look up the paired mutex in the map built by the discovery pass. (pi_cond_signal and pi_cond_broadcast take the paired mutex as an argument; their pthread counterparts do not.)

Hard-fail rules

The sweep refuses to silently emit FIXMEs or fall back to heuristics:

Recursive mutexes

Mutexes initialized via pthread_mutexattr_settype(..., PTHREAD_MUTEX_RECURSIVE) cannot be converted by the sweep itself; the sweep detects and skips them so they remain pthread_mutex_t. The NSPA_RTPI_MUTEX_RECURSIVE extension is then applied by hand at those specific sites (Section 5 lists the four current sites). This keeps the sweep a “no decisions, full automation” tool: anything that needs human judgment is left alone.


8. Compile-line discipline (include/rtpi.h forwarder)

Every Wine compile line carries -Iinclude -I../include. If a system copy of upstream librtpi is installed at /usr/include/rtpi.h, that path could shadow the NSPA header on some build configurations. That matters because upstream pi_mutex_t does not carry the nspa_recursion field: if any DLL accidentally compiled against the system header, the resulting object code would mis-lay out pi_mutex_t and silently misbehave at runtime – different parts of Wine would see different pi_mutex_t layouts and the recursive path through virtual_mutex would break.

include/rtpi.h exists exactly to prevent that. It is a forwarder:


#ifndef __WINE_INCLUDE_RTPI_H
#define __WINE_INCLUDE_RTPI_H

#include "../libs/librtpi/rtpi.h"

#endif

include/ is searched first by every Wine compile line, so the forwarder always wins before the system search path is consulted. Every Wine DLL in the tree then automatically picks up the NSPA version with no per-Makefile.in -I$(top_srcdir)/libs/librtpi hack.


9. Consumers in the Wine tree

The librtpi sweep + the manual recursive-mutex carries cover 57 files under dlls/, libs/, server/, and programs/. Selected sites:

Subsystem        File                                Notes
ntdll/unix core  dlls/ntdll/unix/virtual.c           recursive virtual_mutex (signal handlers)
ntdll/unix core  dlls/ntdll/unix/server.c            server-side wait/signal helpers
ntdll/unix core  dlls/ntdll/unix/sched.c             per-instance sched lock
ntdll/unix core  dlls/ntdll/unix/file.c              file-table mutex
ntdll/unix core  dlls/ntdll/unix/cdrom.c             cdrom device mutex
ntdll/unix core  dlls/ntdll/unix/system.c            system-info mutex
ntdll/unix nspa  dlls/ntdll/unix/nspa/rt.c           RT helpers
ntdll/unix nspa  dlls/ntdll/unix/nspa/local_file.c   local-file fast-path
ntdll/unix nspa  dlls/ntdll/unix/nspa/local_timer.c  sched-hosted local timers
win32u           dlls/win32u/sysparams.c             recursive user_mutex
win32u           dlls/win32u/gdiobj.c                recursive gdi_lock
winex11.drv      dlls/winex11.drv/init.c             recursive driver lock
audio (JACK)     dlls/winejack.drv/jack.c            JACK callback PI
audio (JACK)     dlls/winejack.drv/jackmidi.c        MIDI ring
audio (ALSA)     dlls/winealsa.drv/alsa.c            ALSA stream lock
audio (ALSA)     dlls/winealsa.drv/alsamidi.c        ALSA MIDI
audio (Core)     dlls/winecoreaudio.drv/coremidi.c   CoreAudio MIDI
gstreamer        dlls/winegstreamer/wg_parser.c      parser state
gstreamer        dlls/winegstreamer/wg_allocator.c   allocator
nsiproxy         dlls/nsiproxy.sys/icmp_echo.c       ICMP table mutex
nsiproxy         dlls/nsiproxy.sys/ndis.c            NDIS table mutex
signal core      dlls/ntdll/unix/signal_x86_64.c     signal-frame mutex

The sweep tool is re-runnable on each Wine version sync; the recursive sites are stable manual carries.


10. Landing history

The commits below trace librtpi’s introduction and evolution in the Wine-NSPA tree. Times are author timestamps from git log on the wine-rt-claude/wine submodule.

Date Subject
2026-04-11 vendor librtpi + automated sweep rule
2026-04-11 pivot to Wine-internal header-only rtpi.h
2026-04-11 vendor PI-futex mutex library
2026-04-11 automated pthread_* -> pi_* rewriter
2026-04-11 sweep taint-propagation cleanup + ntdll/unix conversion
2026-04-11 thread.c first-pass conversion
2026-04-11 win32u sweep + rtpi.h forwarder
2026-04-11 broad DLL sweep + cond pairing / header fixups
2026-04-15 pi_cond upgrade to FUTEX_WAIT_REQUEUE_PI / FUTEX_CMP_REQUEUE_PI
2026-04-15 add pi-cond requeue-PI benchmark
2026-04-30 per-instance sched refactor + pi_mutex multi-class prep

Notable later-than-bring-up clarification:


11. References

Wine-NSPA source

Upstream

Cross-references