Wine-NSPA – Thread and Process Shared-State Bypass

This page documents the shared-state bypass for read-mostly thread and process queries, plus the zero-time process and thread wait fast paths built on the same published snapshots.

Table of Contents

  1. Overview
  2. Coverage
  3. Architecture
  4. Thread query coverage
  5. Process query coverage and zero-time waits
  6. Correctness boundaries
  7. Related docs

1. Overview

Wine already had an upstream shared-object publication mechanism for queue, window, class, input, and desktop state. Wine-NSPA extends that same seqlock-published shape to thread and process state, so a set of NtQueryInformationThread() and NtQueryInformationProcess() classes can be answered from shared memory instead of a wineserver RPC.

The same published state also powers zero-time WaitForSingleObject() polls for process and thread handles. For those single-handle, non-alertable, timeout-0 waits, ntdll can answer from the shared snapshot instead of paying an ntsync wait ioctl.


2. Coverage

  - Thread shared-state publication: wineserver publishes a per-thread shared object with seqlock update discipline and a per-handle locator RPC (`get_thread_shm`) for the first resolve.
  - Process shared-state publication: wineserver publishes a per-process shared object with the same seqlock shape and a matching first-resolve RPC (`get_process_shm`).
  - Thread query bypass: 7 NtQueryInformationThread() classes are served shmem-first with RPC fallback.
  - Process query bypass: 6 NtQueryInformationProcess() classes are served shmem-first with RPC fallback.
  - Zero-time thread wait: WaitForSingleObject(thread, 0) can answer from thread_shm and skip the ntsync ioctl on a hit.
  - Zero-time process wait: WaitForSingleObject(process, 0) can answer from process_shm and skip the ntsync ioctl on a hit.
  - Cache discipline: first use resolves the locator once; later reads are local; stale-slot detection and negative-cache entries force a safe fallback instead of silent drift.
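The cache discipline row can be sketched as a small per-handle table. This is a minimal illustration under assumed names (`lookup_shared_object`, `resolve_locator_rpc`, the `NEGATIVE` sentinel) with a toy resolve stub; Wine-NSPA's real cache and locator RPCs differ in layout.

```c
#include <stddef.h>

/* Illustrative per-handle locator cache; names and layout are stand-ins,
 * not Wine-NSPA's actual structures. A NEGATIVE entry pins a handle to
 * the RPC fallback instead of retrying the resolve on every query. */
#define CACHE_SIZE 64
#define NEGATIVE   ((void *)(size_t)-1)

struct cache_entry { unsigned int handle; void *shm; };
static struct cache_entry locator_cache[CACHE_SIZE];

/* Stand-in for the first-resolve RPC (get_thread_shm / get_process_shm):
 * here it fails for odd handles, purely so the demo has a miss case. */
static int fake_objects[CACHE_SIZE];
static void *resolve_locator_rpc( unsigned int handle )
{
    return (handle & 1) ? NULL : &fake_objects[handle % CACHE_SIZE];
}

/* First use pays one RPC; later lookups are local. NULL tells the
 * caller to use the original RPC query path. (An empty slot looks like
 * handle 0; real handles are nonzero, so the demo ignores that edge.) */
static void *lookup_shared_object( unsigned int handle )
{
    struct cache_entry *e = &locator_cache[handle % CACHE_SIZE];
    if (e->handle != handle)
    {
        void *shm = resolve_locator_rpc( handle );
        e->handle = handle;
        e->shm = shm ? shm : NEGATIVE;   /* negative-cache the miss */
    }
    return e->shm == NEGATIVE ? NULL : e->shm;
}
```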

The per-class query coverage is detailed in sections 4 and 5. One deliberate exception applies throughout: ThreadBasicInformation is intentionally left on the server path. The existing reply applies server-side transforms that are not mirrored in the published snapshot, so the design keeps that one class authoritative instead of adding a special-case partial mirror.


3. Architecture

The bypass has two layers:

  1. wineserver publishes thread and process snapshots inside the existing shared object union, using the normal seqlock write protocol
  2. ntdll resolves a handle to its published object once, caches the locator, then serves later queries from a single seqlock snapshot read
Shared-state query bypass: one resolve RPC, then local seqlock reads.

  - Win32 query call site: NtQueryInformationThread / NtQueryInformationProcess; the same handle may be queried repeatedly.
  - First use only: resolve the shared object via `get_thread_shm` / `get_process_shm`; the returned locator id is cached with the object pointer.
  - Steady state: a single seqlock snapshot read; thread/process fields are copied locally and the class-specific reply is built without wineserver.
  - Server-published shared object: wineserver updates fields under `SHARED_WRITE_BEGIN` / `SHARED_WRITE_END`; the client retries until the seqlock cycle is stable, and the object id is rechecked after the read to catch slot recycling.
  - Safe miss path: no query access, a stale slot, or a map failure produces a negative-cache entry or stale-id eviction, and the caller falls back to the original RPC.
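The steady-state read follows the classic seqlock pattern. Below is a minimal C11 sketch with an assumed snapshot layout; the `seq`, `object_id`, and data fields are illustrative, not the exact Wine-NSPA struct.

```c
#include <stdatomic.h>
#include <string.h>

/* Illustrative snapshot layout; field names are stand-ins. The seq word
 * is odd while the server is writing. */
struct thread_shm
{
    atomic_uint        seq;
    unsigned int       object_id;    /* rechecked to catch slot recycling */
    unsigned int       suspend_count;
    unsigned long long affinity;
};

/* One seqlock snapshot read: retry until a stable, even cycle is seen,
 * then confirm the slot still belongs to the object resolved earlier.
 * Returns 0 on a stale slot, telling the caller to fall back to RPC. */
static int read_thread_snapshot( struct thread_shm *shm,
                                 unsigned int expected_id,
                                 struct thread_shm *out )
{
    for (;;)
    {
        unsigned int seq = atomic_load_explicit( &shm->seq, memory_order_acquire );
        if (seq & 1) continue;                     /* writer in progress */
        memcpy( out, shm, sizeof(*out) );
        atomic_thread_fence( memory_order_acquire );
        if (atomic_load_explicit( &shm->seq, memory_order_relaxed ) == seq)
            break;                                 /* stable cycle */
    }
    return out->object_id == expected_id;
}
```

The trailing id check is what turns slot recycling into a safe miss rather than silently serving another object's state.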

The public point of the design is simple: a query is answered from shared memory only when the published snapshot can reproduce the server reply exactly, and anything transformed, variable-length, or uncertain falls back to the authoritative RPC path.

That is why this feature can land safely without changing Win32-visible semantics.


4. Thread query coverage

The thread snapshot carries the fields needed by the current read-mostly thread classes:

Thread query coverage: snapshot classes vs. retained RPC classes.

  Served from `thread_shm`:
  - `ThreadAffinityMask`
  - `ThreadQuerySetWin32StartAddress`
  - `ThreadGroupInformation`
  - `ThreadIsTerminated`
  - `ThreadSuspendCount`
  - `ThreadHideFromDebugger`
  - `ThreadPriorityBoost`

  Retained on RPC:
  - `ThreadBasicInformation`: the server reply still applies effective-priority and exit-status transforms.
  - `ThreadAmILastThread`: depends on a process-scoped last-thread computation.
  - `ThreadNameInformation`: the variable-length payload stays on the original reply path.

This boundary is deliberate. The point is not to force every thread query onto shared memory. The point is to retire the cheap, high-frequency, fixed-shape queries and leave the odd or transformed replies on the authoritative path.
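The boundary above amounts to a plain class dispatch. A sketch follows, with the THREADINFOCLASS subset redeclared locally so the example is self-contained; the enum's numeric values are illustrative, not ntdll's.

```c
/* Subset of THREADINFOCLASS, redeclared for a standalone sketch;
 * the numeric values here are illustrative, not ntdll's. */
enum thread_info_class
{
    ThreadBasicInformation,
    ThreadAffinityMask,
    ThreadQuerySetWin32StartAddress,
    ThreadGroupInformation,
    ThreadIsTerminated,
    ThreadSuspendCount,
    ThreadHideFromDebugger,
    ThreadPriorityBoost,
    ThreadAmILastThread,
    ThreadNameInformation,
};

/* Shmem-first dispatch: fixed-shape, read-mostly classes come from the
 * snapshot; transformed or variable-length replies stay on the server. */
static int served_from_shm( enum thread_info_class info_class )
{
    switch (info_class)
    {
    case ThreadAffinityMask:
    case ThreadQuerySetWin32StartAddress:
    case ThreadGroupInformation:
    case ThreadIsTerminated:
    case ThreadSuspendCount:
    case ThreadHideFromDebugger:
    case ThreadPriorityBoost:
        return 1;    /* shmem-first, RPC only on a safe miss */
    default:
        return 0;    /* authoritative server path */
    }
}
```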


5. Process query coverage and zero-time waits

The process snapshot carries enough state to answer the six current NtQueryInformationProcess() classes and to answer one additional hot liveness question: “has this process already exited?”

That second use matters because Wine's in-process sync path already resolves a process handle to an ntsync-backed wait object. For WaitForSingleObject(proc, 0), ntdll can short-circuit before the wait ioctl: if the published snapshot already records the process exit, the poll returns STATUS_WAIT_0, and if the process is still alive it returns STATUS_TIMEOUT.

This is both faster and slightly more correct for Wine’s own layering, because it removes the small gap between the process info snapshot and the separate wait path.

Zero-time process wait: shared-state answer before the wait ioctl.

  - Entry condition: `WaitForSingleObject(process, 0)`, i.e. single handle, zero timeout, non-alertable only; ordinary waits still use the normal ntsync path.
  - Shmem fast path: read `process_shm.exit_code`; alive -> `STATUS_TIMEOUT`, dead -> `STATUS_WAIT_0`.
  - Fallback: resolve the fd and issue the wait ioctl; used on cache miss, access miss, multi-handle, alertable wait, or non-zero timeout.
  - Measured synthetic poll cost: ntsync ioctl path ~10000 ns/poll (`9916`, `10030`, `10141`); shmem fast path ~144 ns/poll (`130`, `130`, `171`), about `70x` faster per poll.
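A minimal sketch of that short-circuit follows, under two stated assumptions: the guard conditions shown, and a published `exit_code` that reads as the Win32 STILL_ACTIVE value (0x103) while the process runs. The second assumption is this sketch's, not a quote of the actual field encoding.

```c
#define STATUS_WAIT_0  0x00000000
#define STATUS_TIMEOUT 0x00000102
#define STILL_ACTIVE   0x00000103   /* Win32 "still running" exit code */

struct process_shm { unsigned int exit_code; };

/* The fast path only applies to a single-handle, non-alertable,
 * timeout-0 wait; everything else keeps the normal ntsync path. */
static int fast_poll_applies( int handle_count, long long timeout_100ns,
                              int alertable )
{
    return handle_count == 1 && timeout_100ns == 0 && !alertable;
}

/* Answer the poll from the snapshot: no wait ioctl on a hit.
 * Assumes exit_code == STILL_ACTIVE means the process is alive. */
static unsigned int poll_process_shm( const struct process_shm *shm )
{
    return shm->exit_code == STILL_ACTIVE ? STATUS_TIMEOUT : STATUS_WAIT_0;
}
```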

The public process-query coverage is:

The fixed-shape, read-mostly part of that surface is local. Process image name queries, debug-object queries, variable-length payloads, and other server authority cases still use the original RPC path.

5.1 Zero-time thread wait

Thread handles get the same zero-time short-circuit shape, but the predicate is different. A thread exit code starts life at 0, which is a valid user exit code, so the thread fast path cannot use exit_code != 0 as a liveness test. It instead reads THREAD_SHM_FLAG_TERMINATED from the published thread snapshot: the flag set means the wait is signaled, the flag clear means the zero-timeout poll reports STATUS_TIMEOUT.

That keeps the thread wait path honest while still removing the wait ioctl from the common zero-time poll case.

Zero-time thread wait: use the published termination flag before the wait ioctl.

  - Entry condition: `WaitForSingleObject(thread, 0)`, i.e. single handle, timeout 0, non-alertable only; ordinary waits still use the normal ntsync path.
  - Shmem fast path: read `THREAD_SHM_FLAG_TERMINATED`; clear -> `STATUS_TIMEOUT`, set -> `STATUS_WAIT_0`.
  - Fallback: resolve the fd and issue the wait ioctl; used on cache miss, access miss, multi-handle, alertable wait, or non-zero timeout.
  - Measured synthetic poll cost: ntsync ioctl path ~11940 ns/poll; shmem fast path ~164 ns/poll, about `73x` faster per poll.
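The flag-based predicate can be sketched as follows. The flag's name comes from the text; its numeric value here, and the struct layout, are assumptions of this sketch.

```c
/* The flag name comes from the published snapshot; the value 0x1 is an
 * assumption for illustration only. */
#define THREAD_SHM_FLAG_TERMINATED 0x00000001

#define STATUS_WAIT_0  0x00000000
#define STATUS_TIMEOUT 0x00000102

struct thread_shm { unsigned int flags; unsigned int exit_code; };

/* exit_code begins at 0, which is a legal user exit code, so the
 * liveness test must use the termination flag, never exit_code != 0. */
static unsigned int poll_thread_shm( const struct thread_shm *shm )
{
    return (shm->flags & THREAD_SHM_FLAG_TERMINATED) ? STATUS_WAIT_0
                                                     : STATUS_TIMEOUT;
}
```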

6. Correctness boundaries

Three parts make this safe enough to ship as the default behavior:

  1. every snapshot is taken with the existing seqlock read protocol, retried until a stable cycle is observed, so a torn update is never consumed
  2. the object id is rechecked after every read, so a recycled shared-object slot is detected instead of silently serving another object's state
  3. anything uncertain, such as missing query access, a stale slot, or a map failure, is negative-cached or evicted, and the caller falls back to the original RPC

That is the important discipline for this feature family. It is not trying to be clever about every thread or process query. It is publishing the read-mostly state that Wine can mirror honestly, reading it with the existing seqlock pattern, and refusing the rest.
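For completeness, the write side of that discipline, the `SHARED_WRITE_BEGIN` / `SHARED_WRITE_END` protocol mentioned in section 3, can be sketched in C11. The struct and the macro bodies are illustrative, not Wine's definitions; a production seqlock writer also needs care about which fences the target architecture requires.

```c
#include <stdatomic.h>

/* Sketch of the writer-side seqlock discipline: the sequence word is
 * bumped to an odd value before any field is touched and back to even
 * afterwards, so a concurrent reader retries instead of consuming a
 * torn update. seq_cst RMWs keep the field writes inside the window. */
struct shared_obj
{
    atomic_uint  seq;
    unsigned int exit_code;
    unsigned int flags;
};

#define SHARED_WRITE_BEGIN( obj ) \
    atomic_fetch_add_explicit( &(obj)->seq, 1, memory_order_seq_cst ) /* odd: write open */
#define SHARED_WRITE_END( obj ) \
    atomic_fetch_add_explicit( &(obj)->seq, 1, memory_order_seq_cst ) /* even: closed */

static void publish_exit( struct shared_obj *obj, unsigned int code )
{
    SHARED_WRITE_BEGIN( obj );
    obj->exit_code = code;
    obj->flags |= 1;           /* e.g. a terminated flag */
    SHARED_WRITE_END( obj );
}
```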