
Add owned-buffer TX API and precise read-position snapshots for timestamped external sources#39

Open
salanki wants to merge 3 commits into teodly:dev from salanki:spin2dante-owned-buffer


@salanki
Contributor

@salanki salanki commented Apr 6, 2026

This PR adds a small set of APIs to make Inferno easier to use as a transmit backend for externally scheduled, timestamped audio sources that are not driven by ALSA. They all came out of developing my Sendspin-to-Dante bridge. I tried to modify Inferno as little as possible; the alternatives I tried before settling on some of these APIs are described further below. Please note that these implementations are heavily AI-assisted.

The main additions are transmit_from_owned_buffer() for writer-controlled owned TX buffers and ReadPositionSnapshot, which publishes a consistent (read_position, monotonic_time) pair from the TX thread at the exact point TX advances its read cursor. The motivation came from integrating Inferno into a protocol bridge, but the underlying need is more general: external TX clients need both a safe owned-buffer write path and a precise observation primitive for “what TX is consuming now,” and trying to reconstruct that entirely outside Inferno turned out to be much less reliable.

Summary (assisted by Codex)

This PR adds a small set of APIs that make Inferno easier to use as a transmit backend for external, timestamped audio sources that are not driven by ALSA.

It introduces:

  1. DeviceServer::transmit_from_owned_buffer()
  2. ReadPositionSnapshot for consistent (read_position, monotonic_time) observation from the TX thread
  3. an explicit ref_instant contract so snapshot timestamps can be reconstructed correctly by external writers

These changes were developed while integrating Inferno into a protocol bridge that receives audio chunks with presentation timestamps and needs to place them accurately into Inferno’s TX ring buffers. The APIs are general enough to be useful for other non-ALSA / externally scheduled TX clients as well.

Motivation

Inferno already exposes strong low-level TX primitives, but an external timestamped source needs two things that were awkward before this PR:

1. Owned TX buffers with writer-side control

For a timestamped source, the application needs to:

  • hold audio in its own scheduling queue
  • compute target write positions
  • write directly into TX ring buffers
  • rely on writer-visible readable_pos / hole-fix behavior

That is a better fit for Inferno-owned ring buffers than for externally wrapped buffers.

2. A precise observation point for “what the transmitter is consuming now”

The application also needs a trustworthy answer to:

“At what local monotonic time did the TX thread advance to this exact read position?”

A plain Arc<AtomicUsize> for read_position was not enough by itself, because an external client has to pair it with its own Instant::now(). That leaves a timing gap between:

  • when the TX thread updated read_position
  • and when the client sampled local time

In our case, that observer gap was large enough to make accurate cross-device alignment difficult. The fix was to move the observation into the TX thread itself and publish a consistent pair.

What this PR adds

transmit_from_owned_buffer()

This is a convenience API for starting TX from newly created owned ring buffers and returning the corresponding RBInput write handles to the caller.

This is useful for applications that want Inferno to remain the owner of the TX buffer implementation, but need direct control over when and where samples are written.

Compared with external-buffer TX, this keeps the existing owned-buffer semantics:

  • readable_pos tracking
  • hole detection/fill support
  • reads gated by what the writer has actually made readable
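
As a rough illustration of the last point, here is a minimal sketch of a reader that only consumes what the writer has marked readable. `GateSketch`, `mark_readable` and `readable_len` are hypothetical names made up for this example and are not Inferno's types; the sketch also assumes free-running positions and ignores ring wraparound and hole handling.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical stand-in for the writer-visible "readable_pos" gate.
/// Positions are assumed to be free-running sample counters (no wraparound).
pub struct GateSketch {
    readable_pos: AtomicUsize,
}

impl GateSketch {
    pub fn new() -> Self {
        Self { readable_pos: AtomicUsize::new(0) }
    }

    /// Writer side: declare that everything before `end_pos` is valid.
    /// `fetch_max` keeps the gate monotonic even if writes arrive out of order.
    pub fn mark_readable(&self, end_pos: usize) {
        self.readable_pos.fetch_max(end_pos, Ordering::Release);
    }

    /// Reader side: how many of the `len` samples starting at `start`
    /// may be consumed right now.
    pub fn readable_len(&self, start: usize, len: usize) -> usize {
        let readable = self.readable_pos.load(Ordering::Acquire);
        readable.saturating_sub(start).min(len)
    }
}
```

With an external buffer there is no such gate, so the transmitter reads unconditionally; with an owned buffer the reader can tell written data apart from stale data.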

ReadPositionSnapshot

This publishes a consistent (read_position, monotonic_time) snapshot from the TX thread at the exact point where TX updates read_position.

The implementation uses a simple single-writer seqlock pattern:

  • odd seq while the writer is updating
  • even seq when the snapshot is stable
  • readers retry if the sequence changes mid-read

This avoids the race inherent in:

  • reading read_position
  • then calling Instant::now() elsewhere
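
The single-writer seqlock protocol described above can be sketched roughly as follows. This is an illustration, not the code in this PR: `SnapshotSketch`, `publish` and `read` are hypothetical names, and the sketch uses `SeqCst` everywhere for simplicity, where a real implementation might choose weaker orderings.

```rust
use std::sync::atomic::{AtomicU64, AtomicUsize, Ordering};

/// Hypothetical stand-in for a seqlock-protected (position, time) pair.
pub struct SnapshotSketch {
    seq: AtomicUsize,
    read_position: AtomicUsize,
    monotonic_nanos: AtomicU64,
}

impl SnapshotSketch {
    pub fn new() -> Self {
        Self {
            seq: AtomicUsize::new(0),
            read_position: AtomicUsize::new(0),
            monotonic_nanos: AtomicU64::new(0),
        }
    }

    /// Writer side. Must only ever be called from one thread (the TX thread).
    pub fn publish(&self, read_position: usize, monotonic_nanos: u64) {
        let seq = self.seq.load(Ordering::SeqCst);
        // Odd sequence: an update is in progress.
        self.seq.store(seq.wrapping_add(1), Ordering::SeqCst);
        self.read_position.store(read_position, Ordering::SeqCst);
        self.monotonic_nanos.store(monotonic_nanos, Ordering::SeqCst);
        // Even sequence: the pair is stable again.
        self.seq.store(seq.wrapping_add(2), Ordering::SeqCst);
    }

    /// Reader side. Retries until it sees the same even sequence number
    /// before and after reading both fields.
    pub fn read(&self) -> (usize, u64) {
        loop {
            let before = self.seq.load(Ordering::SeqCst);
            if before % 2 == 1 {
                std::hint::spin_loop();
                continue;
            }
            let pos = self.read_position.load(Ordering::SeqCst);
            let nanos = self.monotonic_nanos.load(Ordering::SeqCst);
            if self.seq.load(Ordering::SeqCst) == before {
                return (pos, nanos);
            }
        }
    }
}
```

A reader never blocks the writer; the cost of contention falls on the reader, which simply retries.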

explicit ref_instant

The first version of the snapshot API exposed elapsed nanoseconds only. That turned out to leave an implicit contract between producer and consumer about the time origin.

This PR makes that contract explicit by storing a ref_instant in the snapshot itself and reconstructing the snapshot Instant from:

  • ref_instant
  • monotonic_nanos

That makes the API self-contained and much less error-prone for external consumers.
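
A minimal sketch of that reconstruction, using only `std::time`. The helper names `snapshot_instant` and `snapshot_age` are hypothetical and are not part of the API added here.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper: rebuild the Instant at which a snapshot was taken
/// from the shared reference Instant and the published nanosecond offset.
pub fn snapshot_instant(ref_instant: Instant, monotonic_nanos: u64) -> Instant {
    ref_instant + Duration::from_nanos(monotonic_nanos)
}

/// Hypothetical helper: how old the snapshot is relative to `now`.
/// Saturates to zero if `now` is earlier than the snapshot.
pub fn snapshot_age(ref_instant: Instant, monotonic_nanos: u64, now: Instant) -> Duration {
    now.saturating_duration_since(snapshot_instant(ref_instant, monotonic_nanos))
}
```

Because both sides derive the time from the same `ref_instant`, the consumer no longer has to guess the producer's time origin.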

Why this belongs in Inferno

I explored solving this entirely at the application layer first.

The application tried several approaches:

  • one-shot timestamp anchors
  • live retargeting from local time snapshots
  • bounded correction / servo logic
  • larger and smaller local buffers

Those approaches could improve parts of the behavior, but they all had to work around the same missing primitive: the application could not observe TX position and local monotonic time as one coherent event.

Once that observation moved into Inferno’s TX thread, the external scheduling logic became much simpler and much more stable.

So the main reason for this PR is not “support one specific project,” but:

  • Inferno already knows the exact moment it advances TX read position
  • that is the right place to publish this information
  • external timestamped TX clients should not have to reconstruct it indirectly

Backward compatibility

This PR is additive:

  • existing TX paths continue to work
  • existing callers of transmit_from_external_buffer() are unchanged
  • the new snapshot is optional
  • the existing read_position: Arc<AtomicUsize> path remains available

Files changed

  • inferno_aoip/src/device_server/mod.rs
  • inferno_aoip/src/device_server/flows_tx.rs

Commit structure

This PR intentionally keeps the changes as three small related commits:

  1. Add transmit_from_owned_buffer() for non-ALSA TX clients
  2. Add ReadPositionSnapshot for precise TX timing observation
  3. Expose ref_instant in ReadPositionSnapshot for correct time-base contract

Notes

If you prefer, I’m also happy to:

  • rename ReadPositionSnapshot
  • narrow the public API surface further
  • or split the owned-buffer API and snapshot API into separate PRs

My view is that they fit well together because they serve the same class of external TX client.

salanki added 3 commits April 6, 2026 09:54
Adds a public method to DeviceServer that creates owned ring buffers
and returns RBInput write handles to the caller. Unlike the
ExternalBuffer path, owned buffers track readable_pos properly, so:

- PositionReportDestination is updated on each write
- Buffer occupancy metrics are accurate
- unconditional_read() is false — inferno only reads validated data

Also re-exports OwnedBuffer, RBInput, RBOutput, new_owned_ring_buffer
from the device_server module for external consumers.
Adds a seqlock-style shared snapshot that pairs (read_position, monotonic_nanos)
at the exact point FlowsTransmitter updates read_position. This gives external
buffer writers (like spin2dante) a consistent observation of when and where
the TX thread is reading, without the imprecision of sampling read_pos and
Instant::now() separately.

- ReadPositionSnapshot struct with seq/read_position/monotonic_nanos atomics
- Threaded through FlowsTransmitter::start(), run(), transmit_from_owned_buffer()
- Written at the TX update site with odd/even seqlock protocol
- Backward compatible: existing callers pass None for the snapshot
…ract

The TX thread's monotonic_nanos are relative to a reference Instant that
only it knows. Previously the bridge had to guess. Now the snapshot
exposes ref_instant via a Mutex<Option<Instant>>, set once at TX start.
Readers reconstruct the snapshot instant as ref_instant + monotonic_nanos.
@salanki salanki force-pushed the spin2dante-owned-buffer branch from fddbdb3 to eb78bec Compare April 7, 2026 20:20
@salanki salanki marked this pull request as draft April 8, 2026 03:38
@salanki salanki force-pushed the spin2dante-owned-buffer branch from f9f181f to 5b1c9d1 Compare April 8, 2026 04:09
@salanki salanki marked this pull request as ready for review April 8, 2026 04:14
@teodly
Owner

teodly commented Apr 11, 2026

  • transmit_from_owned_buffer - looks useful. I didn't implement it in the first place because the ALSA plugin with an external buffer was the only way of transmitting.
  • "ReadPositionSnapshot This publishes a consistent (read_position, monotonic_time) snapshot from the TX thread at the exact point where TX updates read_position."
    • If this is really needed for time-aligned playback, it indicates an architectural problem. The only timestamp that should be needed is from the MediaClock. The way Dante works is that receivers have a buffer and introduce a configured latency to compensate for the transmitter's and the network's jitter. So the position being played by the receiver is guaranteed to be media_clock.now_ns() - max(remote_device.rx_latency, local_device.TX_LATENCY_NS). So the actual TX position should not be needed for synchronized audio transfers.
    • If the relationship between media_clock and the OS monotonic clock needs to be measured, a better way would be to ensure that Inferno uses usrvclock, not the PTP clock directly, and to grab the virtual clock parameters from the usrvclock structure.
      • It would be even better to allow PTP hardware clocks and use the PTP_SYS_OFFSET API for difference/drift measurement. But it is the job of the usrvclock-rs library to expose a unified API regardless of whether the clock in use is physical or virtual - I can do it myself if needed.
  • Please do not use this project's git history as advertising space for Claude. The way git frontends (e.g. GitHub) render commit authors makes the logo too exposed (it would be visible even on the front page of this repo). Put the information about AI assistance in the commit description, but not in a form that is parsed as git authorship metadata.
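
The latency relation from the second point above can be written out as a small worked example. The function names and the numbers used with them are made up for illustration; only the formula itself comes from the comment.

```rust
/// Media-clock time (ns) of the audio a receiver is playing right now:
/// now minus the larger of the receiver's and the transmitter's latency.
/// Saturates at zero instead of underflowing.
pub fn playout_time_ns(media_now_ns: u64, rx_latency_ns: u64, tx_latency_ns: u64) -> u64 {
    media_now_ns.saturating_sub(rx_latency_ns.max(tx_latency_ns))
}

/// Convert a media-clock time in nanoseconds to a sample count
/// at the given sample rate (integer division, rounds down).
pub fn ns_to_samples(time_ns: u64, sample_rate_hz: u64) -> u64 {
    (time_ns as u128 * sample_rate_hz as u128 / 1_000_000_000) as u64
}
```

For example, with the media clock at 1 s, a receiver latency of 2 ms and a transmitter latency of 1 ms, the receiver is playing the audio stamped 998 ms, which is sample 47,904 at 48 kHz.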

@salanki salanki force-pushed the spin2dante-owned-buffer branch from 5b1c9d1 to 5ad4ffd Compare April 13, 2026 00:48
@salanki
Contributor Author

salanki commented Apr 13, 2026

Thank you for the review!

I agree that the cleaner long-term abstraction is not “expose a TX-thread snapshot”, but rather:

  • use MediaClock for playout timing
  • expose enough usrvclock / host-clock correlation information that an external client can derive the needed mapping without observing the TX path directly

For this bridge, the immediate requirement is a coherent cross-domain anchor between:

  • Sendspin server time
  • the local monotonic clock used by ClockSync
  • the local Inferno TX playout position used for ring writes

MediaClock.now_ns() gives continuous media/PTP time, but it does not by itself give a coherent pair with local monotonic time, and it does not expose the actual packet-quantized TX cursor (start_ts = flow.next_ts + timestamp_shift) that Inferno is reading from.

So with the APIs available today, replacing ReadPositionSnapshot with current usrvclock usage would lose two things:

  • an exact (read_position, local_monotonic_time) observation from the TX thread
  • direct observation of the actual TX read cursor, rather than an inferred continuous media-clock position

Why that matters for my bridge:

The bridge is not just trying to answer “what should be audible on the Dante network right now?” It is trying to decide “what ring position should I write this Sendspin chunk to, right now, so that a chunk with server timestamp T is consumed at the correct time by this specific transmitter?” I want to keep the different Dante receivers synchronized with each other as well as with other Sendspin protocol receivers, so I am synchronizing time across two time domains.

For that anchor, I need a coherent pair:

  • the local TX read position
  • and the corresponding local monotonic/server time at the same observation instant

If those are sampled separately, any gap between them becomes anchor error. That was the original problem I was trying to solve.

Also, the bridge needs the position of the actual cursor Inferno is reading from, not just an ideal continuous media-clock position, because writes are scheduled into a concrete ring buffer that is consumed in packet-sized steps. In the current TX path, the read position is effectively:

  • start_ts = flow.next_ts + timestamp_shift

and flow.next_ts is packet-quantized / re-bootstrapped inside Inferno. So for my use case, a continuous media-clock estimate is not quite the same thing as the actual per-transmitter read cursor I need to align writes against.

That is why ReadPositionSnapshot was useful here: it gave me an exact observation of the real TX cursor together with its corresponding local monotonic time, without having to reconstruct that relationship outside the TX thread.
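
To make the anchor use concrete, here is a rough sketch of how a writer could turn one coherent (read_position, time) observation into a ring write position for a chunk. The function names are hypothetical, the times are nanoseconds on a single shared monotonic timeline, and positions are assumed to be free-running sample counters; this is not the bridge's actual code.

```rust
/// Hypothetical helper: given that TX was at `snapshot_pos` at time
/// `snapshot_nanos`, return the position TX will reach at `target_nanos`.
/// Returns None if the target time is already in the past.
pub fn target_write_position(
    snapshot_pos: usize,
    snapshot_nanos: u64,
    target_nanos: u64,
    sample_rate_hz: u64,
) -> Option<usize> {
    let delta_ns = target_nanos.checked_sub(snapshot_nanos)?;
    // Widen to u128 so the multiplication cannot overflow.
    let delta_samples = (delta_ns as u128 * sample_rate_hz as u128 / 1_000_000_000) as usize;
    Some(snapshot_pos.wrapping_add(delta_samples))
}

/// Map a free-running position onto an index in a ring of `ring_len` samples.
pub fn ring_index(position: usize, ring_len: usize) -> usize {
    position % ring_len
}
```

Any error in pairing `snapshot_pos` with `snapshot_nanos` shifts every computed write position by the same amount, which is why the two values need to come from one observation.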

I agree that a better usrvclock API would be the preferable long-term solution here, including:

  • exposing virtual-clock correlation cleanly
  • eventually supporting a unified path for physical PTP clocks as well

It may indeed make more sense to solve this in usrvclock-rs first, and then simplify Inferno around that.

Given the current API surface, I see two options:

  1. Slightly tweak my PR, keeping the change Inferno-only:
  • keep the snapshot functionality for now
  • narrow its public surface so it is Inferno-owned and opaque
  • create it internally in transmit_from_owned_buffer()
  • return it as a read-only handle to callers
  • keep only the public read method

That would preserve the functionality needed by external timestamped writers today, while reducing the public API commitment and leaving room to replace it with a better usrvclock-based correlation API later.

  2. The other option would be for me to attempt the usrvclock changes needed to expose the APIs there, plus any related Inferno changes.

Let me know what is good for you.

Also noted on the metadata. I kept it in to show it was not hand coded. I have removed it.

@teodly
Owner

teodly commented Apr 13, 2026

"So with the APIs available today, replacing ReadPositionSnapshot with current usrvclock usage would lose two things:

an exact (read_position, local_monotonic_time) observation from the TX thread"

This pair could be read from clock overlay structure or PTP_SYS_OFFSET.

"direct observation of the actual TX read cursor, rather than an inferred continuous media-clock position"

I've already implemented it as a workaround for crackling in some ALSA apps, but currently it's disabled by default (USE_FLOWS_CLOCK) - not needed for PipeWire & JACK. The next timestamp that transmitter will start sending is written here:

self
    .current_timestamp
    .store(cur_ts_opt.unwrap_or(usize::MAX), Ordering::SeqCst /*TODO: really needed?*/);

How low a latency do you need? Observing the TX ring buffer's read pointer would make sense if you need to write to this buffer just before it is transmitted, i.e. with a latency comparable to the Dante packet size. But for a music player app it looks like overkill.

I agree that latency consistency is important, but given that Dante guarantees predictable latency (measured from PTP clock, not audio flows), media_clock-monotonic_clock difference/drift should be sufficient for this.

"The other option would be for me to attempt the usrvclock changes needed to expose the APIs there + any related inferno changes."

I think it's the way to go. Clock measurements are not transmitter-specific and belong to usrvclock and media_clock.rs.

@salanki
Contributor Author

salanki commented Apr 13, 2026

<0.5 ms, to keep everything tightly in sync. With the current approach I measure a worst-case sync difference of 0.33 ms in my tests; p50 is much lower (0.02 ms).

@teodly
Owner

teodly commented Apr 18, 2026

"<0.5ms to keep everything nicely tightly in sync"

But you can fill the buffer earlier than 0.5 ms before the time it will be sent, right?


2 participants