Skip to content

@moq/watch: fix playback stalls and frame-rate beating#1373

Open
skirsten wants to merge 3 commits intomoq-dev:mainfrom
skirsten:perf/playback
Open

@moq/watch: fix playback stalls and frame-rate beating#1373
skirsten wants to merge 3 commits intomoq-dev:mainfrom
skirsten:perf/playback

Conversation

@skirsten
Copy link
Copy Markdown
Contributor

@skirsten skirsten commented May 3, 2026

Detailed description of both fixes is in the commits.

closes #1367

Promise.race against a never-settling Promise (Effect#closed,
Effect#cancel) attaches a .then reaction to that long-lived promise on
every call. Because the promise never settles, those reactions never run
and never get GC'd. Each leaked reaction's closure retains the awaiter
state, which keeps the resolved {frame, group} alive.

Over a multi-minute playback session this accumulates one retained Frame
per consumed video frame (~400MB after 9 minutes at 60fps) and grows the
PromiseReaction chain to tens of thousands of nodes, which inflates major
GC pauses to 100–200ms. That's the cause of the periodic ~700ms stalls.

Three changes, no API impact:

* @moq/hang Consumer.next() waits via an AbortSignal listener registered
  with { once: true } and explicitly removed when notify wins, instead of
  Promise.race([wait, signals.closed]).

* @moq/watch video legacy decoder loop drops the outer
  Promise.race([consumer.next(), effect.cancel]). When the effect closes,
  effect.cleanup runs consumer.close(), which makes next() return
  undefined naturally and the loop exits.

* @moq/watch video and audio CMAF paths track in-flight groups in a Set.
  effect.cleanup closes them all, so recvGroup/readFrame terminate
  without racing against effect.cancel.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 3, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d3af5c58-4a5b-43df-9593-f8f3e99b6ad6

📥 Commits

Reviewing files that changed from the base of the PR and between 6e1a72f and 00eeac9.

📒 Files selected for processing (1)
  • js/watch/src/backend.ts

Walkthrough

Consumer wait/notification in the legacy container was rewritten to use AbortSignal listeners instead of a Promise.race. Sync removed its private coordination promise and deleted the wait(timestamp) method. Video and audio decoders now queue decoded frames and track open CMAF groups for explicit closure on cleanup; in-flight reads no longer race effect cancellation. Decoder removes the frame signal, adds consume() to select the newest ready VideoFrame via source.sync.now(), and DecoderTrack manages an internal queue. The renderer switched to a requestAnimationFrame loop that calls decoder.consume() and takes ownership of returned frames. MultiBackend.close() now closes video/audio sources and sync.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly describes the main changes: fixing playback stalls and frame-rate beating in the video renderer through a pull-based rendering approach.
Description check ✅ Passed The description references issue #1367 and commits for detailed explanations, which is appropriate for a multi-faceted fix PR.
Linked Issues check ✅ Passed The PR implements the pull mode feature and associated refactoring from #1367, including renderer mode prop, MultiBackend configuration, and necessary infrastructure changes to support pull-based rendering.
Out of Scope Changes check ✅ Passed Changes in Consumer.next(), CMAF decoder, Sync class removal, and MultiBackend.close() cleanup are all necessary supporting changes to enable pull-based rendering without hanging or resource leaks.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
✨ Simplify code
  • Create PR with simplified code

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
Review rate limit: 7/8 reviews remaining, refill in 7 minutes and 30 seconds.

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
js/watch/src/video/renderer.ts (1)

107-145: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

This makes pull mode unconditional instead of adding the requested mode switch.

The PR objective calls for mode: "push" | "pull" with "push" as the default and only the WebCodecs path opting into pull. This loop removes the old push behavior for every renderer instance, and there’s no API surface here to preserve existing callers’ scheduling semantics.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@js/watch/src/video/renderer.ts` around lines 107 - 145, `#runRender` currently
always uses a RAF-driven "pull" loop which removes the original "push"
semantics; modify `#runRender`(effect: Effect) to respect a mode ("push" | "pull",
default "push") sourced from the renderer instance (e.g., this.mode) or the
effect/options, and branch behavior: for mode === "pull" keep the
requestAnimationFrame loop and RAF cleanup as implemented (tick, rafId,
cancelAnimationFrame), but for mode === "push" remove the RAF loop and instead
subscribe to the decoder's push callback/event (e.g., decoder.onframe or
equivalent) to immediately call this.#draw(ctx, frame), update this.frame and
this.timestamp, and on effect.cleanup remove that listener; ensure
ownership/close semantics remain identical when handling frames in both
branches.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@js/watch/src/video/renderer.ts`:
- Around line 111-135: The previewShown flag must be set any time we actually
paint a frame so a later empty vsync cannot paint a future preview; inside
tick() after you draw a frame (the branch where "if (frame) { ... }") set
previewShown = true (this covers frames returned from decoder.consume() and
cloned previews from decoder.frame.peek()), ensuring previewShown is not only
toggled in the preview branch; update references to decoder.consume,
decoder.frame.peek, previewShown, and the frame drawing block accordingly.

---

Outside diff comments:
In `@js/watch/src/video/renderer.ts`:
- Around line 107-145: `#runRender` currently always uses a RAF-driven "pull" loop
which removes the original "push" semantics; modify `#runRender`(effect: Effect)
to respect a mode ("push" | "pull", default "push") sourced from the renderer
instance (e.g., this.mode) or the effect/options, and branch behavior: for mode
=== "pull" keep the requestAnimationFrame loop and RAF cleanup as implemented
(tick, rafId, cancelAnimationFrame), but for mode === "push" remove the RAF loop
and instead subscribe to the decoder's push callback/event (e.g.,
decoder.onframe or equivalent) to immediately call this.#draw(ctx, frame),
update this.frame and this.timestamp, and on effect.cleanup remove that
listener; ensure ownership/close semantics remain identical when handling frames
in both branches.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0c87f457-6582-45a2-91e1-5a8eaf9819d6

📥 Commits

Reviewing files that changed from the base of the PR and between d592b00 and 0cc43a5.

📒 Files selected for processing (5)
  • js/hang/src/container/legacy.ts
  • js/watch/src/audio/decoder.ts
  • js/watch/src/sync.ts
  • js/watch/src/video/decoder.ts
  • js/watch/src/video/renderer.ts

Comment thread js/watch/src/video/decoder.ts Outdated
Comment thread js/watch/src/video/renderer.ts Outdated
Previously, frame pacing happened in the decoder via Sync.wait(), which
slept on a setTimeout / #update race per in-flight frame to wake near
vsync. The renderer then ran a single-shot rAF on every frame signal
update to paint. Two pacing layers, two places that could stall, and
systematic beating when the display refresh and content fps line up
(the wait resolves a fraction of a ms past vsync, the signal-driven rAF
schedules against the next vsync, so you display every other frame).

Move pacing to the renderer:

- DecoderTrack queues decoded frames immediately into #queue: VideoFrame[].
  The output callback no longer awaits anything.
- Decoder.consume() returns the newest queued frame whose PTS is
  ≤ sync.now(), closing any older entries; caller takes ownership.
  Late-frame drop and decode-buffer trim move here.
- The renderer's rAF calls decoder.consume() each vsync.
- Drop Sync.wait() and the #update notification machinery — buffer
  changes now take effect on the next vsync (≤16ms), which is fine for
  live media.

Drop the "first-paint preview" path and Decoder.frame mirror that fed
it. The preview was painting decoder.frame.peek() while the buffer
filled, which saved 20–100ms of black canvas at first paint but added
a special case to the render loop and held a cloned VideoFrame
resident at all times for every active track. Replace with two
narrower signals: DecoderTrack.display, set once from the first
decoded frame's displayWidth/displayHeight as a catalog-display
fallback; and #runBuffering now reacts to consume-time `timestamp`
instead of decode-time frame presence.

Net: one rAF loop in the system. Pacing happens at paint time, where
vsync is. Stalls show up as held frames instead of dropped frames.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
js/watch/src/video/renderer.ts (1)

107-131: ⚡ Quick win

Pause-aware rAF gating would reduce idle render wakeups.

Once paused and a preview frame already exists, this loop still schedules every vsync. Consider short-circuiting rAF in that state and re-entering when paused/frame changes.

Possible refactor
  `#runRender`(effect: Effect) {
    const ctx = effect.get(this.#ctx);
    if (!ctx) return;
+   const paused = effect.get(this.paused);
+   const currentFrame = effect.get(this.frame);
+   if (paused && currentFrame) return;

    let rafId: number | undefined;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@js/watch/src/video/renderer.ts` around lines 107 - 131, The render loop in
`#runRender` continues scheduling requestAnimationFrame even when paused and a
preview frame exists, causing unnecessary wakeups; update `#runRender` to gate
rAF: before scheduling the next raf check a paused flag and whether
this.frame.get() (or the equivalent preview frame) exists, and only call
requestAnimationFrame(tick) when not paused or when no preview frame is present;
also subscribe to changes on the pause state and the frame store so that when
paused becomes false or the frame is cleared you re-enter the raf loop (use the
same tick closure), and keep the existing effect.cleanup to cancel any
outstanding rafId via cancelAnimationFrame(rafId) to avoid leaks.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@js/watch/src/video/renderer.ts`:
- Around line 107-131: The render loop in `#runRender` continues scheduling
requestAnimationFrame even when paused and a preview frame exists, causing
unnecessary wakeups; update `#runRender` to gate rAF: before scheduling the next
raf check a paused flag and whether this.frame.get() (or the equivalent preview
frame) exists, and only call requestAnimationFrame(tick) when not paused or when
no preview frame is present; also subscribe to changes on the pause state and
the frame store so that when paused becomes false or the frame is cleared you
re-enter the raf loop (use the same tick closure), and keep the existing
effect.cleanup to cancel any outstanding rafId via cancelAnimationFrame(rafId)
to avoid leaks.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e7952ba1-2fb7-4ae4-9ea5-9cba34d379dc

📥 Commits

Reviewing files that changed from the base of the PR and between 0cc43a5 and 6e1a72f.

📒 Files selected for processing (3)
  • js/watch/src/sync.ts
  • js/watch/src/video/decoder.ts
  • js/watch/src/video/renderer.ts

MultiBackend.close() only closed its own #runElement signal, leaving
the Sync and the two Source instances (constructed as fields, not
inside an Effect) alive. Each owns its own Effect with several .run()
handlers that never get cleaned up, surfacing as "Signals was garbage
collected without being closed" warnings whenever a watch component
unmounts.

Close them explicitly, ordered so inner Decoder/Renderer/Emitter
(spawned via the #runElement effect) finish before the sources/sync
they reference are torn down.
@kixelated
Copy link
Copy Markdown
Collaborator

Can you make separate PRs for these commits? The 1st and 3rd make a lot of sense and should get merged.

The 2nd one with the consume() method is spooky and worries me a bit. Stuff like when the monitor refresh rate is smaller than the media frame rate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants