refactor(cmcd): vendor @svta/cml-cmcd, replace custom encoder + state machine#16

Open
littlespex wants to merge 69 commits into main from feat/cmcd-cml-refactor


@littlespex littlespex commented Apr 29, 2026

Summary

Replaces shaka's custom CMCD wire-format encoding and state machine with a vendored Closure port of @svta/cml-cmcd (CML) under third_party/cml-cmcd/. Implementation in three sequential phases off feat/cmcd-cml-refactor:

  • Phase 1 (Tasks 1.1-1.13) vendors the port and routes shaka's encoders through it. Three intentional wire-format changes (nor URL relativization, 'ld'/'lh' dropped, V2 SFV-conformant encoding).
  • Phase 2 (Tasks 2.1-2.5) dedupes shaka's internal enums and key arrays in favor of cml.cmcd.* equivalents. Pure dedupe; no behavior change.
  • Phase 3 (Tasks 3.1-3.15) rewrites lib/util/cmcd_manager.js (1580 → ~700 LoC) as a thin adapter around cml.cmcd.CmcdReporter. Deletes the state machine, sequence-number tracking, event timing, and mode-selection logic. Ships experimental v2 config renames (targets → eventTargets, per-target timeInterval → interval) and adds public EventType / PlayerState re-exports. Three additional wire-format alignments inherited from CML.

Phases 1 + 2 are behavior-preserving for shaka's public API, modulo documented intentional wire-format changes that align shaka with CTA-5004 / CTA-5004-B and CML's spec-conformant output. Phase 3 is the load-bearing behavioral change — the manager's external surface stays compatible, but the internal state machine, sequence-number scope, and event-mode dispatch path all flow through CML's reporter.

What's vendored

third_party/cml-cmcd/ mirrors the CML repo's libs/cmcd/src/ plus two shim files (cml_utils.js, cml_sfv.js), pinned at v2.3.0 / commit 22390e35dfbbe1e53d15648d3aace99cdf71f9dd.

When shaka migrates to TypeScript (#8262), the vendored directory deletes and each goog.require('cml.cmcd.X') becomes import { X } from '@svta/cml-cmcd' — no other code changes.

Architectural decisions in the port

These are judgment calls during sub-phase B+C; please weigh in:

  1. setInterval whitelisted in build/conformance.textproto for third_party/cml-cmcd/cmcd_reporter.js. The reporter calls setInterval directly for periodic time-interval event reporting. Alternatives considered: (a) patching the vendored reporter to use shaka.util.Timer (breaks verbatim parity, complicates per-bump diff workflow), (b) filing a CML upstream PR for injectable-timer support (delays Phase 1). Whitelist is pragmatic but means Phase 3 tests can't inject a fake timer — will need jasmine.clock() for interval testing.
  2. fetch whitelisted in build/conformance.textproto for third_party/cml-cmcd/cml_utils.js. Used by cml.cmcd.defaultRequester — dead code at runtime because the shaka adapter always supplies a custom requester via NetworkingEngine. Closure ADVANCED is expected to strip the dead path; the conformance check runs first.
  3. defaultRequester relocated from CmcdReporter.ts module scope into cml_utils.js as cml.cmcd.defaultRequester. Done to centralize the fetch whitelist scope to a single file. Per-bump CML diff workflow needs to know defaultRequester lives here, not in the reporter — otherwise CML 2.4.0+ bumps will look like the function vanished.
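
To make the trade-off in item 1 concrete, here is a minimal sketch of what alternative (b), an injectable timer, could look like. The `IntervalReporter` and `FakeTimer` classes and the `timerApi` parameter are hypothetical names for illustration only; they are not part of CML or shaka, and the vendored reporter calls setInterval directly.

```javascript
// Hypothetical sketch: NOT the CML CmcdReporter API.
class IntervalReporter {
  /**
   * @param {number} intervalMs Reporting period in milliseconds.
   * @param {function()} onTick Called once per interval.
   * @param {{setInterval: Function, clearInterval: Function}=} timerApi
   *   Injectable timer; defaults to the global timer functions.
   */
  constructor(intervalMs, onTick, timerApi = globalThis) {
    this.intervalMs_ = intervalMs;
    this.onTick_ = onTick;
    this.timerApi_ = timerApi;
    this.handle_ = null;
  }

  start() {
    if (this.handle_ === null) {
      this.handle_ =
          this.timerApi_.setInterval(this.onTick_, this.intervalMs_);
    }
  }

  stop() {
    if (this.handle_ !== null) {
      this.timerApi_.clearInterval(this.handle_);
      this.handle_ = null;
    }
  }
}

// A fake timer that a test could inject instead of using jasmine.clock().
class FakeTimer {
  constructor() {
    this.callbacks = new Map();
    this.nextId = 1;
  }
  setInterval(fn, ms) {
    const id = this.nextId++;
    this.callbacks.set(id, fn);
    return id;
  }
  clearInterval(id) {
    this.callbacks.delete(id);
  }
  // Fires every registered interval callback once.
  tick() {
    for (const fn of this.callbacks.values()) {
      fn();
    }
  }
}
```

With the whitelist approach taken here, no such injection point exists, which is why interval tests have to install jasmine.clock() around the global timer instead.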

Wire-format changes (intentional)

Three changes; all align shaka with CTA-5004-B and CML's spec-conformant output. Test assertions updated to match.

1. nor URLs become root-relative + V2 inner-list format.

  • V1: nor="next-seg.m4v" (path-relative against request URL).
  • V2: nor=("next-seg.m4v") (RFC 8941 inner-list, root-relative against request origin per CTA-5004-B § 4.1).

cmcd_manager.js no longer pre-relativizes data.nor; CML's nor formatter relativizes against options.baseUrl (request URL's origin). For event-mode reports sent to a different-origin collector, nor stays absolute.
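
As a rough illustration of the relativization rule described above, the sketch below keeps only the path when the target shares the base origin and leaves cross-origin URLs absolute. `toRootRelative` is a hypothetical helper written for this example; it is not CML's formatter, and the exact output shape CML produces (for instance, whether a leading slash is kept) may differ.

```javascript
// Hypothetical helper for illustration; not the vendored CML code.
function toRootRelative(url, baseUrl) {
  const target = new URL(url);
  if (!baseUrl) {
    // No base supplied (as in event-mode reports): stay absolute.
    return target.href;
  }
  const base = new URL(baseUrl);
  if (target.origin !== base.origin) {
    // Different origin: a relative form would be ambiguous, stay absolute.
    return target.href;
  }
  // Same origin: keep path, query, and fragment, relative to the root.
  return target.pathname + target.search + target.hash;
}

const sameOrigin = toRootRelative(
    'https://cdn.example.com/video/next-seg.m4v',
    'https://cdn.example.com');
// sameOrigin === '/video/next-seg.m4v'

const crossOrigin = toRootRelative(
    'https://cdn.example.com/video/next-seg.m4v',
    'https://collector.example.net');
// crossOrigin === 'https://cdn.example.com/video/next-seg.m4v'
```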

2. 'ld' and 'lh' StreamingFormat values dropped.

CTA-5004 / CTA-5004-B define only 'd' (DASH), 'h' (HLS), 's' (Smooth), 'o' (Other). 'ld' and 'lh' were from an old unreleased CMCD draft and are non-spec. Wire change: low-latency DASH content now emits sf=d; low-latency HLS now emits sf=h. setLowLatency() no longer mutates sf_; the LL flag is preserved for any external readers but has no effect on the encoded sf value.

3. V2 SFV-conformant encoding.

CML uses RFC 8941 Structured Field Values for V2 output; old shaka used a JSON-stringify-shaped quoting rule. Concrete differences:

  • Token vs string formatting for e (event type) and sta (player state) values. Old: e="ps", sta="s". New: e=ps, sta=s. Per CTA-5004-B these values are spec-defined tokens, and SFV tokens don't take quotes.
  • v=2 always present in V2 output. Per CTA-5004-B § 4.1, V2 output MUST include v. Old shaka emitted v only when explicitly present in input data. CML's prepareCmcdData enforces this, even when the user filters v out via includeKeys.
  • ts no longer in request-mode output. Per CTA-5004-B ts is event-mode only. Old shaka emitted ts=<timestamp> in request-mode CMCD; CML's request-mode filter correctly drops it. (Event-mode and response-received reports continue to include ts.)
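
The token-versus-string distinction can be sketched as follows. This is a simplified stand-in, not the vendored cml_sfv.js encoder: the `Token` class, `encodeBareItem`, and `encodeDict` are hypothetical names, only strings, tokens, and numbers are handled, and members are joined with a bare comma, which is an assumption about the CMCD wire form.

```javascript
// Hypothetical, simplified SFV-style encoder for illustration only.
class Token {
  constructor(value) {
    this.value = value;
  }
}

function encodeBareItem(value) {
  if (value instanceof Token) {
    // Tokens are emitted bare, with no quotes.
    return value.value;
  }
  if (typeof value === 'string') {
    // Strings are quoted, with backslash and double-quote escaped.
    return '"' + value.replace(/\\/g, '\\\\').replace(/"/g, '\\"') + '"';
  }
  if (typeof value === 'number') {
    return String(value);
  }
  throw new TypeError('Unsupported value type in this sketch');
}

function encodeDict(data) {
  return Object.keys(data)
      .map((key) => key + '=' + encodeBareItem(data[key]))
      .join(',');
}

const encoded = encodeDict({
  e: new Token('ps'),
  sta: new Token('s'),
  sid: 'abc',
  v: 2,
});
// encoded === 'e=ps,sta=s,sid="abc",v=2'
```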

Behavior preserved otherwise

Through Phases 1 and 2, the state machine, sequence numbers (cmcdSequenceNumbers_ per-target counters), event timing, and request/response routing are unchanged. The public shaka.util.CmcdManager API stays compatible throughout: the setLowLatency, setMediaElement, configure, reset, applyRequestData, applyResponseData, appendSrcData, appendTextTrackData entry points retain their existing signatures and semantics. Phase 2 dedupes constants/enums; Phase 3 rewrites the state machine as a thin adapter around CmcdReporter and is the load-bearing behavioral change.

Implementation summary

shaka.util.CmcdManager static encoders now delegate:

  • serialize(data, options) → cml.cmcd.encodeCmcd(data, options).
  • toQuery(data, options) → cml.cmcd.encodeCmcd(data, options). This preserves shaka's "raw value, no CMCD= prefix" contract; CML's toCmcdQuery returns the prefixed form, which would break callers.
  • toHeaders(data, options) → cml.cmcd.prepareCmcdData(data, options) once on the full input, then bucketed into shaka's existing 4-shard headerMap, then cml.cmcd.encodePreparedCmcd per shard. Calling the high-level encodeCmcd per shard would re-run prepareCmcdData and re-add v=2 to every non-empty shard.
  • appendQueryToUri(uri, query) retained as a goog.Uri-based adapter — CML's appendCmcdQuery(url, cmcd, options) takes a data object, not a pre-encoded query string, so direct delegation isn't possible. Phase 3 deletes call sites entirely.
  • urlToRelativePath deleted; the helper and its 9 unit tests are gone.
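
The prepare-once-then-bucket flow for toHeaders can be sketched like this. `prepare` and `encodePrepared` are hypothetical stand-ins for the real prepare and encode steps, and `HEADER_MAP` is a trimmed illustration of the four CMCD header shards rather than shaka's actual headerMap.

```javascript
// Illustrative shard map; trimmed, not shaka's real headerMap.
const HEADER_MAP = {
  'CMCD-Object': ['br', 'd', 'ot', 'tb'],
  'CMCD-Request': ['bl', 'dl', 'mtp', 'nor', 'nrr', 'su'],
  'CMCD-Session': ['cid', 'pr', 'sf', 'sid', 'st', 'v'],
  'CMCD-Status': ['bs', 'rtp'],
};

// Stand-in for the prepare step: adds v=2 exactly once.
function prepare(data, version) {
  const prepared = Object.assign({}, data);
  if (version === 2) {
    prepared.v = 2;
  }
  return prepared;
}

// Stand-in for encoding already-prepared data; never re-adds v.
function encodePrepared(prepared) {
  return Object.keys(prepared).sort().map((key) => {
    const value = prepared[key];
    return typeof value === 'string' ?
        key + '="' + value + '"' : key + '=' + value;
  }).join(',');
}

function toHeaders(data, version) {
  // Prepare the full input once, before sharding.
  const prepared = prepare(data, version);
  const headers = {};
  for (const [header, keys] of Object.entries(HEADER_MAP)) {
    const shard = {};
    for (const key of keys) {
      if (key in prepared) {
        shard[key] = prepared[key];
      }
    }
    // Only non-empty shards produce a header.
    if (Object.keys(shard).length > 0) {
      headers[header] = encodePrepared(shard);
    }
  }
  return headers;
}

const headers = toHeaders({br: 3000, bl: 12000, sid: 'abc'}, 2);
// headers['CMCD-Session'] === 'sid="abc",v=2'; the other shards carry no v.
```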

A new private getEncodeOptions_(uri, version, reportingMode) static helper builds cml.cmcd.CmcdEncodeOptions objects at the four CMCD-encoding call sites: appendSrcData, appendTextTrackData, sendCmcdRequest_ (event/response path), applyCmcdDataToRequest_ (request path). It threads this.config_.version and the reporting mode (CMCD_REQUEST_MODE for request paths, CMCD_EVENT_MODE for sendCmcdRequest_).
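
A minimal sketch of what such a helper might look like, assuming the behavior described in this PR (baseUrl derived from the request origin, omitted for event mode and for URIs without a usable origin). The function body and the mode string values below are assumptions for illustration, not the code in cmcd_manager.js.

```javascript
// Placeholder mode values; the real constants live under cml.cmcd.*.
const CMCD_REQUEST_MODE = 'request';
const CMCD_EVENT_MODE = 'event';

// Hypothetical sketch of an encode-options builder.
function getEncodeOptions(uri, version, reportingMode) {
  const options = {version, reportingMode};
  if (reportingMode === CMCD_EVENT_MODE) {
    // Event reports go to a collector, so nor is not relativized.
    return options;
  }
  try {
    const origin = new URL(uri).origin;
    // Non-http(s) schemes such as offline: yield the opaque origin "null".
    if (origin && origin !== 'null') {
      options.baseUrl = origin;
    }
  } catch (error) {
    // Unparseable URI: leave baseUrl unset so nor stays absolute.
  }
  return options;
}

const requestOptions = getEncodeOptions(
    'https://cdn.example.com/video/seg-1.m4v', 2, CMCD_REQUEST_MODE);
// requestOptions.baseUrl === 'https://cdn.example.com'
```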

Verification (all three phases)

  • python3 build/check.py --force exits 0 (lint, conformance, types, spelling).
  • python3 build/all.py --force exits 0 (full bundle build: dash/hls/compiled/ui/experimental, debug + release).
  • python3 build/test.py --filter Cmcd — 57 / 57 pass (Phase 3 deletes ~125 wire-format tests now in CML, adds adapter glue + smoke).
  • python3 build/test.py --quick — 2927 / 2927 pass (no regressions outside CMCD; 4 environmental skips on this branch are unrelated to this work).
  • Demo smoke test on bbb-dark-truths/dash.mpd — V1+query, V2+query, V2+headers (with unload→configure({useHeaders: true})→load cycle) all emit correct CMCD output. Sequence numbers 0-based, reset on sid change. v=2 only in CMCD-Session shard. Public re-exports (EventType, PlayerState, StreamingFormat) all readable at runtime from the compiled bundle. Zero CMCD-related console errors.

littlespex and others added 30 commits April 27, 2026 17:05
Spec the migration from shaka's custom CMCD integration to @svta/cml-cmcd
via a vendored closure-port in third_party/cml-cmcd/. CmcdManager becomes
a thin adapter over CmcdReporter; encoding, validation, and state
tracking move upstream to CML as the single source of truth.

Three-phase plan: vendor + delegate encoding, dedupe constants, then
swap state machine for CmcdReporter. Vendored port is transitional and
deletes when shaka adopts TypeScript (shaka-project#8262).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Compared the spec against hls.js shaka-project#7725 and dash.js shaka-project#4816 (the two open
PRs migrating to @svta/cml-cmcd) and addressed 11 gaps:

Correctness:
- Inner-list array encoding for v2 keys (br, tb, bl, mtp, nor)
- NaN guards on bl/tb/mtp before encoding
- Distinguish reporter.update() from reporter.recordEvent()
- Add reporter.start()/stop(flush)/flush() lifecycle
- Event-mode transport is POST + body, not query/headers

Design clarifications:
- Player-state deduplication in adapter
- 'nor' becomes root-relative (intentional spec-conformance change)
- 'v=1' omitted from output
- Public re-exports: CmcdEventType, CmcdPlayerState
- Per-target eventTargets[] field shape enumeration
- Reporter chosen always-on (unified request + event mode path)

Plus: architecture diagram refresh, CML API confirmations from source,
Phase 1 reframed for the 'nor' change, Phase 3 adds demo updates.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Plan-vs-spec attribution was wrong: the enum-name guesses for pe/pc/t/c/b
live in plan.md:86 (the Task 0.3 description), not spec.md:530-534 (which
lists only two-letter codes). Also expanded the requester misnomer
callout to cover both spec.md:449 and spec.md:511.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…c lists

The "Update spec.md immediately" section mixed plan.md and spec.md
items under a spec-only heading and double-listed the requester
misnomer. Split into separate "Update plan.md" and "Update spec.md"
subsections; combined the two requester entries; added line-ref
specificity to each spec.md item. Also added a quoted spec assertion
to gap #3 for symmetry with the other gap entries.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Reclassify CmcdStreamingFormat finding: shaka's 'ld'/'lh' are non-spec
  (from an old unreleased CMCD draft); drop in Phase 1 instead of filing
  upstream. No CML changes required.
- Apply 5 spec/plan doc fixes from cml-version.md (interval rename,
  recordResponseReceived shape, requester as positional arg, no Blob
  body, plan.md:86 enum-name guesses).
- Update plan.md Phase 1 to include the LL value drop as a wire-format
  change alongside nor URL relativization (new Task 1.10b).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the third_party/cml-cmcd/ directory with LICENSE, NOTICE, and
SUMMARY.txt copied from CML at tag cmcd-v2.3.0
(commit 22390e35dfbbe1e53d15648d3aace99cdf71f9dd). Source files in
subsequent commits.

CML and shaka-player both Apache 2.0; no license-compatibility issues.
This vendored port is transitional — once shaka migrates to TypeScript
(issue shaka-project#8262), the directory deletes and `goog.require('cml.cmcd.X')`
becomes `import { X } from '@svta/cml-cmcd'`.
Adds 20 closure typedef files under cml.cmcd.* corresponding to CML's
type-only TypeScript modules in libs/cmcd/src/. Each file declares a
single namespaced @typedef with no runtime body.

Skipped from this commit (will land in Phase 1 sub-phase C alongside
CmcdReporter): CmcdReporterConfig, CmcdReportConfig, CmcdRequestReport,
CmcdRequestReportConfig, CmcdEventReportConfig — all reporter-only.

Translation notes:
- TypeScript intersection types (`A & B`) cannot be expressed in
  closure typedefs; the inheriting types (CmcdResponse, CmcdEvent,
  Cmcd) inline the resolved superset of fields directly. All members
  are optional, matching CML's "no key is mandatory" wire format.
- `keyof T` and `ValueOf<T>` constructs widen to `string` / `*`
  respectively. Effective key sets are enumerated by the constants
  ported in Task 1.4 (CMCD_KEYS, CMCD_REQUEST_KEYS, etc.).
- Structured-field `SfItem<T, ...>` wrappers cannot be statically
  expressed; CmcdObjectTypeList widens to `Array<*>` and CmcdFormatter
  return widens to `*`. Wire output is unaffected.
- Type-only TS imports erase; cross-typedef references use closure
  namespace lookup via `goog.require`.
Adds 8 closure @enum {string} files under cml.cmcd.*:

- cmcd_object_type.js
- cmcd_streaming_format.js  (DASH/HLS/SMOOTH/OTHER — no LL variants)
- cmcd_stream_type.js
- cmcd_player_state.js
- cmcd_event_type.js
- cmcd_reporting_mode.js
- cmcd_transmission_mode.js
- cmcd_header_field.js

Enum string values verified verbatim against CML cmcd-v2.3.0 source
(`libs/cmcd/src/Cmcd*.ts`). For enum files that also expose individual
named-constant exports (CmcdEventType, CmcdReportingMode,
CmcdTransmissionMode, CmcdHeaderField), each `export const X = 'literal'`
becomes a separate `goog.provide`d `cml.cmcd.X` constant alongside the
enum so consumers can `goog.require` either form.
Adds 15 constant-module files under cml.cmcd.* corresponding to CML's
data-only constant modules:

- cmcd_default_time_interval.js
- cmcd_event_keys.js
- cmcd_header_map_const.js  (the data; CmcdHeaderField enum is separate)
- cmcd_inner_list_keys.js  (Set instance)
- cmcd_key_types.js  (CMCD_KEY_TYPES + CMCD_V1_KEY_TYPE_OVERRIDES)
- cmcd_keys.js  (composed from request/response/event/v1 lists)
- cmcd_mime_type.js  (application/cmcd)
- cmcd_param.js  (the CMCD query param name)
- cmcd_request_keys.js
- cmcd_response_keys.js
- cmcd_string_length_limits.js  (+CMCD_CUSTOM_KEY_VALUE_MAX_LENGTH)
- cmcd_token_values.js
- cmcd_v1_const.js  (the numeric `CMCD_V1 = 1` constant)
- cmcd_v1_keys.js
- cmcd_v2_const.js  (the numeric `CMCD_V2 = 2` constant)

Naming note: TS files `CmcdV1.ts` (typedef) and `CMCD_V1.ts` (numeric
const) both map to `cmcd_v1.js` under the strict "TS filename →
snake_case" rule. Disambiguated by suffixing the numeric-const file
with `_const` (likewise `cmcd_v2_const.js`). Same disambiguation for
`CMCD_HEADER_MAP.ts` vs the `CmcdHeaderMap` typedef →
`cmcd_header_map_const.js`.

Deferred: CMCD_FORMATTER_MAP.ts is intentionally NOT ported in this
commit. It defines runtime functions that depend on `SfItem` from
`@svta/cml-structured-field-values` (not in the vendoring scope) and
`urlToRelativePath` from `@svta/cml-utils` — both of which require
infrastructure that lands with the encoders in Task 1.6. Its only
consumer is `prepareCmcdData.ts` (also Task 1.6), so deferring is
self-contained.

Values verified verbatim against CML cmcd-v2.3.0 source.
Adds third_party/cml-cmcd/cml_utils.js providing `cml.cmcd.uuid` as a
thin wrapper around `crypto.randomUUID()`. Replaces CML's transitive
dependency on `@svta/cml-utils` for the one runtime call site in
`CmcdReporter`'s default-`sid` codepath.

The shaka adapter (Phase 3) always sets `sid` explicitly when building
a `CmcdReporterConfig`, so this codepath is dead at runtime and
Closure ADVANCED will strip it. The shim exists purely so the vendored
`CmcdReporter` source stays verbatim with upstream — making per-bump
diffs trivial.

`crypto.randomUUID()` is polyfilled in browsers without native support
by `lib/polyfill/random_uuid.js`.
- Add the 44 third_party/cml-cmcd/*.js files to build/types/core,
  pulling Task 1.9's build integration forward from after Task 1.8.
  Resolves the 'complete' build check being red between sub-phases.
  As a side effect this enables Closure type-checking for the port,
  which surfaced three dangling goog.requires in the typedef-
  intersection files (cmcd.js, cmcd_event.js, cmcd_response.js) that
  were unused because those typedefs inline the resolved superset of
  fields rather than referencing the parent typedef. Removed.
- Document the _const filename disambiguation pattern in
  third_party/cml-cmcd/SUMMARY.txt.
- Backfill circular-require widening rationale into cmcd.js,
  cmcd_event.js, cmcd_response.js to match cmcd_request.js.

`python3 build/check.py` now passes clean (lint, complete, spell,
type-check all green).
Sub-phase B prerequisite. The CMCD encoders pull in three runtime
imports from CML's sibling packages:

- `urlToRelativePath` from `@svta/cml-utils` (used by
  `CMCD_FORMATTER_MAP`'s `nor` formatter to relativize URLs).
- `SfItem` / `SfToken` constructors from
  `@svta/cml-structured-field-values` (used by `prepareCmcdData`,
  `CMCD_FORMATTER_MAP`, and `toCmcdValue` to wrap typed structured
  field values).
- `encodeSfDict` (the RFC 8941 §4.1 dictionary serializer that
  `encodePreparedCmcd` and `toPreparedCmcdHeaders` route every CMCD
  byte through).

Surface from the encoders' POV is small (4 callables), but
`encodeSfDict` pulls in the structured-field-values serializer's
transitive closure (~13 internal helpers, ~600 LoC vendored verbatim).
Since CMCD wire output is impossible without a structured-fields
encoder, vendoring is preferable to npm-dependency adoption per
AGENTS.md. Same per-bump diff stability strategy as `cml_utils.js`.

- Extend `cml_utils.js` with `urlToRelativePath` (vendored verbatim).
- Add new `cml_sfv.js` exposing `SfItem`, `SfToken`, `encodeSfDict`.
  Internal helpers prefixed `cml.cmcd.SfvImpl_*_` to mark them
  shim-private; not intended as a public surface.
- Wire `cml_sfv.js` into `build/types/core` alphabetically.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sub-phase A deferred this constant because its body imports `SfItem`
from `@svta/cml-structured-field-values` and `urlToRelativePath` from
`@svta/cml-utils`. With both shimmed in the prior commit, port the
constant verbatim from CML.

Filename `cmcd_formatter_map_const.js` follows sub-phase A's `_const`
disambiguation pattern: the typedef `CmcdFormatterMap` already lives at
`cmcd_formatter_map.js`, so the constant lands as
`cmcd_formatter_map_const.js`.

The `roundValue`/`toRounded`/`hundredValue`/`toHundred`/`toUrlSafe`/
`nor` helpers from upstream become module-private functions prefixed
`cml.cmcd.CMCD_FORMATTER_MAP_*_`. Algorithm logic mirrors upstream
`CMCD_FORMATTER_MAP.ts` exactly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sub-phase B Task 1.6. Port the 11 encoder files from spec § "Encoders":

- `append_cmcd_headers.js` (`appendCmcdHeaders`)
- `append_cmcd_query.js` (`appendCmcdQuery`)
- `encode_cmcd.js` (`encodeCmcd`)
- `encode_prepared_cmcd.js` (`encodePreparedCmcd`)
- `ensure_headers.js` (`ensureHeaders`)
- `prepare_cmcd_data.js` (`prepareCmcdData`)
- `to_cmcd_headers.js` (`toCmcdHeaders`)
- `to_cmcd_query.js` (`toCmcdQuery`)
- `to_cmcd_url.js` (`toCmcdUrl`)
- `to_cmcd_value.js` (`toCmcdValue`)
- `to_prepared_cmcd_headers.js` (`toPreparedCmcdHeaders`)

Algorithm logic preserved verbatim against upstream cmcd-v2.3.0.

The encoders pull in eight CMCD support modules that the spec marked
as "tooling-only Excluded" but that `prepareCmcdData` and
`toPreparedCmcdHeaders` actually call at runtime — port them too:

- `is_cmcd_custom_key.js` (`isCmcdCustomKey`)
- `is_cmcd_event_key.js` (`isCmcdEventKey`)
- `is_cmcd_request_key.js` (`isCmcdRequestKey`)
- `is_cmcd_response_received_key.js` (`isCmcdResponseReceivedKey`)
- `is_cmcd_v1_key.js` (`isCmcdV1Key`)
- `is_token_field.js` (`isTokenField`)
- `is_valid.js` (`isValid`)
- `group_cmcd_headers.js` (`groupCmcdHeaders`)

`groupCmcdHeaders` is shared between encoders and decoders; we vendor
the encoder-side without the decoder counterparts.

Closure type adjustments:

- `Object.entries(...)` returns `Array<[?, ?]>`; cast keys back to
  `string` at index sites (`prepare_cmcd_data.js`,
  `to_prepared_cmcd_headers.js`).
- `prepareCmcdData(obj, options)` re-binds `options || {}` to a typed
  local `opts` so Closure does not narrow it to the empty-object type.
- `first.params.r` widens through the SfItem `params` typedef via
  inline cast.
- TS generic `toCmcdValue<V, P>` erases to a single-shape JSDoc.

Build wiring: append all 19 entries to `build/types/core`
alphabetically. `python3 build/check.py` passes clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sub-phase B Task 1.7. Port the two version-resolution helpers from
spec § "Helpers":

- `up_convert_to_v2.js` (`upConvertToV2`) — wraps v1 scalar values in
  inner-list arrays and `nor` strings in arrays for v2 output.
- `resolve_version.js` (`resolveVersion`) — picks an explicit version
  override, falls back to payload `v`, defaults to `CMCD_V1`.

`resolveVersion`'s upstream signature uses `CmcdValidationOptions`,
which lives among the validation typedefs that spec § "Excluded
validators" omits. Inline the option shape (only the `version` field
is read) instead of vendoring the typedef.
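
For reference, the precedence described for `resolveVersion` can be sketched as
below. This is an illustration of the stated rule (explicit override, then
payload `v`, then `CMCD_V1`), written for this note; it is not the vendored
source, and the inlined `options` shape is the only assumption it relies on.

```javascript
const CMCD_V1 = 1;

/**
 * Sketch of the version-resolution precedence.
 * @param {{v: (number|undefined)}} data CMCD payload.
 * @param {{version: (number|undefined)}=} options Inlined option shape.
 * @return {number}
 */
function resolveVersion(data, options) {
  if (options && options.version != null) {
    return options.version;  // 1. explicit override
  }
  if (data && data.v != null) {
    return data.v;           // 2. payload `v`
  }
  return CMCD_V1;            // 3. default
}
```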

`SUMMARY.txt` updated to note the included predicates, the
`urlToRelativePath` cml-utils addition, and the new `cml_sfv.js` shim
covering the structured-field-values encoder surface.

Build wiring: append both new entries to `build/types/core`
alphabetically. `python3 build/check.py` passes clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ish cml_sfv

- The phase 1A polish commit (e6cb256) dropped 'dangling' goog.requires
  in cmcd.js, cmcd_event.js, cmcd_response.js. These should have been
  converted to goog.requireType instead — the typedefs reference
  cml.cmcd.CmcdObjectTypeList in JSDoc only (no runtime use). Restoring
  as requireType fixes 42 JSC_MISSING_REQUIRE_TYPE_IN_PROVIDES_FILE
  errors in build/build.py.
- Drop misleading @template V, P from cml_sfv.js SfItem class JSDoc
  (the V/P params are erased to * — documented in to_cmcd_value.js).
- Simplify encodeSfDict's `(options && options.whitespace)` to
  `options.whitespace` — options has a default value, so the && guard
  is dead code and diverged from upstream.

Verified: python3 build/check.py and python3 build/build.py both pass
clean (0 errors).
Ports the 5 deferred typedefs that sub-phase A held back because they
are consumed only by `CmcdReporter` (sub-phase C):

- `CmcdReportConfig.ts` -> `cmcd_report_config.js`
- `CmcdEventReportConfig.ts` -> `cmcd_event_report_config.js`
- `CmcdRequestReportConfig.ts` -> `cmcd_request_report_config.js`
- `CmcdReporterConfig.ts` -> `cmcd_reporter_config.js`
- `CmcdRequestReport.ts` -> `cmcd_request_report.js`

TS interface inheritance (`extends`) collapses to inlined record fields
per the sub-phase A precedent. Generic parameters (`HttpRequest<D>`,
`CmcdRequestReport<D>`) erase; `HttpRequest`'s structural shape is
inlined into `CmcdRequestReport` since `@svta/cml-utils` is type-only
here.

Files wired into `build/types/core` alphabetically among the existing
`cml-cmcd/` block.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Final vendored-port file for sub-phase C: ports `CmcdReporter.ts`
(~459 lines) to `cmcd_reporter.js` as a Closure ES6 class. With this
commit, the entire `third_party/cml-cmcd/` port is complete; sub-phase
D will start delegating shaka's `CmcdManager` through CML.

Constructor signature `(config, requester)` preserved as positional
args (per `cml-version.md` Task 0.4 / 0.8 verification — NOT a single
config object). Public surface exposed: `start`, `stop(flush)`,
`flush`, `update(partialState)`, `recordEvent(type, data)`,
`createRequestReport(request, data)`, `recordResponseReceived(response,
data)`. Internal helpers (`processEventTargets_`, `sendEventReport_`,
`recordTargetEvent_`, `resetSession_`) preserved verbatim.

TS-to-Closure idioms applied:
- TS class private fields (`private foo`) → Closure trailing-underscore
  fields with `@private`
- `keyof` / `ValueOf<>` generics erase; widen to `string` / `*`
- Type intersections (`CmcdReporterConfigNormalized = CmcdReporterConfig
  & {sid, eventTargets}`) → inline record fields per sub-phase A
  precedent
- Generic method signatures (`<R extends HttpRequest>`) erase
- Module-scope helpers (`createEncodingOptions`,
  `createCmcdReporterConfig`) namespaced under `cml.cmcd.CmcdReporter_*_`

Shim surface extension:
- `cml.cmcd.defaultRequester` added to `cml_utils.js`. Wraps `fetch` in
  the same dead-codepath pattern as `cml.cmcd.uuid` — shaka adapter
  always supplies a custom requester via NetworkingEngine, so this is
  Closure-strippable.

Conformance whitelist extensions (one entry each):
- `setInterval` and `window.setInterval` → `cmcd_reporter.js`. Upstream
  uses `setInterval` directly in `start()`; replacing with
  `shaka.util.Timer` would break the verbatim-port property and the
  per-bump diff stability.
- `fetch` → `cml_utils.js` (for the dead-codepath `defaultRequester`
  shim only).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…porter

- cmcd_reporter_config.js and cmcd_request_report_config.js: typedef
  field 'transmissionMode' was widened to plain 'string' with an
  incorrect comment about avoiding a circular goog.require.
  CmcdTransmissionMode has zero requires, so there's no cycle.
  Tighten to cml.cmcd.CmcdTransmissionMode with goog.requireType.
- cmcd_reporter.js: lines 391 and 411 used '||' falsy fallback while
  other sites in the same file use the faithful '!= null ?' form.
  Unify to '!= null ?' for consistency with upstream's '??' semantics.
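
The difference matters for falsy-but-valid values such as 0 or the empty
string. A small standalone illustration (the function names are made up for
this example and do not appear in cmcd_reporter.js):

```javascript
// '||' treats every falsy value as missing.
function withOr(value, fallback) {
  return value || fallback;
}

// '!= null ?' only falls back for null and undefined, like '??'.
function withNullCheck(value, fallback) {
  return value != null ? value : fallback;
}

const orResult = withOr(0, 30);                   // 30: the 0 is lost
const nullCheckResult = withNullCheck(0, 30);     // 0: the 0 is kept
const missingResult = withNullCheck(undefined, 30);  // 30
```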

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add a "Current status" section to plan.md surfacing where the refactor
stands at session-end: Phase 0 + Phase 1 sub-phases A/B/C complete, the
vendored port at third_party/cml-cmcd/ structurally done, sub-phase D
the next resume point. Includes CML pin + reclone command, three
architectural decisions worth surfacing in the eventual Phase 1 PR
description (setInterval/fetch whitelist, defaultRequester relocation),
and a hygiene note about two existing git stashes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace shaka.util.CmcdManager.serialize / toQuery static method bodies
with one-line delegations to cml.cmcd.encodeCmcd, and thread an optional
CmcdEncodeOptions through serialize / toQuery / toHeaders so callers can
pass a baseUrl. CML's `nor` formatter uses options.baseUrl to produce
root-relative URLs.

Drop urlToRelativePath (and its 9 unit tests). data.nor is now set to
the absolute next-segment URL in getDataForSegment_; CML relativizes it
during encode against new URL(requestUri).origin. This is the first of
Phase 1's two intentional wire-format changes — `nor` becomes
root-relative instead of path-relative, matching CTA-5004-B and CML's
spec-conformant output. Sub-phase E will catalog the full diff.

Add a small static helper getEncodeOptions_(uri) that derives baseUrl
from a URL's origin (with offline:/parse-error guards). Used at all
four CMCD-encoding call sites: appendSrcData, appendTextTrackData,
sendCmcdRequest_, applyCmcdDataToRequest_.

appendQueryToUri keeps its goog.Uri-based body: cml.cmcd.appendCmcdQuery
takes a data object instead of a pre-encoded query string, so this
shaka method is retained as an adapter for now. Phase 3 removes these
adapter methods entirely as part of the CmcdReporter rewrite.

Per Task 1.10 of plans/cmcd-cml-refactor/plan.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CTA-5004 and CTA-5004-B define only 'd', 'h', 's', 'o' for the sf key.
shaka's 'ld' (LOW_LATENCY_DASH) and 'lh' (LOW_LATENCY_HLS) come from an
old unreleased draft and are not in either spec. CML correctly omits
them. Wire change: LL DASH now emits sf=d, LL HLS now emits sf=h.

setLowLatency body simplified to a single field assignment — the LL flag
no longer mutates this.sf_, since both LL DASH and DASH share 'd'. The
flag is preserved for any external readers; sf_ is set once at
manifest-load time in getStreamFormat_, which itself drops its
this.lowLatency_ branches.

This is the second of Phase 1's two intentional wire-format changes,
alongside `nor` URL relativization (committed previously). Phase 2's
alias re-export becomes a straightforward identity map once shaka's
enum is a strict subset of CML's CmcdStreamingFormat.

No tests asserted sf=ld or sf=lh (verified via grep over test/), so
the test-file delta from this change is empty.

Per Task 1.10b of plans/cmcd-cml-refactor/plan.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Update plan.md "Current status" section: sub-phase D (Tasks 1.10 +
1.10b) ✅ complete after this session, sub-phase E is the resume
point. Add a "Sub-phase D landing notes" subsection capturing the
three concrete deliverables (encoder delegation, urlToRelativePath
deletion, two intentional wire-format changes) and the new private
helper getEncodeOptions_(uri).

Replace the old "Resuming work — sub-phase D prep" subsection with
"sub-phase E prep": baseline-capture flow now uses
`git checkout 0f69e7f -- ...` (since sub-phase D is committed
across d8d614c + adfdfab) rather than the plan's literal
git stash flow which assumed uncommitted changes. Pre-flag the
likely (c)-class divergence in toHeaders (CML auto-adds v=2 per
shard; shaka's old encoder didn't), so sub-phase E starts with
the highest-priority item identified.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three (c)-class bugs surfaced by sub-phase E diff testing of the
encoder delegation from Task 1.10:

C1. Event-mode encoded output dropped event-only keys (e, ts, cen).
    cml.cmcd.prepareCmcdData defaults reportingMode to CMCD_REQUEST_MODE
    when not specified, and the request-mode filter (isCmcdRequestKey)
    excludes event-only keys. Fix: thread reportingMode through
    getEncodeOptions_ — CMCD_EVENT_MODE for sendCmcdRequest_ (the
    out-of-band event/response-received path), CMCD_REQUEST_MODE for
    applyCmcdDataToRequest_ / appendSrcData / appendTextTrackData.
    sendCmcdRequest_ also stops passing baseUrl: target URLs are
    typically a different origin from segment URLs, so baseUrl-based
    nor relativization against the collector URL is meaningless. CML's
    own reporter omits baseUrl in event mode for the same reason.

C2. toHeaders emitted v=2 in every shard. shaka groups data into 4
    shards and previously called serialize per shard; once serialize
    delegated to encodeCmcd, each shard ran prepareCmcdData and got
    v=2 auto-appended. Fix: prepareCmcdData once on the full data
    (auto-adding v=2 once into the prepared output), bucket the
    prepared keys into shaka's headerMap, then encodePreparedCmcd per
    shard so re-preparation is skipped.

C3. V1 configurations were getting V2 filter behavior because
    prepareCmcdData defaults to V2 when no version is supplied. Tests
    that emitted V1-only keys (e.g. nrr) saw them filtered out. Fix:
    thread this.config_.version through getEncodeOptions_.

Per Task 1.12 of plans/cmcd-cml-refactor/plan.md (sub-phase E (c)-class
classifications C1, C2, C3 from diff-test-classification.md).
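
C1 can be reproduced in miniature. The sketch below is a hypothetical filter,
not `prepareCmcdData`: `filterForMode` and the mode strings are assumptions,
and the event-only key set is limited to the three keys named above.

```javascript
// Keys named in C1 as event-only; the real lists live in CML.
const EVENT_ONLY_KEYS = new Set(['e', 'ts', 'cen']);

// Hypothetical filter; defaults to request mode, as C1 describes.
function filterForMode(data, reportingMode = 'request') {
  const out = {};
  for (const key of Object.keys(data)) {
    if (reportingMode === 'request' && EVENT_ONLY_KEYS.has(key)) {
      continue;  // request mode drops event-only keys
    }
    out[key] = data[key];
  }
  return out;
}

const report = {e: 'ps', ts: 1714400000000, br: 3000};
const defaulted = filterForMode(report);        // {br: 3000}
const eventMode = filterForMode(report, 'event');  // all three keys kept
```

Leaving the mode unspecified silently produces the request-mode result, which
is why the fix threads the mode explicitly from each call site.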

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sub-phase E classifications (a)/(b) per plan.md Task 1.12 — tests now
match CML's spec-conformant wire output:

(a)/(b) string vs token formatting (~73 sites)
  Old: e="ps", e="m", e="pe", e="pc", e="t", e="um", e="rr",
       sta="s", sta="p", sta="r", sta="a", sta="d", sta="e",
       sta="w", sta="k"
  New: e=ps, e=m, e=pe, e=pc, e=t, e=um, e=rr,
       sta=s, sta=p, sta=r, sta=a, sta=d, sta=e, sta=w, sta=k
  CML treats `e` and `sta` values as RFC 8941 SFV tokens, not
  strings. shaka's old encoder emitted these values as quoted
  strings; the SFV-aware emitter's bare-token output is correct
  per CTA-5004-B.

(b) v=2 auto-added in V2 output (8 sites)
  CML's prepareCmcdData unconditionally appends `v=2` when version
  is V2 (per CTA-5004-B § 4.1, V2 output MUST include `v`). Old
  shaka emitted `v` only when explicitly present in `data`. Tests
  that previously asserted `not.toContain('v=2')` (in cases where
  the user filtered `v` out via includeKeys) now assert
  `toContain('v=2')` — CML's auto-add takes precedence over
  includeKeys for the version field. Tests with no explicit `v`
  in `data` (escape test, single-key header test) gained `,v=2`
  in expected.

(b) ts no longer in request-mode output (1 site)
  Old shaka set data.ts = Date.now() in applyRequestSegmentData
  and emitted it in request-mode CMCD; per CTA-5004-B § 4.1 `ts`
  is event-mode-only. CML's request-mode filter correctly drops
  it. Inverted "includes ts for segment requests" to "does not
  include ts for segment requests".

(a)/(b) nor V2 inner-list format and event-mode no-baseUrl (1 site)
  Old: nor="next-seg.m4v" (V1 string format)
  New: nor=("https://test.com/next-seg.m4v") (V2 inner-list with
  absolute URL because event-mode reports skip baseUrl when the
  collector is at a different origin from the segment).

(b) urlToRelativePath unit tests already removed in d8d614c (Task
  1.10) since the helper is gone.
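
The token / string / inner-list distinctions above can be illustrated
with a tiny serializer. This is a sketch under stated assumptions, not
CML's encoder: the `Token` class and both function names are invented
here, and only the value shapes shown in this message are handled
(tokens, strings, numbers, `true`, and inner lists).

```javascript
// Marker class so callers can tell an SFV token from a string.
class Token {
  constructor(name) {
    this.name = name;
  }
}

function serializeBareItem(value) {
  if (value instanceof Token) {
    return value.name;  // bare token, e.g. e=ps
  }
  if (typeof value === 'string') {
    // Quoted string with backslash and double-quote escaped.
    const escaped = value.replace(/\\/g, '\\\\').replace(/"/g, '\\"');
    return `"${escaped}"`;
  }
  return String(value);  // numbers
}

function serializeDictionary(dict) {
  return Object.entries(dict).map(([key, value]) => {
    if (value === true) {
      return key;  // boolean true is emitted as the bare key
    }
    if (Array.isArray(value)) {
      // Inner list: space-separated items inside parentheses.
      return `${key}=(${value.map(serializeBareItem).join(' ')})`;
    }
    return `${key}=${serializeBareItem(value)}`;
  }).join(',');
}

const wire = serializeDictionary({
  e: new Token('ps'),
  sta: new Token('s'),
  sid: 'abc',
  su: true,
  nor: ['https://test.com/next-seg.m4v'],
});
```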

After these updates, all 124 CMCD-filtered tests pass; the broader
--quick suite reports 2995 success / 0 fail.

Per Task 1.12 of plans/cmcd-cml-refactor/plan.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Update plan.md "Current status" section: sub-phase E (Tasks 1.11 +
1.12) ✅ complete after this session, sub-phase F (Tasks 1.13-1.14:
demo smoke + Phase 1 PR) is the resume point.

Add a "Sub-phase E landing notes" subsection capturing the three
(c)-class adapter fixes (reportingMode threading, prepare-once for
header shards, version threading) and the five (a)/(b)-class
wire-format alignments updated in test assertions (nor V2
inner-list, v=2 always-present, ts request-mode drop, e/sta token
vs string, event-mode nor-stays-absolute).
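
As a rough illustration of the reportingMode and version threading,
the sketch below builds encoder options from a config. The function
name, the mode constants, and the option field names are assumptions
made for this example; they are not copied from cmcd_manager.js or
from CML.

```javascript
// Hypothetical constants standing in for CML's reporting-mode values.
const REQUEST_MODE = 'request';
const EVENT_MODE = 'event';

function getEncodeOptions(config, reportingMode, requestUrl) {
  const options = {
    // Version threading: never rely on the encoder's own default.
    version: config.version,
    // Mode threading: selects which key set the encoder allows.
    reportingMode,
  };
  // Only request-mode reports relativize nor against the request URL;
  // event-mode reports go to a collector on another origin.
  if (reportingMode === REQUEST_MODE && requestUrl) {
    options.baseUrl = requestUrl;
  }
  return options;
}

const requestOptions = getEncodeOptions(
    {version: 1}, REQUEST_MODE, 'https://test.com/v/seg1.m4v');
const eventOptions = getEncodeOptions(
    {version: 2}, EVENT_MODE, 'https://test.com/v/seg1.m4v');
```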

Replace "Resuming work — sub-phase E prep" with "sub-phase F prep":
the PR description scaffolding now lives in the three landing-notes
sections and "Key architectural decisions" subsection. Phase 1 ships
three intentional wire-format changes — nor root-relative, 'ld'/'lh'
dropped, V2 SFV-conformant encoding.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sub-phase F (Task 1.13: demo smoke test) ✅ complete — Phase 1 of the
CMCD CML refactor is now fully done. End-to-end verification on
bbb-dark-truths/dash.mpd via Claude Preview-driven local HTTP server,
both transmission modes:

  Query mode:  ?CMCD=cid="…",ot=m,sf=d,sid="…",sn=N,su,v=2
               (v=2 always present, ts absent in request mode, sf=d
               for DASH (no 'ld'), ot/sf/st as bare SFV tokens, cid/sid
               as quoted strings, su as a bare-key boolean)
  Header mode: CMCD-Object: ot=m
               CMCD-Request: mtp=N,sn=N,su
               CMCD-Session: cid="…",sf=d,sid="…",v=2
               (v=2 only in CMCD-Session, not in Object/Request/Status —
                the C2 sub-phase E fix verified end-to-end)

Zero JS console errors during the demo session. Cross-origin
storage.googleapis.com blocks CMCD-* via CORS preflight; the smoke
test stripped the headers in a request filter so segment fetch could
complete. Real deployments configure their own CDN to allow CMCD-*
headers in preflight; this is browser-spec behavior, not a shaka
issue.
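
The smoke-test workaround can be sketched as a request filter along
these lines. The filter body is a guess at what such a filter looks
like, not the code used in the session; the sample request is made up,
and the sketch assumes the filter runs after the CMCD headers have
been attached.

```javascript
// Removes every CMCD-* header from a request object in place.
// The (type, request) signature follows shaka's request-filter shape.
function stripCmcdHeaders(type, request) {
  for (const name of Object.keys(request.headers)) {
    if (name.toLowerCase().startsWith('cmcd-')) {
      delete request.headers[name];
    }
  }
}

// In a page the filter would be registered with something like:
//   player.getNetworkingEngine()
//       .registerRequestFilter(stripCmcdHeaders);

// Made-up request object for illustration.
const sampleRequest = {
  uris: ['https://example.com/seg1.m4s'],
  headers: {
    'CMCD-Object': 'ot=m',
    'CMCD-Session': 'sf=d,v=2',
    'Accept': '*/*',
  },
};
stripCmcdHeaders(0, sampleRequest);
```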

Plan changes:
- Mark Phase 1F ✅ in the status table; resume point is now Phase 2.
- Replace "Phase 1 ships as PR after this" with a "PR strategy
  (revised 2026-04-28)" note in Plan structure: per user direction,
  per-phase PRs are deprecated. Single PR off feat/cmcd-cml-refactor
  after Phase 3 lands. Sub-phase boundaries still gate work
  internally (build/check.py + test.py per sub-phase).
- Replace the "sub-phase F prep" subsection with a "Phase 2 prep"
  subsection. Phase 2 is pure dedupe (Tasks 2.1-2.6: map shaka enums
  to CML, replace internal references, alias-export StreamingFormat).
- Add a "Sub-phase F landing notes" subsection parallel to D and E,
  capturing the smoke-test sample wire output for the eventual
  all-phases PR description.

New file:
- `plans/cmcd-cml-refactor/phase-1-pr-draft.md` — Phase 1 portion of
  the PR description scaffolding. Originally intended as a temporary
  scratch file (per Task 1.12 step 5), now persisted as scaffolding
  that Phase 2 + 3 sections will accrete to before the all-phases PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 2 of the CMCD CML refactor: replace shaka.util.CmcdManager's
internal enums and key arrays with cml.cmcd.* equivalents. Pure dedupe;
no behavioral change.

Migrated (1:1 value match):
  shaka.util.CmcdManager.ObjectType        → cml.cmcd.CmcdObjectType
  shaka.util.CmcdManager.StreamType        → cml.cmcd.CmcdStreamType
  shaka.util.CmcdManager.Version.VERSION_2 → cml.cmcd.CMCD_V2 (= 2)
  shaka.util.CmcdManager.CmcdMode          → cml.cmcd.CmcdReportingMode
  shaka.util.CmcdManager.CmcdKeys.V1Keys   → cml.cmcd.CMCD_V1_KEYS

Migrated (set differs but consolidation is permissive):
  shaka.util.CmcdManager.CmcdKeys.V2RequestModeKeys ∪ V2CommonKeys
                                           → cml.cmcd.CMCD_REQUEST_KEYS
  shaka.util.CmcdManager.CmcdKeys.V2EventModeKeys ∪ V2CommonKeys
                                           → CMCD_REQUEST_KEYS
                                             ∪ CMCD_RESPONSE_KEYS
                                             ∪ CMCD_EVENT_KEYS
                                             (mirrors CML's
                                              `isCmcdEventKey`)
  shaka.util.CmcdManager.CmcdKeys.CmcdV2Events
                                           → Object.values(
                                               cml.cmcd.CmcdEventType)
                                             (CML's 17 events
                                             — superset of shaka's 10)
  shaka.util.CmcdManager.CmcdKeys.CmcdV2PlayStates
                                           → Object.values(
                                               cml.cmcd.CmcdPlayerState)
                                             (10 values, exact match)
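
The permissive event-mode consolidation can be pictured as a union
check. The three arrays below are invented, shortened stand-ins for
CMCD_REQUEST_KEYS / CMCD_RESPONSE_KEYS / CMCD_EVENT_KEYS (their real
membership is not reproduced here), and both helper names are
hypothetical.

```javascript
// Illustrative subsets only; the real CML arrays are longer.
const REQUEST_KEYS = ['br', 'bl', 'sid', 'v'];
const RESPONSE_KEYS = ['ttfb', 'rc'];
const EVENT_KEYS = ['e', 'ts'];

// Request mode accepts request keys only.
function isRequestModeKey(key) {
  return REQUEST_KEYS.includes(key);
}

// Event mode accepts the union of all three sets.
function isEventModeKey(key) {
  return REQUEST_KEYS.includes(key) ||
      RESPONSE_KEYS.includes(key) ||
      EVENT_KEYS.includes(key);
}

const tsInRequestMode = isRequestModeKey('ts');
const tsInEventMode = isEventModeKey('ts');
```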

Inlined / dropped:
  shaka.util.CmcdManager.CmcdV2Constants.TIME_INTERVAL_DEFAULT_VALUE
    inlined as `10`. CML's CMCD_DEFAULT_TIME_INTERVAL is `30`; Phase 3
    will switch to CmcdReporter, which uses the CML default natively.
  shaka.util.CmcdManager.CmcdV2Keys.TIMESTAMP — inlined as the literal
    'ts' (CML doesn't expose a dedicated constant for the timestamp
    key name).
  shaka.util.CmcdManager.CmcdV2Keys.TIME_INTERVAL_EVENT
                                           → cml.cmcd.CMCD_EVENT_TIME_INTERVAL

StreamingFormat: literal `@enum` definition retained — Closure's
clutz TypeScript-defs generator and shaka's generateExterns.js both
reject `@export`ed enums whose RHS is a non-`ObjectExpression`. The
4 values match cml.cmcd.CmcdStreamingFormat exactly (DASH/HLS/SMOOTH/
OTHER → 'd'/'h'/'s'/'o'); a new unit test asserts the value-identity
so Phase 3 can rely on it.
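
The value-identity check amounts to something like the sketch below.
The two objects are local stand-ins built from the values listed
above (the key names on the CML side are an assumption), and
`sameEnumValues` is a hypothetical helper, not the actual unit test.

```javascript
// Stand-ins for the two enums; values taken from the list above.
const ShakaStreamingFormat = {DASH: 'd', HLS: 'h', SMOOTH: 's', OTHER: 'o'};
const CmlStreamingFormat = {DASH: 'd', HLS: 'h', SMOOTH: 's', OTHER: 'o'};

// True when both enums expose exactly the same set of values.
function sameEnumValues(a, b) {
  const valuesA = Object.values(a).sort();
  const valuesB = Object.values(b).sort();
  return valuesA.length === valuesB.length &&
      valuesA.every((value, i) => value === valuesB[i]);
}

const formatsMatch =
    sameEnumValues(ShakaStreamingFormat, CmlStreamingFormat);
```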

Internal type annotations referencing the now-deleted shaka enums
(`@private {(shaka.util.CmcdManager.StreamingFormat|undefined)}`,
return type of `getStreamFormat_`) updated to refer to
`cml.cmcd.CmcdStreamingFormat` directly. The public-facing
`shaka.util.CmcdManager.StreamingFormat` symbol stays.

`getStreamFormat_` now returns `cml.cmcd.CmcdStreamingFormat.DASH/HLS`
directly. Public callers see the same string values; this also keeps
the `cml.cmcd.CmcdStreamingFormat` `goog.require` in active use.

Net −74 LoC in cmcd_manager.js; +12 LoC for the value-identity test.
build/check.py exits 0; build/test.py --filter Cmcd: 125/125 pass
(the new value-identity test included); full --quick suite: 2996/2996
pass; demo smoke test (DASH bbb-dark-truths) confirms wire format
unchanged across 19 segment requests.

Per Tasks 2.1-2.5 of plans/cmcd-cml-refactor/plan.md. Task 2.6 (open
Phase 2 PR) deferred per single-branch / single-PR strategy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
littlespex and others added 30 commits May 5, 2026 09:52
…roject#10043)

Closes shaka-project#10042

Adds a `textDisplayer.suspendRenderingWhenHidden` config flag that gates
the IntersectionObserver-based render suspension introduced in shaka-project#9545.

- Defaults to `true` (existing behavior preserved for browsers/desktop).
- Defaults to `false` on TV devices (detected via
`shaka.device.DeviceFactory.getDevice().getDeviceType() ===
DeviceType.TV`).

Some TV browsers (e.g. older Tizen WebKit) misreport
`IntersectionObserver` visibility for transformed/absolute-positioned
player containers, leading to permanently suspended caption rendering.
Disabling suspension on TVs sidesteps the platform bug at the cost of
running one DOM update per `captionsUpdatePeriod` (default 0.25s) while
the player is off-screen.

Externs updated at `externs/shaka/player.js`. When the flag is `false`,
`applyVisibility_` short-circuits to "always visible" so the IO observer
can never suspend rendering.
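
The flag's effect can be summarized in a small sketch. Both function
names and the `'TV'` / `'DESKTOP'` device-type strings are assumptions
for illustration; this is not the displayer's actual code.

```javascript
// Default for the flag: suspension stays on except on TV devices.
function defaultSuspendWhenHidden(deviceType) {
  return deviceType !== 'TV';
}

// With the flag off, the observer's verdict is ignored and the
// container is always treated as visible.
function isEffectivelyVisible(suspendRenderingWhenHidden, isIntersecting) {
  if (!suspendRenderingWhenHidden) {
    return true;
  }
  return isIntersecting;
}

const tvVisible =
    isEffectivelyVisible(defaultSuspendWhenHidden('TV'), false);
const desktopVisible =
    isEffectivelyVisible(defaultSuspendWhenHidden('DESKTOP'), false);
```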

---------

Co-authored-by: Álvaro Velad Galván <ladvan91@hotmail.com>
…a-project#10069)

Co-authored-by: Álvaro Velad Galván <ladvan91@hotmail.com>
…ect#10070)

This PR replaces slice with a simpler truncation, which is especially
good for `merge` because we just truncate from the back. Neither case
needs an allocation anyway, so this is an easy win.
…#10073)

Region elements were cached by an ID that omitted regionAnchorX and
regionAnchorY, causing regions that shared the same viewport anchor but
differed in region anchor to reuse a previously cached DOM element
positioned incorrectly. Include both region anchor values in the
generated ID so each unique anchor combination gets its own element.
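
A sketch of the idea, assuming a hypothetical `regionElementId`
helper; the real ID format is not shown in this message, so the
layout of the key below is invented.

```javascript
// Builds a cache key that includes both the viewport anchor and the
// region anchor, so regions differing in either get separate elements.
function regionElementId(region) {
  return [
    region.id,
    region.viewportAnchorX,
    region.viewportAnchorY,
    region.regionAnchorX,
    region.regionAnchorY,
  ].join('_');
}

// Same viewport anchor, different region anchor.
const idA = regionElementId({
  id: 'r1', viewportAnchorX: 10, viewportAnchorY: 80,
  regionAnchorX: 0, regionAnchorY: 100,
});
const idB = regionElementId({
  id: 'r1', viewportAnchorX: 10, viewportAnchorY: 80,
  regionAnchorX: 50, regionAnchorY: 100,
});
```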

Issue: shaka-project#2583
…-project#10075)

This PR reduces steady-state work during HLS live/DVR playlist refreshes
by not reconstructing segments that are already known. Previously, on a
live playlist update, the parser would still call
createSegmentReference_() for every segment, though only the newly
appended tail was actually merged into the segment index.
…t#10086)

Ensures the GapJumpingController identifies gaps even if the playhead
hasn't technically exited the previous buffered range.
Updates the vendored cml-cmcd (Common Media Library) from upstream
v2.3.0 to v2.3.2, and fixes three unrelated pre-existing regressions
on the branch that were blocking check.py and the CmcdManager unit
test.

Pin update:
  tag    cmcd-v2.3.0 → cmcd-v2.3.2
  commit 22390e35dfbbe1e53d15648d3aace99cdf71f9dd
       → 244ecf05132ae1d1dc9cbd01479fd88f7695dbce

Upstream changes ported:

  - v2.3.1 setInterval leak fix in CmcdReporter (svta/cml#361):
    * New `disarmInterval_` and `disposeEventTarget_` helpers.
    * `start()` disarms any existing timer before arming a new one,
      eliminating the leak from repeated start() calls.
    * `stop()` routes through `disarmInterval_` for symmetry.
    * `sendEventReport_` HTTP 410 branch calls `disposeEventTarget_`
      (replaces local commit 67986bf's inline clear+queue-drain,
      whose queue=[] step is unnecessary once the interval is cleared).
    * `sendEventReport_` param renamed `target` → `config` to match the
      Map key type and the dispose call site.

  - v2.3.2 nor relative-path handling when baseUrl is set (svta/cml#365):
    * `cml_utils.js`: harden `urlToRelativePath` with try/catch so
      already-relative input is returned unchanged; add `getBaseUrl`
      (origin + directory) utility.
    * `cmcd_formatter_map_const.js`: nor formatter now wraps
      `options.baseUrl` with `getBaseUrl(...)` before calling
      urlToRelativePath.
    * `cmcd_reporter.js`: `createRequestReport` passes the full
      `report.url` to `createEncodingOptions_` instead of `url.origin`,
      so the resulting nor paths are sibling-relative to the current
      request URL per spec.

  - v2.3.1 doc-link text update (svta/cml#355) is a no-op for the
    vendored copy: our `@see` blocks omit the human-readable label.
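
The disarm-before-arm pattern from the v2.3.1 fix looks roughly like
the sketch below. `IntervalReporter` and its injectable `timers`
argument are invented for this example; only the `start` / `stop` /
`disarmInterval_` names echo the description above, and the body is
not the vendored CmcdReporter code.

```javascript
class IntervalReporter {
  constructor(intervalMs, onTick, timers) {
    this.intervalMs_ = intervalMs;
    this.onTick_ = onTick;
    // Injectable timer functions (hypothetical, to keep this testable).
    this.timers_ = timers || {
      setInterval: (fn, ms) => setInterval(fn, ms),
      clearInterval: (id) => clearInterval(id),
    };
    this.timerId_ = null;
  }

  start() {
    // Disarm any existing timer first so repeated start() calls
    // replace the timer instead of leaking the previous one.
    this.disarmInterval_();
    this.timerId_ =
        this.timers_.setInterval(this.onTick_, this.intervalMs_);
  }

  stop() {
    this.disarmInterval_();
  }

  disarmInterval_() {
    if (this.timerId_ !== null) {
      this.timers_.clearInterval(this.timerId_);
      this.timerId_ = null;
    }
  }

  isArmed() {
    return this.timerId_ !== null;
  }
}

const reporter = new IntervalReporter(1000, () => {});
reporter.start();
reporter.start();  // replaces the first timer
reporter.stop();
```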

Three unrelated pre-existing regressions folded in:

  - `cmcd_manager.js:519`: 1b0c578 introduced
    `goog.asserts.assert(this.networkingEngine_, ...)` against a field
    that does not exist on the class. Now asserts the local
    `networkingEngine` const, and adds the missing
    `goog.require('goog.asserts')` (compile error
    JSC_MISSING_REQUIRE_IN_PROVIDES_FILE without it).

  - `cmcd_reporter.js:144`: 1fab16c removed the `|| defaultRequester`
    fallback but left the constructor's `requester` param typed
    optional `(...)=`, producing JSC_TYPE_MISMATCH when assigning to a
    non-optional field. Param is now required; shaka's adapter always
    supplies a requester via NetworkingEngine. Documented in SUMMARY.

  - `test/util/cmcd_manager_unit.js:930`: 1b0c578 corrected the
    RequestType from TIMING (4) to CMCD (9) in production but did not
    update the unit test, which kept asserting TIMING.

SUMMARY.txt: pin updated; `getBaseUrl` documented in cml-utils shim
listing; stale `defaultRequester` / `fetch` whitelist reference
removed (defaultRequester was removed in 1fab16c).
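
For orientation, a simplified sketch of the "origin + directory" idea
and the hardened relativization. `getBaseUrl` is named in this commit
but the body below is a guess, and `toSiblingRelative` is a
hypothetical, reduced stand-in for urlToRelativePath that only handles
targets inside the request's own directory.

```javascript
// Returns origin + directory of a URL, i.e. the folder it lives in.
function getBaseUrl(url) {
  const parsed = new URL(url);
  const dir =
      parsed.pathname.slice(0, parsed.pathname.lastIndexOf('/') + 1);
  return parsed.origin + dir;
}

// Relativizes target against the directory of requestUrl. Input that
// is already relative makes `new URL` throw and is returned unchanged.
function toSiblingRelative(target, requestUrl) {
  try {
    const absolute = new URL(target).href;
    const base = getBaseUrl(requestUrl);
    return absolute.startsWith(base) ?
        absolute.slice(base.length) : absolute;
  } catch (e) {
    return target;
  }
}

const requestUrl = 'https://test.com/video/seg1.m4v';
const sibling =
    toSiblingRelative('https://test.com/video/seg2.m4v', requestUrl);
const alreadyRelative = toSiblingRelative('seg2.m4v', requestUrl);
```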

Verification:
  python3 build/check.py --force      → exit 0
  python3 build/test.py --quick       → 2963 SUCCESS, 0 failed
                                        (CmcdManager: 63/63 SUCCESS)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>