feat: Add opt-in CAT token authorization#264
Conversation
|
Docker build validation completed after refreshing the CI-style bookworm moxygen bundle:
Result: Docker image built successfully as |
|
Adding some unwritten documentation so I can understand the flow -- other reviewers may find helpful: Startup: YAML → Runtime ObjectsConnection Time: CLIENT_SETUP → Session AuthPer-Request: Subscribe / Publish / Fetch / etc.Session Teardown |
Auth Module Review:
|
Auth Config ReviewPattern ConsistencyThe structure follows existing conventions well. One inconsistency: every optional field in afrind: should we also support hmac key files? This can be done as a follow up unless its trivial here since there's already a lot going on. CriticalZero test coverage for all auth config validation and resolution
Important
Description strings for auth booleans don't document their defaults |
Auth Wiring Review: Relay + Context + ServerWhat changedServer layer ( Context layer (
Relay layer (
Correctness
Simplicity / Moxygen Follow-upThe Recommendation: Add a
The 6 Test Coverage
|
suhasHere
left a comment
There was a problem hiding this comment.
Thanks for the work on this @mondain . here are few high level thoughts. Can we split this PR into 3 parts
- Auth Config
- Abstract Auth API
- CAT Auth bits.
On the last one, was wondering if you got chance to checkout this repo : https://github.com/Quicr/catapult
It implements cat4moq and dpop with all the crypto bits . I would encourage us to use this library instead of reinventing the wheel , if possible.
@suhasHere made 1 comment.
Reviewable status: 0 of 17 files reviewed, all discussions resolved (waiting on afrind, akash-a-n, gmarzot, michalhosna, Oxyd, peterchave, and TimEvens).
mondain
left a comment
There was a problem hiding this comment.
@suhasHere I'll do a deeper dive into catapult, I'm all for not reinventing the wheel and on the surface it looks good to me.
@mondain made 1 comment.
Reviewable status: 0 of 17 files reviewed, all discussions resolved (waiting on afrind, akash-a-n, gmarzot, michalhosna, Oxyd, peterchave, and TimEvens).
|
Docker focused validation passed:
The Docker test configure needed the same gflags workaround used by the local build script: -DCMAKE_FIND_LIBRARY_SUFFIXES=".so;.a"
-DGFLAGS_SHARED=ON |
I'm creating a separate PR off this work to provide the route for Catapult; separating the original PR into n new PR does not excite me for several reasons, but I'll go with the group, if others think its prudent. |
afrind
left a comment
There was a problem hiding this comment.
Ok, many small comments I'm sure claude can handle. I don't think this really needs a further split. It is a big review but I made it through. @suhasHere are you ok with catapult as a follow up?
@afrind reviewed 18 files and all commit messages, and made 18 comments.
Reviewable status: all files reviewed, 13 unresolved discussions (waiting on akash-a-n, gmarzot, michalhosna, mondain, Oxyd, peterchave, and TimEvens).
deps/moxygen line 1 at r3 (raw file):
Subproject commit 36c314edb9b42345fd4ad918b463743edad21a87
What moxygen change is required to support this?
CMakeLists.txt line 38 at r3 (raw file):
find_package(moxygen REQUIRED) find_package(OpenSSL REQUIRED)
Everyone cool with OpenSSL dep here? @suhasHere
config.example.yaml line 60 at r3 (raw file):
# auth: # Optional: enable CAT-style authorization for this service # enabled: true # token_type: 0 # MOQT AUTHORIZATION_TOKEN token type
token_type 0 is reserved to mean "nothing" in the spec? So I think we cannot specify 0 (config loader should error)
config.example.yaml line 64 at r3 (raw file):
# - id: "key-1" # secret: "replace-with-secret" # require_setup_token: true
Do each of these need descriptive comments? I'm not sure what the convention is
src/MoqxRelay.cpp line 133 at r3 (raw file):
} auto session = MoQSession::getRequestSession();
Can we pass the session in? Seems like it's already known/needed elsewhere
src/MoqxRelay.cpp line 134 at r3 (raw file):
auto session = MoQSession::getRequestSession(); auto token = authVerifier_.allowRequestTokenOverride()
If override is not on and the request does contain a new token, should we log/error? Or it's expected behavior?
src/MoqxRelay.cpp line 142 at r3 (raw file):
return folly::makeUnexpected(grants.error()); } if (auth::allows(grants.value(), action, ns, trackName)) {
Seems like we could have a single auth::allows call by consolidating this bit with below. eg:
if (token) {
grants = verify();
} else {
grants = sessionAuth_.find();
}
if (!auth::allows(...)) {
}
src/MoqxRelay.cpp line 257 at r3 (raw file):
auto authRes = authorize(auth::Action::PublishNamespace, pubNs.params, pubNs.trackNamespace); if (authRes.hasError()) { co_return folly::makeUnexpected(PublishNamespaceError{
I think we should log in all these branches - maybe DBG1?
src/auth/Auth.h line 72 at r3 (raw file):
folly::Expected<Grants, AuthError> verify(const moxygen::AuthToken& token) const; // Internal v1 envelope used by tests and by local token issuers:
Hm, maybe this could be in a subclass? Also "local token issuers"?
src/auth/Auth.h line 87 at r3 (raw file):
const Grants& grants, Action action, const moxygen::TrackNamespace& ns,
Maybe split into two signatures, TrackNamespace and FullTrackName, that both call allowsImpl with this signature?
src/auth/Auth.cpp line 93 at r3 (raw file):
uint32_t readU32BE(std::string_view data, size_t offset) { return (static_cast<uint32_t>(static_cast<unsigned char>(data[offset])) << 24) |
is this just an implement of ntohl or folly::endian::big/little?
src/auth/Auth.cpp line 99 at r3 (raw file):
} void appendU32BE(std::string& out, uint32_t value) {
Same you can do this with a helper to flop your integer and then memcpy, right?
src/auth/Auth.cpp line 137 at r3 (raw file):
} class CborReader {
Minor preference to move CborReader to its own file in case we ever need it for something else.
src/auth/Auth.cpp line 179 at r3 (raw file):
uint8_t major = 0; uint64_t len = 0; if (!readType(major, len) || (major != 2 && major != 3) || len > data_.size() - pos_) {
Can we add symbolic constants throughout for these magic numbers?
src/auth/Auth.cpp line 315 at r3 (raw file):
} bool parseScope(CborReader& reader, Scope& scope) {
A comment describing the format would make it easier to validate the code is correct for the non-cat experts
src/auth/Auth.cpp line 473 at r3 (raw file):
const auto nsBytes = canonicalNamespace(ns); const auto trackBytes = trackName.value_or(std::string_view()); for (const auto& scope : grants.scopes) {
Can this be a potential performance concern? Could someone supply a token with a large number of scopes? I suppose it's bound by the MOQT 64k control message len limit, but would a hash lookup be better?
test/config/ConfigResolverTest.cpp line 467 at r3 (raw file):
cfg.services.value().clear(); auto auth = makeAuthConfig(); auth.token_type = std::optional<uint64_t>{uint64_t{1} << 62};
How will we remember to adjust this when we go to MOQT varints?
The MoqxRelay constructor gained a config::AuthConfig parameter in position
3 (before maxDeselected). MoqxTrackFilterTest still used the old argument
order, so the uint64_t landed in the AuthConfig slot and failed to compile,
breaking every linux/macos/asan/conformance CI job. Pass config::AuthConfig{}
explicitly at all four make_shared<MoqxRelay>() call sites.
Extract the internal CBOR decoder from Auth.cpp into a self-contained header-only src/auth/CborReader.h (namespace openmoq::moqx::auth). Auth.cpp now includes it; the claim-parsing helpers are unchanged. No build wiring needed since moqx_core picks up headers via the src/ include dir.
mondain
left a comment
There was a problem hiding this comment.
@mondain made 1 comment and resolved 1 discussion.
Reviewable status: 5 of 20 files reviewed, 12 unresolved discussions (waiting on afrind, akash-a-n, gmarzot, michalhosna, Oxyd, peterchave, and TimEvens).
- auth: split allows() into TrackNamespace and FullTrackName overloads over a shared allowsImpl; update relay and tests accordingly - relay: consolidate authorize() to resolve grants from one source then a single allows() check; pass session in instead of getRequestSession(); log at DBG1 on auth failures and when ignoring a request token override - auth: use folly::Endian for the u32be claims-length helpers - auth: name CBOR major-type / additional-info constants in CborReader - auth: document the CAT "moqt" claim scope CBOR layout; clarify the signForTest comment (test/tooling helper, relay never issues tokens) - auth: reject duplicate well-known claim keys (exp/moqt/moqt-reval) - config: name the QUIC varint upper bound; clarify token_type 0 is a valid out-of-band type; add descriptive comments to auth keys in config.example - test: restore CborReaderTest.cpp and add duplicate-claim-key coverage
|
[AI] Done in f34554c. There are now two public overloads -- |
|
[AI] Done in f34554c. |
|
[AI] Expected behavior, now logged. When |
|
[AI] Done in f34554c. |
|
[AI] Done in f34554c, centralized rather than per-branch: |
|
[AI] Effectively yes -- replaced with folly in f34554c. |
|
[AI] Yes -- done in f34554c. |
|
[AI] Done in f34554c. Added a |
|
[AI] Done in f34554c -- added a block comment above |
|
[AI] Added inline descriptive comments for each auth key in |
|
[AI] Checked the spec on this. Per draft-ietf-moq-transport (AUTHORIZATION_TOKEN "Token Type" definition), Type 0 is "reserved to indicate that the type is not defined in the [IANA] table and is negotiated out-of-band between client and receiver." So 0 is a valid, usable value meaning a private/out-of-band token type -- not "nothing." Since our CAT-style token is exactly such an out-of-band scheme, 0 is a legitimate choice, so the loader accepts it as-is (an omitted token_type also resolves to 0, so the two behave identically). |
|
[AI] "local token issuers" was misleading -- the relay only ever verifies tokens, it never issues them. |
|
[AI] I think linear scan is fine here. Scope count is bounded by the 64k control-message limit you noted (and realistically a token carries a handful of scopes). A hash lookup doesn't really fit the model, though: scope entries aren't exact keys -- they're prefix/suffix/contains/exact match rules evaluated against the canonicalized namespace and track bytes, so there's no single key to hash on; we'd still scan candidate scopes. If this ever shows up hot, the better lever would be indexing scopes by Action first (skip scopes that don't grant the requested action before running the match rules). Left as-is for now. |
|
[AI] The 2^62 bound isn't a moqx policy that will drift -- it's the maximum value encodable by a QUIC variable-length integer (RFC 9000 Section 16), which MOQT inherits and which is fixed by the wire format. So there's nothing to remember to bump as the drafts evolve. To make that explicit, f34554c names the constant |
|
[AI] For context: OpenSSL is used here for HMAC-SHA256 token verification (EVP_MAC on OpenSSL 3.x, HMAC otherwise). It's already pulled in transitively by the moxygen -> proxygen -> fizz/folly stack, so this |
|
[AI] To add to the earlier note: this feature doesn't require any moxygen-side change -- it builds entirely on moxygen's existing AUTHORIZATION_TOKEN support ( |
|
[AI] @afrind the r3 inline comments are addressed in f34554c (on top of d2b9e59). Summary:
Replied inline on the discussion-only threads (OpenSSL dependency, token_type 0, the subclass question, scope-scan performance, the varint bound, and the moxygen submodule). The catapult-vs-in-tree and PR-split questions I left for @mondain / @suhasHere. |
TimEvens
left a comment
There was a problem hiding this comment.
Good starting point, but some comments/nits.
@TimEvens reviewed 21 files and all commit messages, and made 5 comments.
Reviewable status: all files reviewed, 15 unresolved discussions (waiting on afrind, akash-a-n, gmarzot, michalhosna, mondain, Oxyd, and peterchave).
CMakeLists.txt line 38 at r3 (raw file):
Previously, mondain (Paul Gregoire) wrote…
[AI] For context: OpenSSL is used here for HMAC-SHA256 token verification (EVP_MAC on OpenSSL 3.x, HMAC otherwise). It's already pulled in transitively by the moxygen -> proxygen -> fizz/folly stack, so this
find_package(OpenSSL)links what's already present rather than adding a new third-party dependency. HMAC needs some crypto provider, and OpenSSL is the standard one in this ecosystem. @suhasHere if you'd prefer a different provider we can revisit, but it'd likely be the same underlying lib. No code change here pending your call.
Should be fine for now, but we'll likely override the find to support boring, and others, which expose OpenSSL apis.
src/MoqxRelay.h line 313 at r6 (raw file):
const moxygen::TrackNamespace& ns, const std::shared_ptr<moxygen::MoQSession>& session, std::optional<std::string_view> trackName = std::nullopt
Name isn't always a string. Any reason not to use span const ?
src/MoqxRelay.cpp line 659 at r6 (raw file):
// receives namespace announcements as publishers connect. } if (incomingPeerID.empty()) {
Not super great that the detection if the subNs is via a peer or not is determined by getPeerRelayID().
src/auth/Auth.h line 23 at r6 (raw file):
namespace openmoq::moqx::auth { enum class Action : uint64_t {
Is it really an action? Seems more like Type or Scope...
auto session was declared twice in the same scope (lines 365 and 390), causing a conflicting declaration compiler error on all CI platforms.
CborReader stores a string_view, so passing a temporary std::string (e.g. uint64Cbor(v)) causes a dangling reference after the constructor returns. Store each temporary in a named variable so it outlives the CborReader. Manifests as test failures on Linux/ASAN; macOS happens to pass due to stack not being reused before the read.
afrind
left a comment
There was a problem hiding this comment.
@afrind reviewed 5 files, made 7 comments, and resolved 12 discussions.
Reviewable status: 13 of 21 files reviewed, 4 unresolved discussions (waiting on akash-a-n, gmarzot, michalhosna, mondain, Oxyd, peterchave, and TimEvens).
config.example.yaml line 60 at r3 (raw file):
Previously, mondain (Paul Gregoire) wrote…
[AI] Checked the spec on this. Per draft-ietf-moq-transport (AUTHORIZATION_TOKEN "Token Type" definition), Type 0 is "reserved to indicate that the type is not defined in the [IANA] table and is negotiated out-of-band between client and receiver." So 0 is a valid, usable value meaning a private/out-of-band token type -- not "nothing." Since our CAT-style token is exactly such an out-of-band scheme, 0 is a legitimate choice, so the loader accepts it as-is (an omitted token_type also resolves to 0, so the two behave identically).
validateAuthstill rejects values >= 2^62 (un-encodable as a varint). I clarified this in the example-config comment and invalidateAuthin f34554c. If you'd rather the example not imply 0 is special, happy to switch it to a non-zero placeholder.
I think 0 is wrong. Cat4moq should have an explicit type. moqx should only allow the types in its config that it is configured to validate.
That said, we can land this and follow up.
src/MoqxRelay.cpp line 134 at r3 (raw file):
Previously, mondain (Paul Gregoire) wrote…
[AI] Expected behavior, now logged. When
allow_request_token_overrideis false the grants established at session setup govern, and a per-request token is intentionally ignored rather than rejected (a client re-sending its token shouldn't fail the request). In f34554c we now log that at DBG1 for visibility instead of silently dropping it.
DBG seems ok
src/MoqxRelay.cpp line 659 at r6 (raw file):
Previously, TimEvens (Tim Evens) wrote…
Not super great that the detection if the subNs is via a peer or not is determined by
getPeerRelayID().
@TimEvens let's file a follow up issue -- though I believe that we need some way to indicate that peer relays are trusted (eg do not need to present tokens for auth to verify) or we need to configure the tokens the relay should use for peering request explicitly in config.
src/auth/Auth.h line 72 at r3 (raw file):
Previously, mondain (Paul Gregoire) wrote…
[AI] "local token issuers" was misleading -- the relay only ever verifies tokens, it never issues them.
signForTestis purely a test/tooling helper for building a v1 envelope. I reworded the comment in f34554c to say exactly that. On the subclass question: I'd lean against it for now -- there's a single envelope version, andverify()/signForTestshare that one layout, so a subclass would add a type without a second implementation to justify it. If we grow a real (non-test) token issuer later, extracting an interface at that point would be the natural time.
I'm on record as saying I don't love "test only" APIs, but I'll allow it
src/auth/Auth.cpp line 473 at r3 (raw file):
Ok, I'll note our top level requirements doc principles say:
- Performance is Paramount: Every design decision must prioritize efficient processing, high throughput.
But the token was already signed by the issuer, so it maybe we can handwave past this
test/config/ConfigResolverTest.cpp line 467 at r3 (raw file):
Previously, mondain (Paul Gregoire) wrote…
[AI] The 2^62 bound isn't a moqx policy that will drift -- it's the maximum value encodable by a QUIC variable-length integer (RFC 9000 Section 16), which MOQT inherits and which is fixed by the wire format. So there's nothing to remember to bump as the drafts evolve. To make that explicit, f34554c names the constant
kQuicVarintExclusiveUpperBound(with a comment) invalidateAuth; the test deliberately uses1 << 62as the first un-encodable value.
The AI comment is wrong.
MOQT has shipped 64 bit varints in drafts 17 and 18. maybe we should delete the whole test, since it's not possible anymore?
deps/moxygen line 1 at r3 (raw file):
Previously, mondain (Paul Gregoire) wrote…
[AI] To add to the earlier note: this feature doesn't require any moxygen-side change -- it builds entirely on moxygen's existing AUTHORIZATION_TOKEN support (
moxygen::AuthToken,Parameters,TrackRequestParamKey::AUTHORIZATION_TOKEN). The submodule bump in this PR is just a routine sync toopenmoq/moxygenmain, not a change coupled to the auth work. If the concern is keeping the review clean, I can split the submodule bump into its own commit/PR so it isn't entangled with the auth changes.
Wait, I don't think we should hard code any bumps here.
afrind
left a comment
There was a problem hiding this comment.
@afrind reviewed 14 files and all commit messages, made 1 comment, and resolved 1 discussion.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on akash-a-n, gmarzot, michalhosna, mondain, Oxyd, peterchave, and TimEvens).
deps/moxygen line 1 at r3 (raw file):
Previously, afrind wrote…
Wait, I don't think we should hard code any bumps here.
nvm, reviewable artifact
Summary
Adds opt-in CAT-style token authentication and authorization for MOQT relay services.
Changes
Notes
This PR uses the existing feature/auth branch contents as requested, including branch-local commits outside the auth implementation commit.
Validation
This change is