Add encryption support#84
Conversation
| - `pairing` - server is performing a [pairing](#pairing) handshake | ||
| - `playback` - server needs client for active or upcoming playback | ||
| - `management` - server is opening a dedicated [management](#management) session - see that section for preconditions and constraints | ||
| - `active_roles?`: string[] - versioned roles that are active for this client (e.g., `player@v1`, `controller@v1`). Required when `connection_reason` is `'playback'`; absent otherwise. |
There was a problem hiding this comment.
Before this PR a discovery connection was a full connection. User presses play on the speaker and music just starts. The connection_reason field was only there to not kick out other servers that may already play music.
Now discovery has no active_roles so the server has to reconnect with 'playback'. That means a new Noise handshake and the clock-sync filter resets. Playback starts delayed and out of sync until it converges again.
Also L441 says active_roles is absent on discovery but L459 says servers always activate the preferred role.
There was a problem hiding this comment.
Honestly it was a little weird to begin with that you could use connection reason discovery to do normal playback.
I had here at first a proper multi-connection handling, but it added a little bit of complexity. Consider for example that a dynamic PIN needs a display/speaker and so cannot be active at the same time as playback is, two management connections at the same time are likely a bad idea too, etc.
So to keep it simple I just limited it to a single connection we had before with the idea that we can extend it in a follow-up.
But if we want, we can try tackling it here already.
Although maybe a simpler alternative for now would be to allow clients to dynamically change their connection_reason.
There was a problem hiding this comment.
It's called connection_reason since it's the reason why the server initially connected.
Could we just allow streams and management commands if the server has the required trust level?
That way the field stays a priority signal without becoming a hard gate.
There was a problem hiding this comment.
Is there a security risk to just always include the active_roles list? Or at least include it for a discovery message so the existing behavior is preserved?
| - `storage_exhausted` (client) - client cannot persist a new pairing record and has no fallback policy | ||
| - `user_cancelled` (client) - operator aborted the pairing through a local UI | ||
|
|
||
| ## Management |
There was a problem hiding this comment.
The management section adds 13 commands plus result handling and persistence. The rest of the spec is optimized to keep client implementations as minimal as possible (requires the server to support everything, opt-in roles, player chooses what audio format it can support).
Only set-name is something a normal user would ever touch. Most of the rest are admin/installer use cases I'm struggling to picture an end-user reaching for. I mean things like list-records, add-record, remove-record, list-pairing-psks are really advanced for an average users.
There was a problem hiding this comment.
Users wouldn't necessarily be exposed to those raw commands directly, they'd be used by servers in the background. There are a number of use cases, for example:
- Disabling PINs (Matter does this automatically after first successful pairing, we could do the same).
- Unpairing servers that are no longer in use.
- Sharing a client with another server (e.g. by adding a Pairing PSK to the client and passing it along).
The general idea is to keep the protocol simple and flexible, providing straightforward mechanisms and leaving policy decisions like the above to the servers.
There was a problem hiding this comment.
Adding 13 commands is anything but keeping the protocol simple.
The 4 commands for just the Pairing PSK need to be implemented by every client. If we want to keep the generate-PSK workflow, I think even a single management/refresh-pairing-token should be enough. What's the use case of having multiple long-lived pairing tokens and being able to revoke them individually?
Same goes for the records family of commands. Do we need all 4 of them? management/add-record is redundant with the existing pairing mechanisms.
We could maybe even reduce it to one or two commands (a management/release-ownership and maybe a management/unpair-other-servers), but that looks a bit less flexible to me.
Do we need management/clear-pin-lockout? Would a user know to search for this in their main server's settings? I for one wouldn't look there for such an option, maybe it's best to just drop it then? Restarting the speaker or using PSKs is still an option to get the speaker paired.
There was a problem hiding this comment.
The number management messages is my biggest hang up at the moment, this is a significant amount of new messages/complexity to the protocol. I think any reduction we can do here is very beneficial.
|
|
||
| ### Unpaired Playback | ||
|
|
||
| A client MAY admit `'playback'` connections on the Sentinel PSK from servers with no pairing record. The session's [trust level](#definitions) is `'none'`, so [management](#management) operations remain unavailable. The default is the manufacturer's choice; clients that support the [`controller`](#controller-messages) role SHOULD default to disabled. The toggle is exposed at runtime via [`management/set-pairing-config`](#server--client-managementset-pairing-config), and the client's current setting is advertised in [`client/hello`](#client--server-clienthello) as `unpaired_playback.enabled`. Servers must likewise allow their operator to enable or disable initiating unpaired playback; the server's setting is not exchanged on the wire. |
There was a problem hiding this comment.
clients that support the
controllerrole SHOULD default to disabled.
I imagine most speakers/players will also have a play/pause button.
So when going with the recommendation of this spec, you still need to pair almost all speakers before the first use.
To solve this we could restrict some roles to paired sessions only. Right now that would just be the controller@v1 role, but the same gating would also apply to some future roles like source@v1.
We have a mechanism to allow only some roles to be active. We just need to adjust this assumption to cover this case as well:
Note: Servers will always activate the client's preferred version of each role. Checking
active_rolesis only necessary to detect outdated servers or confirm activation of application-specific roles.
| - Prefer the server matching the stored last played server | ||
| - If neither matches (or no history), keep the existing server | ||
| - An in-flight **pairing** is not displaced by another incoming **pairing**. | ||
| - An incoming **discovery**, when an existing **discovery** is held, is admitted iff its `server_id` matches the last-played server (and the existing one's does not); otherwise the existing is kept. |
There was a problem hiding this comment.
Is "iff" a typo or are you using the short hand for "if and only if"? If the shorthand, write it out.
| - `pairing` - server is performing a [pairing](#pairing) handshake | ||
| - `playback` - server needs client for active or upcoming playback | ||
| - `management` - server is opening a dedicated [management](#management) session - see that section for preconditions and constraints | ||
| - `active_roles?`: string[] - versioned roles that are active for this client (e.g., `player@v1`, `controller@v1`). Required when `connection_reason` is `'playback'`; absent otherwise. |
There was a problem hiding this comment.
Is there a security risk to just always include the active_roles list? Or at least include it for a discovery message so the existing behavior is preserved?
|
|
||
| Change the client's friendly `name` (the value reported in subsequent [`client/hello`](#client--server-clienthello) connections and via mDNS). | ||
|
|
||
| - `name`: string - new friendly name; non-empty, length cap 128 UTF-8 bytes |
There was a problem hiding this comment.
We have a whole shared key setup for devices with low flash storage size, but now we will require 128 bytes dedicated just to the name?
| - `storage_exhausted` (client) - client cannot persist a new pairing record and has no fallback policy | ||
| - `user_cancelled` (client) - operator aborted the pairing through a local UI | ||
|
|
||
| ## Management |
There was a problem hiding this comment.
The number management messages is my biggest hang up at the moment, this is a significant amount of new messages/complexity to the protocol. I think any reduction we can do here is very beneficial.
Add encryption support