feat: WiFi scale support — WebSocket control + events#54
Conversation
Extend the /snapshot WebSocket beyond a one-way weight stream into a
full control + event protocol so web/app clients reach feature parity
with BLE:
* Event broadcasts: button presses (sendWebsocketButton) and
power-off reasons (sendWebsocketPowerOff) wired into the same call
sites as their BLE/USB equivalents.
* Status frames: sendWebsocketStatus / sendWebsocketStatusAll publish
battery %, voltage, charging, timer, display, soft-sleep, LED, and
rate state. Periodic broadcast every WEBSOCKET_STATUS_NOTIFY_INTERVAL_MS.
* Commands accepted as text or JSON: status, events on/off, tare,
timer start/stop/zero, led <r,g,b>|off, display on/off,
low_power on/off, sleep on/off, power off.
* Selectable weight rate: 2 Hz / 5 Hz / 10 Hz, via "rate <hz>",
"interval <ms>", or JSON {rate_hz|hz|rate|interval_ms|...}.
Falls back to default on client disconnect.
* README: document the protocol (events, status, commands, rates).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The /snapshot WebSocket handler runs on the AsyncTCP task. Several
commands (display, low_power, sleep, power_off) were calling
u8g2.setPowerSave / u8g2.setContrast / digitalWrite(PWR_CTRL,...)
directly from that task while the main loop hammered the same
peripherals — under feature-test load at 10Hz this caused the device
to stall for ~4s and HTTP responses to time out mid-body.
* Add a portMUX-protected pending-action mask. The WS callback
queues the hardware op and updates the matching state flag
(b_u8g2Sleep / b_softSleep) synchronously so the status frame is
accurate; the main loop drains the mask early in loop() (before
the b_softSleep guard so SLEEP_OFF can wake the device).
* StopWatch.* is just bool/uint32 writes, so timer ops stay
synchronous in the callback.
* webserver.h: register handlers only once across stop/start
cycles, drop the duplicate addHandler(&websocket), and move
server.begin() to the end of startWebServer() so handlers are
in place before the server accepts requests.
* WS DATA callback: build the String with the (buf,len)
constructor instead of O(N²) char-by-char concatenation, and
only process complete unfragmented text frames.
* Use unsigned long for the WS broadcast lastUpdate to avoid
signed-subtraction wrap at the millis() 49.7-day rollover.
* Document why WiFi.setSleep(false) is the wrong call here:
tested it, caused 8% loss and HTTP all-timeout when BLE is
active because BT/WiFi share the 2.4GHz radio and the coex
layer needs WiFi sleep windows for BT slots.
Stress test (10Hz WS + HTTP every 1s + 5 cmd/s feature chaos, 120s):
pre-fix post-fix
max gap 4.18 s 0.90 s
HTTP fail 1 0
HTTP slow 2 0
Feature regression (60 WS commands) still 60/60 PASS.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Code review on PR decentespresso#54 caught two real bugs in the deferred-action fix: 1. The pending-mask let opposing actions accumulate. If a client sent "display off" then "display on" before the main loop drained, both bits were set and processWsPendingCmds ran them in source order — final hardware ended up off while b_u8g2Sleep reported on. Add a wsReplacePending(set, clear) helper that atomically supersedes the opposing bit, and use it for the three mutually-exclusive pairs (display, low_power, sleep). 2. The new WS state globals (weightWebsocketNotifyInterval, b_websocketEventsEnabled, b_websocketLowPower/LedEnabled, i_websocketLedR/G/B, t_lastWebsocketStatusUpdate) and the pre- existing b_softSleep / b_u8g2Sleep / b_powerOff that are now written from the AsyncTCP task are missing `volatile`. On dual- core ESP32-S3 the compiler may cache stale values in a register on the loop core across the WS gate check. Mark them volatile. Also fold in smaller doc/format cleanups from the same review: * Cast stopWatch.elapsed() (uint32_t) to (unsigned long) for the %lu format spec in sendWebsocketStatus / sendWebsocketStatusAll. Works today on ESP32 (both 32-bit) but technically a type mismatch. * Correct the wsQueuePending doc-comment (StopWatch is safe to mutate inline; only u8g2 and power-rail GPIOs need deferring). * Correct the processWsPendingCmds comment to note that b_powerOff is set here on purpose. * README: document the JSON rate forms accepted by the parser that weren't listed (rate_hz, hz, interval_ms). Re-verified: * 60/60 WS feature regression PASS * 120s freeze stress: max weight gap 0.60s (was 4.18s pre-fix), 0 HTTP failures, 0 slow, 590 commands sent Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Code reviewFound 1 issue, fixed in 74d7f0f:
5 below-threshold issues (50-74), all addressed in 74d7f0f
Verification after these fixes: 60/60 WS feature regression PASS, 120s freeze stress max weight gap 0.60s (4.18s pre-fix), 0 HTTP failures. False positives discarded: an Generated with Claude Code |
7c533a2 to
74d7f0f
Compare
The previous middleware returned 503 to any client past the first,
which meant opening the on-device web UI prevented a phone/desktop
app from connecting to the same scale. Drop the cap and let the
library handle multi-client.
* include/webserver.h: remove the count-checking middleware so any
handshake within the library's DEFAULT_MAX_WS_CLIENTS limit
succeeds.
* src/hds.ino: call websocket.cleanupClients() with no argument so
we inherit the library default (DEFAULT_MAX_WS_CLIENTS, which is
8 on ESP32). Plenty of RAM headroom — ~12-14 KB per client on
top of ~240 KB free heap.
* src/hds.ino WS_EVT_DISCONNECT: only reset the shared session
state (weightWebsocketNotifyInterval, b_websocketEventsEnabled,
t_lastWebsocketStatusUpdate) when the LAST client leaves
(server->count() == 0). Without this guard, one client
disconnecting would wipe rate/events state for any other client
still connected.
Verified with /tmp/ws_multi.py (5/5):
* Two simultaneous handshakes both succeed (was: 2nd got 503).
* Both clients receive the weight stream at the negotiated rate
(2Hz default, 10Hz after `rate 10`).
* Both clients see the periodic status broadcast after `events on`.
* Disconnecting client A leaves client B's stream + state intact.
Single-client feature regression still 60/60 PASS.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
A bare hds.local A record isn't reliably resolvable on Android (the
OS resolver doesn't do mDNS and a direct app-socket query gets the
unicast reply dropped). A DNS-SD service is discoverable through
Android's NsdManager, Bonjour, and Avahi.
Register `_decentscale._tcp` on port 80 (the HTTP/WS port) alongside
the existing hostname, with a friendly instance name and TXT records
so a discovery UI can label the device and connect without guessing:
instance: "Half Decent Scale"
port: 80
TXT: fw=FW: 3.0.9 model=hds proto=ws path=/snapshot
Uses the Arduino ESPmDNS wrapper (same ESP-IDF mdns component as the
raw mdns_service_add API). config.h is included for FIRMWARE_VER;
cast to const char* to disambiguate the addServiceTxt overloads.
Verified on device:
dns-sd -B _decentscale._tcp -> finds "Half Decent Scale"
dns-sd -L "Half Decent Scale" ... -> hds.local.:80
path=/snapshot proto=ws
model=hds fw=FW: 3.0.9
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
tadelv
left a comment
There was a problem hiding this comment.
I'd like ask you to move the websocket code into a file that is not hds.ino. Some separation of concerns would be nice to have.
|
|
||
| WiFi.begin(params.getSSID(), params.getPass()); | ||
| WiFi.setTxPower(WIFI_POWER_18_5dBm); | ||
| // Leaving WiFi modem-sleep at the Arduino default (WIFI_PS_MIN_MODEM): |
There was a problem hiding this comment.
this comment here seems redundant?
| if (command == "rate" && doc["value"].is<const char *>()) { | ||
| return setWebsocketRateFromValue(client, doc["value"].as<String>()); | ||
| } | ||
| if (command == "led" && |
There was a problem hiding this comment.
LedRGB seems to have snuck in somehow, since it doesn't really do anything, could you remove all the references and code related to it?
| if (websocket.availableForWriteAll() > 0) { | ||
| if (!b_adc_recovery_active) { | ||
| websocket.printfAll("{ \"grams\": %.2f, \"ms\": %lu }", f_displayedValue, current); | ||
| websocket.printfAll("{\"grams\":%.2f,\"ms\":%lu}", f_displayedValue, current); |
There was a problem hiding this comment.
availableForWriteAll() returns the minimum writable across all clients, so this all-or-nothing gate couples every client to the slowest one: if any single client's TX buffer is full (slow/dead socket), the weight broadcast — and the periodic status below — is skipped for everyone. That defeats the PR's multi-client independence goal. Consider iterating clients and sending per-client (check each client->canSend()) instead of gating the shared printfAll.
| String msg((const char *)data, len); | ||
| Serial.print("Websocket recv: "); | ||
| Serial.println(msg); | ||
| handleWebsocketRateCommand(client, msg); |
There was a problem hiding this comment.
Trust-model note: pre-PR this socket was read-only weight + tare. It now accepts power off, soft_sleep on, display off, led, timer, etc. with no authentication. Anyone on the LAN can power the scale off (and power off can't be woken remotely) or sleep it — easy griefing/DoS. If a trusted-LAN assumption is intended, please state it explicitly; otherwise consider gating the destructive commands.
| } | ||
|
|
||
| if (command == "timer") { | ||
| // StopWatch is just bool/uint32 state — safe to mutate from the AsyncTCP |
There was a problem hiding this comment.
"safe to mutate from the AsyncTCP task" isn't quite right. StopWatch is multi-field (running flag + start timestamp + accumulator), and it's also mutated from the main loop (hds.ino shot logic), BLE (ble.h), and USB (usbcomm.h). Concurrent start/stop/reset here plus elapsed()/isRunning() reads in the status frame can tear or read mid-update across cores. requestRemoteTare() already uses a portMUX for exactly this reason — the timer path warrants the same guarding rather than an assumption that bool/uint32 writes are atomic-as-a-group.
| return true; | ||
| } | ||
|
|
||
| if (command == "low_power") { |
There was a problem hiding this comment.
low_power on defers setContrast(0), but the button-press path sets setContrast(255) (hds.ino:94). After any button press the contrast reverts to max while b_websocketLowPowerEnabled stays true, so the status frame reports low_power:true that no longer matches the hardware.
| Serial.println("Websocket soft sleep on detected."); | ||
| // Set state flags synchronously so status reflects the requested mode. | ||
| // u8g2 + GPIO power-rail writes are deferred to the main loop. | ||
| b_softSleep = true; |
There was a problem hiding this comment.
soft_sleep on calls setPowerSave(1) (display off) but never sets b_u8g2Sleep = true. Since the status frame derives display_on from b_u8g2Sleep, it reports display_on:true while the OLED is physically off during soft sleep. soft_sleep off does set b_u8g2Sleep = false, so the pair is asymmetric — set b_u8g2Sleep = true here too.
Summary
Brings the on-device
/snapshotWebSocket up to feature parity with BLE so the web UI and a separate app (phone/desktop) can both drive the scale.New protocol (c0eb374)
{"grams","ms"}weight stream extended into a full control + event protocol.sendWebsocketButton) and power-off reasons (sendWebsocketPowerOff), wired into the same call sites as the BLE/USB equivalents.sendWebsocketStatus/sendWebsocketStatusAll) publishing battery %, voltage, charging, timer, display, soft-sleep, LED, and rate state — broadcast periodically every 5 s afterevents on, and on demand.status,events on/off,tare,timer start/stop/zero,led <r,g,b>|off,display on/off,low_power on/off,sleep on/off,power off.rate <hz>,interval <ms>, or JSON (rate_hz/hz/rate/interval_ms/{command:rate,value:…}).Freeze fix (1051681)
u8g2.setPowerSave/setContrastordigitalWrite(PWR_CTRL,…)directly raced the main loop's OLED draws and stalled the device for ~4 s under feature-test load.portMUX-protectedwsPendingMaskdrained early inloop(). Hardware ops deferred; state flags updated synchronously so status frames stay accurate.include/webserver.h: handlers registered once across stop/start (server.end()doesn't clear the handler list),server.begin()moved to the end, removed a duplicateaddHandler(&websocket).String((const char*)data, len)instead of O(N²) char-by-char concat; only accepts complete unfragmented text frames.static unsigned long lastUpdate(was signed → wrong acrossmillis()rollover).wifi_setup.cppthatWiFi.setSleep(false)is the wrong call with BLE active (measured: 8 % packet loss, HTTP all-timeout because BT+WiFi share the 2.4 GHz radio and the coex layer needs WiFi's sleep windows for BT slots).Review feedback (74d7f0f)
wsPendingMask |= bitlet opposing actions accumulate, sodisplay offthendisplay oncould end up off while status reported on. AddedwsReplacePending(set, clear)for the three mutually-exclusive pairs.weightWebsocketNotifyInterval,b_websocketEventsEnabled,b_softSleep,b_u8g2Sleep, …)volatilefor dual-core ESP32-S3 cache coherence.stopWatch.elapsed()to(unsigned long)for%lu. Fixed two stale doc comments (wsQueuePendingandprocessWsPendingCmds). README documents the JSON rate forms (rate_hz,hz,interval_ms) that were accepted but undocumented.Multi-client (7a3b51c)
cleanupClients()now uses the library default (DEFAULT_MAX_WS_CLIENTS= 8 on ESP32).WS_EVT_DISCONNECTonly resets the shared rate/events state whenserver->count() == 0(last client out) — otherwise one client leaving would wipe the other's negotiated state.Verified on device
status/events/tare/timer/led/display/low_power/sleep/ rate forms / unknown-input handling / soft-sleep round trip).🤖 Generated with Claude Code