apc_modbus: increase default TCP response timeout to 2000 ms#3418
Open
EchterAgo wants to merge 1 commit intonetworkupstools:masterfrom
Open
apc_modbus: increase default TCP response timeout to 2000 ms#3418EchterAgo wants to merge 1 commit intonetworkupstools:masterfrom
EchterAgo wants to merge 1 commit intonetworkupstools:masterfrom
Conversation
|
A ZIP file with standard source tarball and another tarball with pre-built docs for commit 3820693 is temporarily available: NUT-tarballs-PR-3418.zip. |
The default 500ms timeout value is marginal for Modbus TCP, even on a fast local network. In tests with an APC UPS, we see timeouts because of this: ``` Apr 14 19:10:44 ares apc_modbus[2002143]: _apc_modbus_read_registers: Read of 0:27 failed: Connection timed out (192.168.0.102:502) Apr 14 19:10:44 ares apc_modbus[2002143]: _apc_modbus_read_registers: Read of 0:27 failed: Invalid data (192.168.0.102:502) Apr 14 19:10:44 ares apc_modbus[2002143]: _apc_modbus_close: Closing connection Apr 14 19:10:47 ares apc_modbus[2002143]: Opened modbus successfully ``` Packet capture: ``` No. Time Source Destination Protocol Length Info 3636 2026-04-14 19:10:43.842072 192.168.0.40 192.168.0.102 Modbus/TCP 66 Query: Trans: 12622; Unit: 1, Func: 3: Read Holding Registers 3637 2026-04-14 19:10:44.018369 192.168.0.102 192.168.0.40 TCP 60 502 → 40392 [ACK] Seq=70441 Ack=11905 Win=3660 Len=0 3638 2026-04-14 19:10:44.377723 192.168.0.40 192.168.0.102 Modbus/TCP 66 Query: Trans: 12623; Unit: 1, Func: 3: Read Holding Registers 3639 2026-04-14 19:10:44.578103 192.168.0.102 192.168.0.40 TCP 60 502 → 40392 [ACK] Seq=70441 Ack=11917 Win=3648 Len=0 3640 2026-04-14 19:10:44.579449 192.168.0.102 192.168.0.40 Modbus/TCP 117 Response: Trans: 12622; Unit: 1, Func: 3: Read Holding Registers 3641 2026-04-14 19:10:44.579482 192.168.0.40 192.168.0.102 TCP 54 40392 → 502 [ACK] Seq=11917 Ack=70504 Win=63 Len=0 3642 2026-04-14 19:10:44.579604 192.168.0.40 192.168.0.102 TCP 54 40392 → 502 [FIN, ACK] Seq=11917 Ack=70504 Win=63 Len=0 ``` 3636 is the first read, 3638 is the retry after 500ms without response, both are ACKed. 3640 is the actual response to transaction 12622, which arrives 700ms after the initial packet. At this point we expect a response to transaction 12623, which is why we fail with "invalid data". To fix this we increase the default TCP response timeout to 2000 ms, which seems to be more reasonable for a device that already can take 700 ms to respond under ideal conditions. I chose a bit higher value for users who might connect the device over a higher latency network. Signed-off-by: Axel Gembe <axel@gembe.net>
b3dc878 to
3820693
Compare
Contributor
Author
|
Tested, ready for review! |
Contributor
Author
|
This seems to have improved connection reliability a lot. |
22 tasks
|
✅ Build nut 2.8.5.4574-master completed (commit d9de3438ea by @EchterAgo)
|
|
✅ Build nut 2.8.5.4575-master completed (commit 06fcca4c8c by @EchterAgo)
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The default 500ms timeout value is marginal for Modbus TCP, even on a fast local network. In tests with an APC UPS, we see timeouts because of this:
Packet capture:
3636 is the first read, 3638 is the retry after 500ms without response, both are ACKed. 3640 is the actual response to transaction 12622, which arrives 700ms after the initial packet. At this point we expect a response to transaction 12623, which is why we fail with "invalid data".
To fix this we increase the default TCP response timeout to 2000 ms, which seems to be more reasonable for a device that already can take 700 ms to respond under ideal conditions. I chose a bit higher value for users who might connect the device over a higher latency network.
General points
Described the changes in the PR submission or a separate issue, e.g.
known published or discovered protocols, applicable hardware (expected
compatible and actually tested/developed against), limitations, etc.
There may be multiple commits in the PR, aligned and commented with
a functional change. Notably, coding style changes better belong in a
separate PR, but certainly in a dedicated commit to simplify reviews
of "real" changes in the other commits. Similarly for typo fixes in
comments or text documents.
Use of coding helper tools and AI should be disclosed in the commit
or PR comments (it is interesting to know which ones do a decent job).
As with other contributions, a human is responsible and thanked for the
quality and content of the change, and is presumed to have the right to
post that code to be published further under the project's license terms.
Please star NUT on GitHub, this helps with sponsorships! ;)
Frequent "underwater rocks" for driver addition/update PRs
Revised existing driver families and added a sub-driver if applicable
(
nutdrv_qx,usbhid-ups...) or added a brand new driver in the othercase.
Did not extend obsoleted drivers with new hardware support features
(notably
blazerand other single-device family drivers for Qx protocols,except the new
nutdrv_qxwhich should cover them all).For updated existing device drivers, bumped the
DRIVER_VERSIONmacroor its equivalent.
For USB devices (HID or not), revised that the driver uses unique
VID/PID combinations, or raised discussions when this is not the case
(several vendors do use same interface chips for unrelated protocols).
For new USB devices, built and committed the changes for the
scripts/upower/95-upower-hid.hwdbfileProposed NUT data mapping is aligned with existing
docs/nut-names.txtfile. If the device exposes useful data points not listed in the file, the
experimental.*namespace can be used as documented there, and discussionshould be raised on the NUT Developers mailing list to standardize the new
concept.
Updated
data/driver.list.inif applicable (new tested device info)Frequent "underwater rocks" for general C code PRs
structure layout and alignment in memory, endianness (layout of bytes and
bits in memory for multi-byte numeric types), or use of generic
intwherelanguage or libraries dictate the use of
size_t(orssize_tsometimes).Progress and errors are handled with
upsdebugx(),upslogx(),fatalx()and related methods, not with directprintf()orexit().Similarly, NUT helpers are used for error-checked memory allocation and
string operations (except where customized error handling is needed,
such as unlocking device ports, etc.)
Coding style (including whitespace for indentations) follows precedent
in the code of the file, and examples/guide in
docs/developers.txtfile.For newly added files, the
Makefile.amrecipes were updated and themake distchecktarget passes.General documentation updates
Added a bullet point into
NEWS.adoc, possibly alsoUPGRADING.adocif there is something packagers or custom-build users should take into
account (new driver categories, configuration options, dependencies...)
Updated
docs/acknowledgements.txt(for vendor-backed device support)Added or updated manual page information in
docs/man/*.txtfilesand corresponding recipe lists in
docs/man/Makefile.amfor new pagesPassed
make spellcheck, updated spell-checking dictionary in thedocs/nut.dictfile if needed (did not remove any words -- themakerule printout in case of changes suggests how to maintain it).
Additional work may be needed after posting this PR
Propose a PR for NUT DDL with detailed device data dumps from tests
against real hardware (the more models, the better).
Address NUT CI farm build failures for the PR: testing on numerous
platforms and toolkits can expose issues not seen on just one system.
the changed codebase.