Skip to content

apc_modbus: increase default TCP response timeout to 2000 ms#3418

Open
EchterAgo wants to merge 1 commit intonetworkupstools:masterfrom
EchterAgo:apc_modbus_increase_tcp_timeout
Open

apc_modbus: increase default TCP response timeout to 2000 ms#3418
EchterAgo wants to merge 1 commit intonetworkupstools:masterfrom
EchterAgo:apc_modbus_increase_tcp_timeout

Conversation

@EchterAgo
Copy link
Copy Markdown
Contributor

@EchterAgo EchterAgo commented Apr 14, 2026

The default 500ms timeout value is marginal for Modbus TCP, even on a fast local network. In tests with an APC UPS, we see timeouts because of this:

Apr 14 19:10:44 ares apc_modbus[2002143]: _apc_modbus_read_registers: Read of 0:27 failed: Connection timed out (192.168.0.102:502)
Apr 14 19:10:44 ares apc_modbus[2002143]: _apc_modbus_read_registers: Read of 0:27 failed: Invalid data (192.168.0.102:502)
Apr 14 19:10:44 ares apc_modbus[2002143]: _apc_modbus_close: Closing connection
Apr 14 19:10:47 ares apc_modbus[2002143]: Opened modbus successfully

Packet capture:

No.	Time	Source	Destination	Protocol	Length	Info
3636	2026-04-14 19:10:43.842072	192.168.0.40	192.168.0.102	Modbus/TCP	66	   Query: Trans: 12622; Unit:   1, Func:   3: Read Holding Registers
3637	2026-04-14 19:10:44.018369	192.168.0.102	192.168.0.40	TCP	60	502 → 40392 [ACK] Seq=70441 Ack=11905 Win=3660 Len=0
3638	2026-04-14 19:10:44.377723	192.168.0.40	192.168.0.102	Modbus/TCP	66	   Query: Trans: 12623; Unit:   1, Func:   3: Read Holding Registers
3639	2026-04-14 19:10:44.578103	192.168.0.102	192.168.0.40	TCP	60	502 → 40392 [ACK] Seq=70441 Ack=11917 Win=3648 Len=0
3640	2026-04-14 19:10:44.579449	192.168.0.102	192.168.0.40	Modbus/TCP	117	Response: Trans: 12622; Unit:   1, Func:   3: Read Holding Registers
3641	2026-04-14 19:10:44.579482	192.168.0.40	192.168.0.102	TCP	54	40392 → 502 [ACK] Seq=11917 Ack=70504 Win=63 Len=0
3642	2026-04-14 19:10:44.579604	192.168.0.40	192.168.0.102	TCP	54	40392 → 502 [FIN, ACK] Seq=11917 Ack=70504 Win=63 Len=0

3636 is the first read, 3638 is the retry after 500ms without response, both are ACKed. 3640 is the actual response to transaction 12622, which arrives 700ms after the initial packet. At this point we expect a response to transaction 12623, which is why we fail with "invalid data".

To fix this we increase the default TCP response timeout to 2000 ms, which seems to be more reasonable for a device that already can take 700 ms to respond under ideal conditions. I chose a bit higher value for users who might connect the device over a higher latency network.

General points

  • Described the changes in the PR submission or a separate issue, e.g.
    known published or discovered protocols, applicable hardware (expected
    compatible and actually tested/developed against), limitations, etc.

  • There may be multiple commits in the PR, aligned and commented with
    a functional change. Notably, coding style changes better belong in a
    separate PR, but certainly in a dedicated commit to simplify reviews
    of "real" changes in the other commits. Similarly for typo fixes in
    comments or text documents.

  • Use of coding helper tools and AI should be disclosed in the commit
    or PR comments (it is interesting to know which ones do a decent job).
    As with other contributions, a human is responsible and thanked for the
    quality and content of the change, and is presumed to have the right to
    post that code to be published further under the project's license terms.

  • Please star NUT on GitHub, this helps with sponsorships! ;)

Frequent "underwater rocks" for driver addition/update PRs

  • Revised existing driver families and added a sub-driver if applicable
    (nutdrv_qx, usbhid-ups...) or added a brand new driver in the other
    case.

  • Did not extend obsoleted drivers with new hardware support features
    (notably blazer and other single-device family drivers for Qx protocols,
    except the new nutdrv_qx which should cover them all).

  • For updated existing device drivers, bumped the DRIVER_VERSION macro
    or its equivalent.

  • For USB devices (HID or not), revised that the driver uses unique
    VID/PID combinations, or raised discussions when this is not the case
    (several vendors do use same interface chips for unrelated protocols).

  • For new USB devices, built and committed the changes for the
    scripts/upower/95-upower-hid.hwdb file

  • Proposed NUT data mapping is aligned with existing docs/nut-names.txt
    file. If the device exposes useful data points not listed in the file, the
    experimental.* namespace can be used as documented there, and discussion
    should be raised on the NUT Developers mailing list to standardize the new
    concept.

  • Updated data/driver.list.in if applicable (new tested device info)

Frequent "underwater rocks" for general C code PRs

  • Did not "blindly assume" default integer type sizes and value ranges,
    structure layout and alignment in memory, endianness (layout of bytes and
    bits in memory for multi-byte numeric types), or use of generic int where
    language or libraries dictate the use of size_t (or ssize_t sometimes).
  • Progress and errors are handled with upsdebugx(), upslogx(),
    fatalx() and related methods, not with direct printf() or exit().
    Similarly, NUT helpers are used for error-checked memory allocation and
    string operations (except where customized error handling is needed,
    such as unlocking device ports, etc.)

  • Coding style (including whitespace for indentations) follows precedent
    in the code of the file, and examples/guide in docs/developers.txt file.

  • For newly added files, the Makefile.am recipes were updated and the
    make distcheck target passes.

General documentation updates

  • Added a bullet point into NEWS.adoc, possibly also UPGRADING.adoc
    if there is something packagers or custom-build users should take into
    account (new driver categories, configuration options, dependencies...)

  • Updated docs/acknowledgements.txt (for vendor-backed device support)

  • Added or updated manual page information in docs/man/*.txt files
    and corresponding recipe lists in docs/man/Makefile.am for new pages

  • Passed make spellcheck, updated spell-checking dictionary in the
    docs/nut.dict file if needed (did not remove any words -- the make
    rule printout in case of changes suggests how to maintain it).

Additional work may be needed after posting this PR

  • Propose a PR for NUT DDL with detailed device data dumps from tests
    against real hardware (the more models, the better).

  • Address NUT CI farm build failures for the PR: testing on numerous
    platforms and toolkits can expose issues not seen on just one system.

  • Revise suggestions from LGTM.COM analysis about "new issues" with
    the changed codebase.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 14, 2026

A ZIP file with standard source tarball and another tarball with pre-built docs for commit 3820693 is temporarily available: NUT-tarballs-PR-3418.zip.

The default 500ms timeout value is marginal for Modbus TCP, even on a
fast local network. In tests with an APC UPS, we see timeouts because of
this:

```
Apr 14 19:10:44 ares apc_modbus[2002143]: _apc_modbus_read_registers: Read of 0:27 failed: Connection timed out (192.168.0.102:502)
Apr 14 19:10:44 ares apc_modbus[2002143]: _apc_modbus_read_registers: Read of 0:27 failed: Invalid data (192.168.0.102:502)
Apr 14 19:10:44 ares apc_modbus[2002143]: _apc_modbus_close: Closing connection
Apr 14 19:10:47 ares apc_modbus[2002143]: Opened modbus successfully
```

Packet capture:

```
No.	Time	Source	Destination	Protocol	Length	Info
3636	2026-04-14 19:10:43.842072	192.168.0.40	192.168.0.102	Modbus/TCP	66	   Query: Trans: 12622; Unit:   1, Func:   3: Read Holding Registers
3637	2026-04-14 19:10:44.018369	192.168.0.102	192.168.0.40	TCP	60	502 → 40392 [ACK] Seq=70441 Ack=11905 Win=3660 Len=0
3638	2026-04-14 19:10:44.377723	192.168.0.40	192.168.0.102	Modbus/TCP	66	   Query: Trans: 12623; Unit:   1, Func:   3: Read Holding Registers
3639	2026-04-14 19:10:44.578103	192.168.0.102	192.168.0.40	TCP	60	502 → 40392 [ACK] Seq=70441 Ack=11917 Win=3648 Len=0
3640	2026-04-14 19:10:44.579449	192.168.0.102	192.168.0.40	Modbus/TCP	117	Response: Trans: 12622; Unit:   1, Func:   3: Read Holding Registers
3641	2026-04-14 19:10:44.579482	192.168.0.40	192.168.0.102	TCP	54	40392 → 502 [ACK] Seq=11917 Ack=70504 Win=63 Len=0
3642	2026-04-14 19:10:44.579604	192.168.0.40	192.168.0.102	TCP	54	40392 → 502 [FIN, ACK] Seq=11917 Ack=70504 Win=63 Len=0
```

3636 is the first read, 3638 is the retry after 500ms without response,
both are ACKed. 3640 is the actual response to transaction 12622, which
arrives 700ms after the initial packet. At this point we expect a
response to transaction 12623, which is why we fail with "invalid
data".

To fix this we increase the default TCP response timeout to 2000 ms,
which seems to be more reasonable for a device that already can take
700 ms to respond under ideal conditions. I chose a bit higher value for
users who might connect the device over a higher latency network.

Signed-off-by: Axel Gembe <axel@gembe.net>
@EchterAgo EchterAgo force-pushed the apc_modbus_increase_tcp_timeout branch from b3dc878 to 3820693 Compare April 14, 2026 12:47
@EchterAgo EchterAgo marked this pull request as ready for review April 14, 2026 12:56
@EchterAgo
Copy link
Copy Markdown
Contributor Author

Tested, ready for review!

@EchterAgo
Copy link
Copy Markdown
Contributor Author

This seems to have improved connection reliability a lot.

@AppVeyorBot
Copy link
Copy Markdown

Build nut 2.8.5.4574-master completed (commit d9de3438ea by @EchterAgo)

@AppVeyorBot
Copy link
Copy Markdown

Build nut 2.8.5.4575-master completed (commit 06fcca4c8c by @EchterAgo)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants