
NUT swarm stress testing: revise cleanup of disconnected sockets #3366

@jimklimov

Description


In testing I also saw something concerning:

  • at least on Linux, clients (upslog here) were apparently piling up and exceeding the MAXCONN value embedded in the generated tests/NIT/tmp/etc/upsd.conf file (two swarm sizes plus 30, to be generous). Perhaps they kept retrying connections after their earlier attempts failed (due to the initial bug; this may still be possible when upsd is overwhelmed), compounding the issue: upsd remembers older connection attempts earlier in its list, and until those time out it has little chance of seeing newer ones. Not sure offhand whether "connection closed/timed out" is evaluated at that point, or whether such handling is only fast-tracked for drivers.

Originally posted by @jimklimov in #3302

A quick look at the upsd.c mainloop() shows checks for FD validity when deciding whether or not to add each one to the list of sockets polled for activity. This appears to be a "static" check based on a previously recorded disconnection, i.e. a saved "invalid" FD/handle value. Clients are only dropped after 60 seconds of inactivity (hard-coded).

Maybe the fixes made for issues #3302 and/or #3365 provide timely enough detection of disconnections to make this point moot; but if not (i.e. such a "leak" or pile-up remains), keep this in mind. Perhaps detection of disconnections should somehow be more aggressive.

Metadata


Assignees

No one assigned

    Labels

    Connection stability issues (issues about driver<->device and/or networked connections (upsd<->upsmon...) going AWOL over time), enhancement

    Type

    No type

    Projects

    Status

    No status

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests
