Skip to content

Fix a set of AddressSanitizer and ThreadSanitizer issues#1889

Open
isaevil wants to merge 13 commits intomasterfrom
dev/isaevil/sanitizers
Open

Fix a set of AddressSanitizer and ThreadSanitizer issues#1889
isaevil wants to merge 13 commits intomasterfrom
dev/isaevil/sanitizers

Conversation

@isaevil
Copy link
Contributor

@isaevil isaevil commented Oct 29, 2025

Description

Add a comprehensive description of proposed changes

Fixes # - issue number(s) if exists

Type of change

Choose one or multiple, leave empty if none of the other choices apply

Add a respective label(s) to PR if you have permissions

  • bug fix - change that fixes an issue
  • new feature - change that adds functionality
  • tests - change in tests
  • infrastructure - change in infrastructure and CI
  • documentation - documentation update

Tests

  • added - required for new features and some bug fixes
  • not needed

Documentation

  • updated in # - add PR number
  • needs to be updated
  • not needed

Breaks backward compatibility

  • Yes
  • No
  • Unknown

Notify the following users

List users with @ to send notifications

Other information

Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
@isaevil isaevil changed the title [WIP] Fix a set of AddressSanitizer and ThreadSanitizer issues Fix a set of AddressSanitizer and ThreadSanitizer issues Nov 4, 2025
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
@isaevil isaevil marked this pull request as ready for review November 17, 2025 14:55
}
done:
// Lifetime synchronization, no need to build happens-before relation
spin_wait_while_eq( s.my_going, 2U, std::memory_order_relaxed );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change probably can have significant performance impact. Are we having a counterexample showing that change from relaxed to acquire is really needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be fair, it is the change I am the least confident with. Acquire memory order was there before but later it was changed. The change allowed me to silence one weird TSan's warning about data race but now I can't reproduce it for some reason.
Note that for x86 this change shouldn't have any performance impact: acquire memory order (nor acq_rel) won't introduce any new CPU instructions or change existing ones.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that for x86 this change shouldn't have any performance impact: acquire memory order (nor acq_rel) won't introduce any new CPU instructions or change existing ones.

That is worrying me here, probably we may worsen the life on machines with relaxed memory model, while can't estimate the performance impact, as don't have such machines in the benchmarking pool.

In an ideal world, we may want to have a description what's going on here in term of fences and why acquire is required.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is worrying me here, probably we may worsen the life on machines with relaxed memory model, while can't estimate the performance impact, as don't have such machines in the benchmarking pool.

Well, considering TSan's reports, it could be possible, that on systems with relaxed memory model these are real races and potential slowdowns are justified to improve correctness.

In an ideal world, we may want to have a description what's going on here in term of fences and why acquire is required.

Right, but I am not very familiar with this implementation and it seems too complicated to find an explaining counterexample, to be honest.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not very familiar myself, but this is a try_release and this line falls inside of the release for read (line 355), so there can be multiple threads in this area of code at the same, releasing read locks. So, the store of 1 to my_going at line 402 by a thread A could allow another thread B spinning at line 409 to proceed, which then calls s.initialize(). initialize writes to prev (as an example) which is also written to at line 401. So I think its possible for either A's or B's write to be the last one.

isaevil added 2 commits March 9, 2026 10:09
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
@omalyshe omalyshe added this to the 2023.0.0 milestone Mar 10, 2026
isaevil added 2 commits March 11, 2026 14:20
Co-authored-by: Ilya Isaev <ilya.isaev@intel.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants