Fix a set of AddressSanitizer and ThreadSanitizer issues#1889
Fix a set of AddressSanitizer and ThreadSanitizer issues#1889
Conversation
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
| } | ||
| done: | ||
| // Lifetime synchronization, no need to build happens-before relation | ||
| spin_wait_while_eq( s.my_going, 2U, std::memory_order_relaxed ); |
There was a problem hiding this comment.
This change probably can have significant performance impact. Are we having a counterexample showing that change from relaxed to acquire is really needed?
There was a problem hiding this comment.
To be fair, it is the change I am the least confident with. Acquire memory order was there before but later it was changed. The change allowed me to silence one weird TSan's warning about data race but now I can't reproduce it for some reason.
Note that for x86 this change shouldn't have any performance impact: acquire memory order (nor acq_rel) won't introduce any new CPU instructions or change existing ones.
There was a problem hiding this comment.
Note that for x86 this change shouldn't have any performance impact: acquire memory order (nor acq_rel) won't introduce any new CPU instructions or change existing ones.
That is worrying me here, probably we may worsen the life on machines with relaxed memory model, while can't estimate the performance impact, as don't have such machines in the benchmarking pool.
In an ideal world, we may want to have a description what's going on here in term of fences and why acquire is required.
There was a problem hiding this comment.
That is worrying me here, probably we may worsen the life on machines with relaxed memory model, while can't estimate the performance impact, as don't have such machines in the benchmarking pool.
Well, considering TSan's reports, it could be possible, that on systems with relaxed memory model these are real races and potential slowdowns are justified to improve correctness.
In an ideal world, we may want to have a description what's going on here in term of fences and why acquire is required.
Right, but I am not very familiar with this implementation and it seems too complicated to find an explaining counterexample, to be honest.
There was a problem hiding this comment.
Not very familiar myself, but this is a try_release and this line falls inside of the release for read (line 355), so there can be multiple threads in this area of code at the same, releasing read locks. So, the store of 1 to my_going at line 402 by a thread A could allow another thread B spinning at line 409 to proceed, which then calls s.initialize(). initialize writes to prev (as an example) which is also written to at line 401. So I think its possible for either A's or B's write to be the last one.
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
Co-authored-by: Ilya Isaev <ilya.isaev@intel.com>
Description
Add a comprehensive description of proposed changes
Fixes # - issue number(s) if exists
Type of change
Choose one or multiple, leave empty if none of the other choices apply
Add a respective label(s) to PR if you have permissions
Tests
Documentation
Breaks backward compatibility
Notify the following users
List users with
@to send notificationsOther information