[ValueTracking] Fix computeKnownFPClass handling of nsz#186315
[ValueTracking] Fix computeKnownFPClass handling of nsz#186315cardigan1008 wants to merge 15 commits intollvm:mainfrom
Conversation
|
@llvm/pr-subscribers-llvm-transforms Author: Yunbo Ni (cardigan1008) ChangesFixes #185920 This patch improves When Full diff: https://github.com/llvm/llvm-project/pull/186315.diff 2 Files Affected:
diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index 0672dc889b640..25a8a8166d3af 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -5588,7 +5588,10 @@ void computeKnownFPClass(const Value *V, const APInt &DemandedElts,
DenormalMode Mode =
F ? F->getDenormalMode(FltSem) : DenormalMode::getDynamic();
- if (Self && Opc == Instruction::FAdd) {
+ const FPMathOperator *FPop = cast<FPMathOperator>(Op);
+ bool HasNSZ = FPop->hasNoSignedZeros();
+
+ if (!HasNSZ && Self && Opc == Instruction::FAdd) {
Known = KnownFPClass::fadd_self(KnownLHS, Mode);
} else {
// RHS is canonically cheaper to compute. Skip inspecting the LHS if
diff --git a/llvm/test/Transforms/Attributor/nofpclass.ll b/llvm/test/Transforms/Attributor/nofpclass.ll
index 507d0f3ff98f3..3946c9f88e41d 100644
--- a/llvm/test/Transforms/Attributor/nofpclass.ll
+++ b/llvm/test/Transforms/Attributor/nofpclass.ll
@@ -3326,6 +3326,17 @@ define float @fadd_double_known_negative_nonsub_dynamic(float noundef nofpclass(
ret float %add
}
+define float @fadd_double_known_negative_zero_nsz(float noundef nofpclass(ninf pzero sub nnorm) %arg) {
+; CHECK: Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(none)
+; CHECK-LABEL: define noundef nofpclass(ninf nsub nnorm) float @fadd_double_known_negative_zero_nsz
+; CHECK-SAME: (float noundef nofpclass(ninf pzero sub nnorm) [[ARG:%.*]]) #[[ATTR3]] {
+; CHECK-NEXT: [[ADD:%.*]] = fadd nsz float [[ARG]], [[ARG]]
+; CHECK-NEXT: ret float [[ADD]]
+;
+ %add = fadd nsz float %arg, %arg
+ ret float %add
+}
+
define float @fsub_self(float noundef %arg) {
; CHECK: Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(none)
; CHECK-LABEL: define noundef float @fsub_self
|
|
@llvm/pr-subscribers-llvm-analysis Author: Yunbo Ni (cardigan1008) ChangesFixes #185920 This patch improves When Full diff: https://github.com/llvm/llvm-project/pull/186315.diff 2 Files Affected:
diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index 0672dc889b640..25a8a8166d3af 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -5588,7 +5588,10 @@ void computeKnownFPClass(const Value *V, const APInt &DemandedElts,
DenormalMode Mode =
F ? F->getDenormalMode(FltSem) : DenormalMode::getDynamic();
- if (Self && Opc == Instruction::FAdd) {
+ const FPMathOperator *FPop = cast<FPMathOperator>(Op);
+ bool HasNSZ = FPop->hasNoSignedZeros();
+
+ if (!HasNSZ && Self && Opc == Instruction::FAdd) {
Known = KnownFPClass::fadd_self(KnownLHS, Mode);
} else {
// RHS is canonically cheaper to compute. Skip inspecting the LHS if
diff --git a/llvm/test/Transforms/Attributor/nofpclass.ll b/llvm/test/Transforms/Attributor/nofpclass.ll
index 507d0f3ff98f3..3946c9f88e41d 100644
--- a/llvm/test/Transforms/Attributor/nofpclass.ll
+++ b/llvm/test/Transforms/Attributor/nofpclass.ll
@@ -3326,6 +3326,17 @@ define float @fadd_double_known_negative_nonsub_dynamic(float noundef nofpclass(
ret float %add
}
+define float @fadd_double_known_negative_zero_nsz(float noundef nofpclass(ninf pzero sub nnorm) %arg) {
+; CHECK: Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(none)
+; CHECK-LABEL: define noundef nofpclass(ninf nsub nnorm) float @fadd_double_known_negative_zero_nsz
+; CHECK-SAME: (float noundef nofpclass(ninf pzero sub nnorm) [[ARG:%.*]]) #[[ATTR3]] {
+; CHECK-NEXT: [[ADD:%.*]] = fadd nsz float [[ARG]], [[ARG]]
+; CHECK-NEXT: ret float [[ADD]]
+;
+ %add = fadd nsz float %arg, %arg
+ ret float %add
+}
+
define float @fsub_self(float noundef %arg) {
; CHECK: Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(none)
; CHECK-LABEL: define noundef float @fsub_self
|
dtcxzyw
left a comment
There was a problem hiding this comment.
It is a fragile way to fix the issue. nsz is always droppable.
I don't object to the fix. IIRC we have used similar workarounds for nsz in the tree.
nikic
left a comment
There was a problem hiding this comment.
Hm... shouldn't we handle nsz generically somehow? If nsz is set, then for all inputs we need to treat treat a pzero known FP class as also including nzero and vice versa.
|
Title is misleading? This is fixing a correctness problem, not making the analysis result better? |
Updated the title and description. |
Thanks for the suggestion. I've updated the fix to widen the zero with nsz set. |
🐧 Linux x64 Test Results
✅ The build succeeded and all tests passed. |
🪟 Windows x64 Test Results
✅ The build succeeded and all tests passed. |
| // With no-signed-zeros semantics, +0 and -0 are interchangeable. | ||
| // If the operation has nsz and only one sign of zero is possible in the | ||
| // result, the other must also be considered possible. | ||
| if (const auto *FPOp = dyn_cast_or_null<FPMathOperator>(Op)) { |
There was a problem hiding this comment.
This check was already performed above, can you do the flag check in the same place?
There was a problem hiding this comment.
It will be late as the existing flag check is before fadd_self inference.
| switch (II->getIntrinsicID()) { | ||
| case Intrinsic::fabs: | ||
| case Intrinsic::copysign: | ||
| return false; |
There was a problem hiding this comment.
The safer direction would be to list the cases where it is OK. You also already have the switch over all the opcodes directly in computeKnownFPClass. Can you keep this contained in computeKnownFPClass without a second set of opcode switch logic
|
Ping. |
Fixes #151303
Fixes #185920
This patch fixes incorrect
nofpclassinference in ValueTracking for floating-point operations withnsz.With
nsz, +0.0 and -0.0 are interchangeable for the current operation. UpdatecomputeKnownFPClassto expand a known single zero sign to include bothpzeroandnzeroundernsz, so laternofpclassinference does not preserve signed-zero-specific facts.