Skip to content

Fix dump-c bitfield truncation in compound assignments#8881

Open
tautschnig wants to merge 1 commit intodiffblue:developfrom
tautschnig:fix-7229-checksum-mismatch
Open

Fix dump-c bitfield truncation in compound assignments#8881
tautschnig wants to merge 1 commit intodiffblue:developfrom
tautschnig:fix-7229-checksum-mismatch

Conversation

@tautschnig
Copy link
Collaborator

When goto-instrument --dump-c converts a goto program back to C code, compound assignments to bitfields (e.g., s.f &= 8 where f is a 3-bit signed bitfield) were incorrectly handled. The goto program correctly inserts a cast to c_bit_field to truncate the intermediate result to the bitfield width, but goto_program2code.cpp was stripping this cast entirely, causing the pre-truncation 32-bit value to be used in subsequent boolean contexts.

Example: s.f = -1 (3-bit: 111), then (s.f &= 8) || 0

  • Correct: -1 sign-extended & 8 = 8, truncated to 3 bits = 0, 0||0 = 0
  • Bug: used pre-truncation value 8 (non-zero), so 8||0 = 1

The fix replaces the c_bit_field cast with an explicit bitwise AND mask that truncates to the bitfield width, preserving the zero/non-zero semantics needed for boolean contexts.

Minimal reproducer:
struct S { signed f : 3; };
static struct S s = {-1};
int main(void) { int x = (s.f &= 8) || 0; assert(x == 0); }

This was found via CSmith test seed 1684847330 (issue #7729).

Fixes: #7729

  • Each commit message has a non-empty body, explaining why the change was made.
  • n/a Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • n/a The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • n/a My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • n/a White-space or formatting changes outside the feature-related changed lines are in commits of their own.

When goto-instrument --dump-c converts a goto program back to C code,
compound assignments to bitfields (e.g., s.f &= 8 where f is a 3-bit
signed bitfield) were incorrectly handled. The goto program correctly
inserts a cast to c_bit_field to truncate the intermediate result to
the bitfield width, but goto_program2code.cpp was stripping this cast
entirely, causing the pre-truncation 32-bit value to be used in
subsequent boolean contexts.

Example: s.f = -1 (3-bit: 111), then (s.f &= 8) || 0
- Correct: -1 sign-extended & 8 = 8, truncated to 3 bits = 0, 0||0 = 0
- Bug: used pre-truncation value 8 (non-zero), so 8||0 = 1

The fix replaces the c_bit_field cast with an explicit bitwise AND mask
that truncates to the bitfield width, preserving the zero/non-zero
semantics needed for boolean contexts.

Minimal reproducer:
  struct S { signed f : 3; };
  static struct S s = {-1};
  int main(void) { int x = (s.f &= 8) || 0; assert(x == 0); }

This was found via CSmith test seed 1684847330 (issue diffblue#7729).

Fixes: diffblue#7729

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
@tautschnig tautschnig self-assigned this Mar 18, 2026
Copilot AI review requested due to automatic review settings March 18, 2026 12:21
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves --dump-c output correctness around C bit-field truncation by changing how c_bit_field typecasts are simplified during C code regeneration, and adds a regression test to catch a previously incorrect boolean-context evaluation.

Changes:

  • Update goto_program2code.cpp to avoid dropping casts to c_bit_field, replacing them with explicit masking.
  • Add a new goto-instrument --dump-c regression test covering compound assignment to a signed bit-field used under ||.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
src/goto-instrument/goto_program2code.cpp Changes cleanup_expr handling of casts to ID_c_bit_field to preserve truncation semantics in dumped C.
regression/goto-instrument/dump-c-bitfield-truncation/test.desc Adds a new regression driver for --dump-c + recompile + CBMC verification.
regression/goto-instrument/dump-c-bitfield-truncation/main.c Introduces a minimal reproducer using `(s.f &= 8)

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +1948 to +1974
const bool is_signed = bf_type.underlying_type().id() == ID_signedbv;

exprt op = to_typecast_expr(expr).op();
const typet op_type = op.type();

if(width == 0)
{
expr = from_integer(0, op_type);
}
else if(is_signed)
{
// For signed: mask to width bits, then sign-extend by casting
// through an unsigned type and subtracting the sign bit weight.
// Equivalent to: ((unsigned)(op) & mask) as signed, with sign
// extension. Use: (op & mask) - ((op & sign_bit) ? (1<<width) : 0)
// Simpler: just mask and let the assignment to the bitfield handle
// sign extension. But we need the VALUE to be correct for boolean
// checks. The simplest correct approach: mask the lower N bits.
// If the result is used in a boolean context, only zero/non-zero
// matters, and masking preserves that correctly.
mp_integer mask = power(mp_integer(2), mp_integer(width)) - 1;
expr = bitand_exprt(op, from_integer(mask, op_type));
}
else
{
mp_integer mask = power(mp_integer(2), mp_integer(width)) - 1;
expr = bitand_exprt(op, from_integer(mask, op_type));
Comment on lines +1959 to +1969
// For signed: mask to width bits, then sign-extend by casting
// through an unsigned type and subtracting the sign bit weight.
// Equivalent to: ((unsigned)(op) & mask) as signed, with sign
// extension. Use: (op & mask) - ((op & sign_bit) ? (1<<width) : 0)
// Simpler: just mask and let the assignment to the bitfield handle
// sign extension. But we need the VALUE to be correct for boolean
// checks. The simplest correct approach: mask the lower N bits.
// If the result is used in a boolean context, only zero/non-zero
// matters, and masking preserves that correctly.
mp_integer mask = power(mp_integer(2), mp_integer(width)) - 1;
expr = bitand_exprt(op, from_integer(mask, op_type));
@codecov
Copy link

codecov bot commented Mar 18, 2026

Codecov Report

❌ Patch coverage is 75.00000% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.41%. Comparing base (85c204f) to head (e185747).

Files with missing lines Patch % Lines
src/goto-instrument/goto_program2code.cpp 75.00% 3 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #8881   +/-   ##
========================================
  Coverage    80.41%   80.41%           
========================================
  Files         1703     1703           
  Lines       188398   188409   +11     
  Branches        73       73           
========================================
+ Hits        151498   151510   +12     
+ Misses       36900    36899    -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

CSmith test with random seed 1684847330 has dump-c checksum mismatch

2 participants