feat(riscv): i64 Phase 2 — mul, shifts, rotates, compares, clz/ctz/popcnt, sign-extends #128

Open
avrabe wants to merge 1 commit into main from feat/riscv-i64-phase2

avrabe commented May 21, 2026

Summary

Phase 2 of RV32IMAC i64 support, building on Phase 1 (v0.3.1, #119). 16 new i64 ops, all lowering to RV32IMAC base-ISA instruction sequences via the typed VstackVal register-pair model.

Implemented (16 ops)

  • I64Mul — low-64 product. lo = mul(al,bl); hi = mulhu(al,bl) + mul(al,bh) + mul(ah,bl).
  • I64Shl / I64ShrS / I64ShrU — runtime-amount shifts with a data-dependent bne on bit 5 of shamt & 63. The cross-half carry uses a two-step (lo>>1)>>(31-s) so it stays well-defined at s==0: RV32 register shifts use only the low 5 bits of the amount, so a single >>(32-s) would behave as >>0 there.
  • I64Rotl / I64Rotr — composed from two cross-word shifts ORed half-by-half.
  • I64Clz / I64Ctz / I64Popcnt — base-ISA software (no Zbb). clz branches hi-vs-lo; clz_word is an unrolled branchless binary search. ctz via 31 - clz(x & -x). popcnt is SWAR per half.
  • I64{Lt,Le,Gt,Ge}{S,U} — hi-then-lo comparison ladder; Le/Gt/Ge derived from Lt by operand swap + invert.
  • I64Extend8S/16S/32S — (x << (32-w)) >>s (32-w) on the low word (>>s is an arithmetic shift, w is the source width), then srai lo,31 to broadcast the sign into the high word.
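
As a rough illustration of the formulas in the list above, here is a minimal reference model in plain Rust that works on (lo, hi) pairs of 32-bit halves. It is a sketch, not the selector code from this PR: the function names (`mul64`, `shl64`, `extend_s`, `sll`, `srl`) are hypothetical, and the `& 31` masking is there only to mimic how RV32 register shifts treat the shift amount.

```rust
/// Low 64 bits of a 64x64 product on 32-bit halves.
/// lo = mul(al,bl); hi = mulhu(al,bl) + mul(al,bh) + mul(ah,bl).
fn mul64(al: u32, ah: u32, bl: u32, bh: u32) -> (u32, u32) {
    let lo = al.wrapping_mul(bl); // mul
    let carry = ((al as u64 * bl as u64) >> 32) as u32; // mulhu
    let hi = carry
        .wrapping_add(al.wrapping_mul(bh)) // cross term al*bh
        .wrapping_add(ah.wrapping_mul(bl)); // cross term ah*bl
    (lo, hi)
}

/// Model of RV32 sll/srl: only the low 5 bits of the amount are used.
fn sll(x: u32, s: u32) -> u32 {
    x << (s & 31)
}
fn srl(x: u32, s: u32) -> u32 {
    x >> (s & 31)
}

/// 64-bit shift left with a runtime amount.
fn shl64(lo: u32, hi: u32, shamt: u32) -> (u32, u32) {
    let s = shamt & 63;
    if s & 32 != 0 {
        // Big case (s >= 32): the low half is zeroed and the old low
        // half moves up by s - 32, which is what the 5-bit mask yields.
        (0, sll(lo, s))
    } else {
        // Small case (s < 32): bits carried from lo into hi.
        // Two-step shift so that s == 0 gives a carry of 0.
        let carry = srl(srl(lo, 1), 31 - s);
        (sll(lo, s), sll(hi, s) | carry)
    }
}

/// Sign-extend the low `w` bits (w is 8, 16 or 32) to a 64-bit pair.
fn extend_s(lo: u32, w: u32) -> (u32, u32) {
    let k = 32 - w;
    let lo = (((lo << k) as i32) >> k) as u32; // (x << k) >>s k
    let hi = ((lo as i32) >> 31) as u32; // srai lo, 31
    (lo, hi)
}
```

Checking such a model against native `u64` arithmetic (for example `a.wrapping_mul(b)` and `a << s` for every s in 0..64) is one way to gain confidence in the formulas independently of the emitted instruction sequences.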

Deferred to Phase 3

I64DivS/DivU/RemS/RemU — RV32 has no 64-bit divide; an inline __divdi3-style long division would balloon the selector and there's no runtime-call path yet. They fall through to the existing SelectorError::Unsupported arm (a test pins this).
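
For context on why an inline expansion would be large, here is a minimal sketch of the classic shift-subtract (restoring) long division that a `__divdi3`-style routine is typically built around. It is illustrative only and not part of this PR: the name `udivmod64` is hypothetical, returning `None` for a zero divisor is an assumption standing in for the Wasm trap, and the code uses native `u64` for readability, whereas on RV32 every 64-bit shift, compare and subtract below would itself expand to a register-pair sequence.

```rust
/// Unsigned 64-bit division by restoring long division.
/// Returns (quotient, remainder), or None when the divisor is zero.
fn udivmod64(n: u64, d: u64) -> Option<(u64, u64)> {
    if d == 0 {
        return None;
    }
    let mut q: u64 = 0;
    let mut r: u64 = 0;
    // Process the dividend one bit at a time, most significant first.
    for i in (0..64).rev() {
        // Bring the next dividend bit into the running remainder.
        // r < 2^63 holds before every shift, so this never overflows.
        r = (r << 1) | ((n >> i) & 1);
        if r >= d {
            r -= d;
            q |= 1u64 << i;
        }
    }
    Some((q, r))
}
```

The loop body runs 64 times, each iteration touching several 64-bit values, which gives a sense of the code size an unrolled inline version would add to the selector.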

Tests

+23 tests (125 → 148 passing, 1 pre-existing ignored). At least one per op; both shift cross-word cases are pinned (i64_shl_big_case_zeroes_low_half, i64_shl_small_case_uses_register_shifts_and_carry_or), the clz hi-vs-lo branch is pinned, and both signed and unsigned compares are covered.

Note on the diff

git diff --stat reports 2312 insertions / 699 deletions, but no functions or tests were removed — verified: all 108 functions from main are present in the branch (now 145, +37). The "deletions" are blank-line reflow artifacts from cargo fmt interleaving with the large additions. Function-name set diff main → branch is empty for removals.

Validation

  • cargo test --package synth-backend-riscv — 148 pass, 0 fail, 1 ignored.
  • cargo clippy --package synth-backend-riscv --all-targets -- -D warnings — clean.
  • cargo fmt --check — clean.

Follow-ups

  • i64 div/rem (Phase 3) — needs a __divdi3-style runtime or a runtime-call path.
  • Sub-word i64 loads (i64.load8_s etc.) still unimplemented — noted in the module doc.

🤖 Generated with Claude Code

Extends the RV32IMAC instruction selector with the harder i64 surface,
building on the Phase-1 typed-vstack register-pair representation.

Implemented (each lowering to an RV32IMAC sequence):
- I64Mul — low-64 product via mul + mulhu carry + 2 cross-term muls
- I64Shl / I64ShrS / I64ShrU — runtime-amount shifts with a data-dependent
  branch on shamt >= 32 (cross-word) vs. < 32 (within-word + carry)
- I64Rotl / I64Rotr — composed from a pair of cross-word shifts ORed together
- I64Clz / I64Ctz / I64Popcnt — base-ISA software sequences (no Zbb):
  clz/ctz branch on the hi/lo half and use an unrolled binary-search
  clz_word; ctz reuses clz via `x & -x`; popcnt is the SWAR mul-collapse
- I64LtS/LtU/LeS/LeU/GtS/GtU/GeS/GeU — hi-then-lo compare ladder (hi signed
  or unsigned per op, lo always unsigned), reduced to less-than + invert
- I64Extend8S / I64Extend16S / I64Extend32S — sub-word sign extension with
  sign propagation into the high word via srai 31
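
The bit-counting and compare items above can be modelled in plain Rust roughly as follows. This is a sketch under assumptions, not the emitted RV32 sequences: all function names are hypothetical, the explicit zero checks in `clz_word` and `ctz_word` are one possible way to get 32 for a zero word, and only the less-than forms are shown since the other compares follow by operand swap and inversion.

```rust
/// Count leading zeros of one word by binary search (steps 16, 8, 4, 2, 1).
fn clz_word(mut x: u32) -> u32 {
    if x == 0 {
        return 32; // assumption: zero handled up front
    }
    let mut n = 0;
    for k in [16u32, 8, 4, 2, 1] {
        // 1 when the top k bits are all zero, else 0; no branch needed.
        let top_zero = ((x >> (32 - k)) == 0) as u32;
        n += k * top_zero;
        x <<= k * top_zero;
    }
    n
}

/// Count trailing zeros via clz: x & -x isolates the lowest set bit.
fn ctz_word(x: u32) -> u32 {
    if x == 0 {
        return 32;
    }
    31 - clz_word(x & x.wrapping_neg())
}

/// SWAR population count with a final multiply to sum the bytes.
fn popcnt_word(mut x: u32) -> u32 {
    x = x - ((x >> 1) & 0x5555_5555);
    x = (x & 0x3333_3333) + ((x >> 2) & 0x3333_3333);
    x = (x + (x >> 4)) & 0x0F0F_0F0F;
    x.wrapping_mul(0x0101_0101) >> 24
}

/// 64-bit versions: pick the half that decides the answer.
fn clz64(lo: u32, hi: u32) -> u32 {
    if hi != 0 { clz_word(hi) } else { 32 + clz_word(lo) }
}
fn ctz64(lo: u32, hi: u32) -> u32 {
    if lo != 0 { ctz_word(lo) } else { 32 + ctz_word(hi) }
}
fn popcnt64(lo: u32, hi: u32) -> u32 {
    popcnt_word(lo) + popcnt_word(hi)
}

/// Compare ladder: hi halves decide unless equal, then lo (always unsigned).
fn lt_s(al: u32, ah: u32, bl: u32, bh: u32) -> bool {
    if ah != bh { (ah as i32) < (bh as i32) } else { al < bl }
}
fn lt_u(al: u32, ah: u32, bl: u32, bh: u32) -> bool {
    if ah != bh { ah < bh } else { al < bl }
}

/// Derived forms: a >= b is !(a < b); a > b is b < a; a <= b is !(b < a).
fn ge_s(al: u32, ah: u32, bl: u32, bh: u32) -> bool {
    !lt_s(al, ah, bl, bh)
}
fn gt_s(al: u32, ah: u32, bl: u32, bh: u32) -> bool {
    lt_s(bl, bh, al, ah)
}
fn le_s(al: u32, ah: u32, bl: u32, bh: u32) -> bool {
    !lt_s(bl, bh, al, ah)
}
```

The per-word helpers can be cross-checked against Rust's built-in `leading_zeros`, `trailing_zeros` and `count_ones`.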

Deferred to Phase 3:
- I64DivS / I64DivU / I64RemS / I64RemU — RV32 has no 64-bit divide; these
  need a __divdi3-style software long-division routine. They fall through
  to the existing `Unsupported` arm — fail loudly, no silent miscompile.

Tests: 23 new shape-assertion tests (one+ per implemented op, both shift
cross-word cases, the clz hi-vs-lo branch, deferred-op Unsupported check).

Validation: cargo test (148 pass, was 125), clippy -D warnings clean,
cargo fmt --check clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

codecov Bot commented May 21, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.