Skip to content

Prune integrated commits at or below the target from all stacks#13627

Open
mtsgrd wants to merge 1 commit intomasterfrom
improve-integrated-pruning
Open

Prune integrated commits at or below the target from all stacks#13627
mtsgrd wants to merge 1 commit intomasterfrom
improve-integrated-pruning

Conversation

@mtsgrd
Copy link
Copy Markdown
Contributor

@mtsgrd mtsgrd commented May 5, 2026

Summary

Rewrites integrated-commit pruning so that all stacks are pruned equally, but only removes integrated commits at or below the workspace's target_commit_id.

What changed

  • Removed the is_metadata_tracked skip — metadata-tracked stacks are no longer exempt from pruning
  • Rewrote truncate_at_fork_point — determines "at or below target" via graph reachability (walking outward from the target segment to collect a HashSet of commit IDs), not index comparisons. When a segment is fully emptied by pruning, it is removed entirely.
  • Fixed target-in-segment assumption — the old code assumed the target commit was always the first commit in its segment; the new code finds the target's actual position and only includes commits from that point downward
  • Metadata-tracked empty branches are preserved — a branch just added at the fork point with no commits yet is kept, but branches emptied by pruning are removed
  • Removed is_metadata_tracked and find_commit_in_stack helpers — no longer needed
  • Added regression testintegrated_commits_above_target_are_kept verifies both that integrated commits above the target survive, and that they are pruned once the target advances

Why

Two problems with the old behavior:

  1. Inconsistent — auto-discovered stacks got pruned but metadata-tracked stacks did not, leaving fully-integrated commits visible in the workspace
  2. Too aggressive — pruning based solely on the Integrated flag could remove branches whose work was merged upstream before the user ran integrate, preventing integrate_upstream from detecting and processing them

How it works

The pruning boundary is the workspace's target_commit_id — the SHA stored in metadata that stays fixed until the user integrates upstream.

To determine which commits are "at or below" the target, we:

  1. Walk the graph outward (Direction::Outgoing) from the target segment, collecting all reachable commit IDs into a HashSet
  2. Slice the target segment itself from the target commit's actual position downward — commits above the target in the same segment are not included

Then for each stack, the contiguous integrated tail whose commits are in the set is removed:

commit C        ← kept (above target, even if integrated)
commit B        ← kept (above target, even if integrated)
─── target ───
commit A        ← pruned (integrated + in HashSet)
base            ← pruned (integrated + in HashSet)

When a branch is merged upstream but the user hasn't integrated yet, the target hasn't moved — so the branch and its commits stay visible. Once the user integrates and the target advances, they are pruned automatically.

Fully-integrated stacks are cleaned up entirely. Empty branches that existed before pruning (e.g. a newly added branch at the fork point) are preserved if they appear in workspace metadata.

Fallback

When target_commit metadata is not set (e.g. in tests or older workspaces), the target segment is resolved from target_ref. This keeps pruning working in fixtures that don't set target_commit_id.

@github-actions github-actions Bot added rust Pull requests that update Rust code CLI The command-line program `but` labels May 5, 2026
@mtsgrd mtsgrd force-pushed the improve-integrated-pruning branch 2 times, most recently from f3ed8c2 to 85c2204 Compare May 5, 2026 01:01
@mtsgrd mtsgrd changed the title Prune integrated tail from all stacks uniformly Prune integrated commits at or below the target from all stacks May 5, 2026
@mtsgrd mtsgrd force-pushed the improve-integrated-pruning branch 3 times, most recently from 48e31b8 to 2dbc25f Compare May 5, 2026 16:09
@mtsgrd mtsgrd marked this pull request as ready for review May 5, 2026 16:15
Copilot AI review requested due to automatic review settings May 5, 2026 16:15
@mtsgrd
Copy link
Copy Markdown
Contributor Author

mtsgrd commented May 5, 2026

@Caleb-T-Owens your pr #13632 makes the tests here green. 🙏

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the workspace projection’s “integrated commit pruning” so pruning applies to all stacks (including metadata-tracked stacks) and attempts to prune only integrated commits that are at-or-below a target boundary, with corresponding snapshot updates in but-graph tests.

Changes:

  • Reworked prune_integrated_segments() / truncate_at_fork_point() to prune an integrated tail using a computed “target commit” boundary.
  • Removed the metadata-tracked stack skip so pruning is applied uniformly across stacks.
  • Updated insta snapshots to reflect the new pruning/truncation behavior and simplified segment cleanup.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
crates/but-graph/src/projection/workspace/init.rs Rewrites integrated pruning logic to use a target-boundary and truncates stack segments accordingly.
crates/but-graph/tests/graph/init/with_workspace.rs Updates workspace graph snapshots to match the new pruning behavior.

Comment thread crates/but-graph/src/projection/workspace/init.rs Outdated
Comment thread crates/but-graph/src/projection/workspace/init.rs Outdated
@mtsgrd mtsgrd force-pushed the improve-integrated-pruning branch from 2dbc25f to f6ca178 Compare May 5, 2026 16:42
Copilot AI review requested due to automatic review settings May 5, 2026 17:50
@mtsgrd mtsgrd force-pushed the improve-integrated-pruning branch from f6ca178 to a1fbed3 Compare May 5, 2026 17:50
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

Comment thread crates/but-graph/src/projection/workspace/init.rs Outdated
Comment thread crates/but-graph/src/projection/workspace/init.rs Outdated
Comment thread crates/but-graph/src/projection/workspace/init.rs
@mtsgrd mtsgrd force-pushed the improve-integrated-pruning branch from a1fbed3 to 75c6781 Compare May 5, 2026 18:14
Comment on lines +1001 to +1008
for &(graph_sidx, ofs) in seg.commits_by_segment.iter().rev() {
for (i, commit) in graph[graph_sidx].commits.iter().enumerate().rev() {
let offset = ofs + i;
let above_target = seg_idx < target_seg_idx
|| (seg_idx == target_seg_idx && offset < target_offset);
if above_target {
break 'outer;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We absolutely can not do comparisons on indices to determine this stuff.

We provide no guarantees in the order that the but graph is created. Even the order of edges can't be relied on.

I would encourage you to take an approach where you get a HashSet of all the commits beneath your target_commit, then loop over the segments, dropping commits or full segments if they're fully included in the HashSet

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you just look at commits, empty refs will be ambiguous, but your traversal of the graph (visit_all_segments_including_start_until) will also let you gather the relevant reference names too

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I've pushed a new version!

Copilot AI review requested due to automatic review settings May 6, 2026 10:05
@mtsgrd mtsgrd force-pushed the improve-integrated-pruning branch 2 times, most recently from dfa1e84 to ccb515e Compare May 6, 2026 10:08
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

Comment thread crates/but-graph/src/projection/workspace/init.rs
Comment thread crates/but-graph/src/projection/workspace/init.rs Outdated
Comment thread crates/but-graph/tests/fixtures/scenarios.sh Outdated
@mtsgrd mtsgrd force-pushed the improve-integrated-pruning branch from ccb515e to c0389f1 Compare May 6, 2026 10:19
Copilot AI review requested due to automatic review settings May 6, 2026 10:35
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

Comment thread crates/but-graph/src/projection/workspace/init.rs Outdated
@mtsgrd mtsgrd force-pushed the improve-integrated-pruning branch 2 times, most recently from f56d615 to e256a70 Compare May 6, 2026 13:55
Previously, `prune_integrated_segments` skipped metadata-tracked stacks
entirely and walked top-down looking for the first integrated commit.
This had two problems:

- Integrated commits *above* the target were incorrectly pruned, even
  though the user hasn't advanced the target via upstream integration.
- The `is_metadata_tracked` guard hid all integrated state from those
  stacks, preventing `integrate_upstream` from detecting them.

Replace the approach with HashSet-based graph reachability:

1. Walk the graph from the target segment outward
   (`visit_all_segments_excluding_start_until`) to collect all commit
   IDs that are ancestors of the target.
2. Slice the target segment itself from the target commit's actual
   position downward — this avoids the assumption that the target is
   always the first commit in its segment.
3. Walk each stack bottom-up, pruning the contiguous integrated tail
   only where commits are in the `at_or_below_target` set. When a
   segment is fully emptied by pruning, remove it entirely.
4. Preserve metadata-tracked empty branches that were already empty
   (e.g. a branch just added at the fork point with no commits yet),
   but remove everything else that ends up empty after pruning.

Add a regression test (`integrated_commits_above_target_are_kept`) that
verifies integrated commits above the target survive pruning.

Preserve user-added empty branches at fork point

When truncating stacks to the fork point we previously removed any empty
segments, which also discarded branches that the user had just created
at the fork point but that had no commits yet. This change defers
removal of empty segments to a new helper that keeps segments if they
are referenced in workspace metadata (i.e. a branch the user added at
the fork point). It also correctly handles trimming a segment that is
completely pruned by removing the segment rather than leaving an empty
one.
Copilot AI review requested due to automatic review settings May 6, 2026 16:16
@mtsgrd mtsgrd force-pushed the improve-integrated-pruning branch from e256a70 to f56c6ab Compare May 6, 2026 16:16
@mtsgrd
Copy link
Copy Markdown
Contributor Author

mtsgrd commented May 6, 2026

Here's some context on each changed snapshot!

Snapshot 1: no_overzealous_stacks_due_to_workspace_metadata (but-graph)

Git graph:

W2 (835086d) ── W1 (ff310d3) ── M3 (a821094, origin/main, feat-2) ── M2 ── M1 (3183e43)
X2 (0b203b5) ── X1 (4840f3b, origin/X) ── M1 (3183e43)

Metadata: stack 1 = X, stack 2 = feat-2. Both InWorkspace.

-└── ≡:7:anon: on a821094 {2}
-    └── :7:anon:
-        ├── ·835086d (🏘️) ►four, ►three
-        └── ·ff310d3 (🏘️)
+└── ≡:7:anon: on 3183e43 {2}
+    ├── :7:anon:
+    │   ├── ·835086d (🏘️) ►four, ►three
+    │   └── ·ff310d3 (🏘️)
+    └── 📙:2:feat-2

feat-2 points at M3 (= origin/main), fully integrated. Previously its segment was
destroyed by truncate, so remove_empty_branches never saw it. Now the segment
survives and the branch is preserved because it's metadata-tracked. The stack base
shifts from a821094 to 3183e43 because the lower segments are now visible.


Snapshots 2–5: dlib_rs_auto_fix (but-meta)

Legacy scenario from a real dlib-rs repo. Stack 5 tracks main.

Snapshot 2 (line 1272, pre-reconciliation workspace)

-📕🏘️:0:gitbutler/workspace <> ✓refs/remotes/origin/main on bce0c5e
+📕🏘️:0:gitbutler/workspace <> ✓refs/remotes/origin/main on 3183e43
+└── ≡📙:5:main[🌳] <> origin/main →:1: on 3183e43 {1}
+    └── 📙:5:main[🌳] <> origin/main →:1:

Previously main's segment was destroyed, the stack became empty, and
stacks.retain removed it. Now the segment survives and the metadata-tracked
main branch is preserved.

Snapshot 3 (line 1292, reconciled target SHA)

-sha: Sha1(bce0c5efc577b90e52a8ba20c4c41621af3134d3),
+sha: Sha1(3183e43ff482a2c4c8ff531d595453b64f58d90b),

Downstream effect: the workspace lower bound now includes the preserved main
stack which reaches down to M1, so reconciliation adjusts the target SHA.

Snapshot 4 (line 1307, post-reconciliation workspace)

-📕🏘️:0:gitbutler/workspace <> ✓refs/remotes/origin/main on bce0c5e
+📕🏘️:0:gitbutler/workspace <> ✓refs/remotes/origin/main on 3183e43
+└── ≡📙:2:main[🌳] <> origin/main →:1: on 3183e43 {1}
+    └── 📙:2:main[🌳] <> origin/main →:1:
+        └── ❄️bce0c5e (🏘️|✓)

After reconciliation the target moved to 3183e43, so bce0c5e is now above
the target. The pruning rule keeps integrated commits above the target — they'll
be pruned once the target advances past them.

Snapshot 5 (line 1317, reconciled stack metadata)

             ],
-            workspacecommit_relation: Outside,
+            workspacecommit_relation: Merged,
+        },
+        WorkspaceStack {
+            id: 2,
+            branches: [
+                WorkspaceStackBranch {
+                    ref_name: "refs/heads/main",
+                    archived: false,
+                },
+            ],
+            workspacecommit_relation: Outside,
         },

Reconciliation now sees main in the workspace projection and creates a stack
entry for it. Stack 1 changed from Outside to Merged because its content is now
recognized as merged relative to the newly visible main stack.


Appendix: Emoji legend

Segment metadata

Emoji Meaning
📕 Workspace segment (SegmentMetadata::Workspace)
📙 Branch segment (SegmentMetadata::Branch) — metadata-tracked
(none) Anonymous segment — no metadata association

Commit kind (push status)

Emoji Meaning
❄️ ReachableByMatchingRemote — pushed to own remote (force-push req)
ReachableByRemote — pushed to some other remote
🟣 RemoteOnly — exists only on remote
· Local only — not pushed

Commit flags (in parentheses)

Symbol Flag Meaning
🏘️ InWorkspace Reachable from a workspace tip
Integrated Reachable from the target branch (e.g. origin/main)

Stack & ref symbols

Symbol Meaning
Stack header prefix
[🌳] Branch is checked out in the main worktree
Extra refs at the same commit (e.g. ►four, ►three)
⇡N N local commits ahead of remote tracking branch
on <sha> Stack base commit
{N} Stack ID from workspace metadata

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLI The command-line program `but` rust Pull requests that update Rust code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants