
feat(cache): add explicit inputs config for cache fingerprinting #104

Merged: branchseer merged 32 commits into main from 01-14-feat_explicit_inputs on Mar 9, 2026



branchseer commented Jan 13, 2026

Summary

  • Add inputs field to task configuration for explicit cache fingerprinting
  • Support glob patterns, auto-inference from fspy, negative patterns, and mixed mode
  • Expand trailing / in globs to /** (e.g., "src/" → "src/**")
  • Bare directory names (e.g., "src" without /) match nothing — only files are fingerprinted
  • Remove fstat interception in fspy (fd was already tracked via open)
  • Skip duplicate files across overlapping glob patterns via entry API
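
The deduplication in the last bullet can be pictured with the standard-library map entry API. This is only a sketch under stated assumptions: `fingerprint_files`, `hash_content`, and the use of `DefaultHasher` are hypothetical stand-ins, not the code in `glob_inputs.rs`, and file contents are passed in directly instead of being read from disk.

```rust
use std::collections::btree_map::{BTreeMap, Entry};
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical content hash; the real hashing scheme is not shown in this PR page.
fn hash_content(content: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    content.hash(&mut hasher);
    hasher.finish()
}

/// `matches_per_glob` holds, for each glob, the (path, content) pairs it matched.
/// Returns the path -> hash map and how many files were actually hashed.
fn fingerprint_files(matches_per_glob: &[Vec<(&str, &[u8])>]) -> (BTreeMap<String, u64>, usize) {
    let mut hashes = BTreeMap::new();
    let mut hashed = 0;
    for matches in matches_per_glob {
        for &(path, content) in matches {
            // A vacant entry means no earlier glob produced this path yet.
            // An occupied entry is skipped, so the file is not hashed twice.
            if let Entry::Vacant(slot) = hashes.entry(path.to_string()) {
                slot.insert(hash_content(content));
                hashed += 1;
            }
        }
    }
    (hashes, hashed)
}

fn main() {
    // Two overlapping globs, e.g. "src/**" and "src/**/*.ts", both match src/a.ts.
    let all_src = vec![
        ("src/a.ts", b"export {}".as_slice()),
        ("src/b.css", b"body {}".as_slice()),
    ];
    let ts_only = vec![("src/a.ts", b"export {}".as_slice())];

    let (hashes, hashed) = fingerprint_files(&[all_src, ts_only]);
    println!("{} unique files, {} hash computations", hashes.len(), hashed);
}
```

A sorted map also gives a stable iteration order, which is convenient if the map is later serialized into a cache key.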

Input modes

  • Explicit globs: inputs: ["src/**/*.ts"]
  • Auto-inference: inputs: [{ "auto": true }]
  • Negative patterns: inputs: ["src/**", "!**/*.test.ts"]
  • Directory shorthand: inputs: ["src/"] (expands to "src/**")
  • Mixed: inputs: ["package.json", { "auto": true }, "!dist/**"]
  • Empty (no file tracking): inputs: []
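
As a rough illustration of how these modes could be normalized, the sketch below splits entries into positive globs, negative globs, and an auto flag, applying the trailing-slash expansion along the way. `InputEntry`, `ResolvedInputs`, and `resolve_inputs` are hypothetical names chosen for this example; the PR's `UserInputEntry` and `ResolvedInputConfig` may be shaped differently, and storing negative globs without the leading `!` is an assumption.

```rust
/// Hypothetical mirror of one `inputs` entry: a glob string or `{ "auto": true }`.
#[derive(Debug, Clone, PartialEq)]
enum InputEntry {
    Glob(String),
    Auto(bool),
}

/// Hypothetical normalized form of the whole `inputs` array.
#[derive(Debug, Default, PartialEq)]
struct ResolvedInputs {
    positive: Vec<String>,
    /// Assumption: stored without the leading `!`.
    negative: Vec<String>,
    auto: bool,
}

/// Expand a trailing `/` to `/**`; every other pattern is left untouched,
/// so a bare "src" stays "src".
fn expand_trailing_slash(pattern: &str) -> String {
    if pattern.ends_with('/') {
        format!("{pattern}**")
    } else {
        pattern.to_string()
    }
}

fn resolve_inputs(entries: &[InputEntry]) -> ResolvedInputs {
    let mut resolved = ResolvedInputs::default();
    for entry in entries {
        match entry {
            InputEntry::Auto(enabled) => resolved.auto |= *enabled,
            InputEntry::Glob(raw) => match raw.strip_prefix('!') {
                Some(negated) => resolved.negative.push(expand_trailing_slash(negated)),
                None => resolved.positive.push(expand_trailing_slash(raw)),
            },
        }
    }
    resolved
}

fn main() {
    // Mixed mode with directory shorthand on both a positive and a negative entry.
    let entries = vec![
        InputEntry::Glob("package.json".to_string()),
        InputEntry::Glob("src/".to_string()),
        InputEntry::Auto(true),
        InputEntry::Glob("!dist/".to_string()),
    ];
    println!("{:?}", resolve_inputs(&entries));
}
```

With an empty array the result has no globs and `auto` is false, which lines up with the "no file tracking" mode above.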

Test plan

  • Plan snapshot: inputs-trailing-slash verifies src/ → src/** and !dist/ → !dist/**
  • E2E: all input combinations (positive, negative, auto, mixed, empty)
  • E2E: folder-slash-input — cache miss on direct and nested file changes, hit on outside
  • E2E: folder-input — bare directory name fingerprints nothing
  • E2E: glob meta chars in package paths (packages/[lib])
  • E2E: cross-package .. globs in subpackages
  • Unit tests: overlapping globs deduplicate, negative exclusions, sibling packages

🤖 Generated with Claude Code


branchseer commented Jan 13, 2026

branchseer changed the base branch from 01-14-e2e_timeout_support to graphite-base/104 on January 14, 2026 03:59
branchseer force-pushed the 01-14-feat_explicit_inputs branch from 2068511 to fa3e209 on January 14, 2026 04:02
branchseer changed the base branch from graphite-base/104 to 01-14-e2e_timeout_support on January 14, 2026 04:02
branchseer changed the base branch from 01-14-e2e_timeout_support to graphite-base/104 on January 14, 2026 14:54
branchseer force-pushed the 01-14-feat_explicit_inputs branch 2 times, most recently from d0e61ec to 379326a, on March 5, 2026 04:15
branchseer changed the base branch from graphite-base/104 to claude/plan-workspace-root-g14Nn on March 5, 2026 04:15
branchseer force-pushed the 01-14-feat_explicit_inputs branch 2 times, most recently from 419d488 to 01a285f, on March 5, 2026 04:46
branchseer changed the base branch from claude/plan-workspace-root-g14Nn to graphite-base/104 on March 5, 2026 06:13
branchseer force-pushed the 01-14-feat_explicit_inputs branch from 0b05438 to cacdc11 on March 5, 2026 06:15
graphite-app bot changed the base branch from graphite-base/104 to main on March 5, 2026 06:15
branchseer force-pushed the 01-14-feat_explicit_inputs branch from cacdc11 to d36735f on March 5, 2026 06:15
branchseer requested a review from Copilot on March 5, 2026 06:29

Copilot AI left a comment


Pull request overview

This PR adds an explicit inputs field to the task configuration for controlling cache fingerprinting. Instead of always using fspy (file system spy) to infer file dependencies, users can now specify glob patterns to explicitly declare which files should trigger cache invalidation, or use a combination of auto-inference and explicit patterns.

Changes:

  • Add UserInputEntry, UserInputsConfig, and ResolvedInputConfig types for parsing and normalizing user input configuration, with support for glob patterns, auto-inference directives, and negative patterns
  • Replace the monolithic SpawnTrackResult with separate std_outputs and TrackedPathAccesses to decouple output capturing from fspy tracking, and introduce PreRunFingerprint (combining spawn fingerprint + globbed inputs) as the new cache key, replacing the old SpawnFingerprint-based key
  • Add glob_inputs.rs module for walking glob patterns and hashing files, with comprehensive unit tests and e2e test fixtures covering all input configuration combinations including subpackage and cross-package (..) scenarios
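
To make the cache-key change in the second bullet concrete, here is a minimal sketch of a key that combines a spawn fingerprint with globbed input hashes, as the overview describes it. The struct layouts, field names, and `cache_key` helper are assumptions made for illustration; the real `PreRunFingerprint` and `SpawnFingerprint` fields and their serialization are not shown on this page.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for the spawn fingerprint: what is about to be run.
#[derive(Hash, Clone)]
struct SpawnFingerprintSketch {
    program: String,
    args: Vec<String>,
    envs: BTreeMap<String, String>,
}

/// Hypothetical stand-in for the new cache key: the spawn fingerprint plus
/// the content hashes of every file matched by the explicit input globs.
#[derive(Hash, Clone)]
struct PreRunFingerprintSketch {
    spawn: SpawnFingerprintSketch,
    globbed_inputs: BTreeMap<String, u64>,
}

/// Collapse the fingerprint into one number, purely for demonstration.
fn cache_key(fingerprint: &PreRunFingerprintSketch) -> u64 {
    let mut hasher = DefaultHasher::new();
    fingerprint.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    let before = PreRunFingerprintSketch {
        spawn: SpawnFingerprintSketch {
            program: "tsc".to_string(),
            args: vec!["--noEmit".to_string()],
            envs: BTreeMap::new(),
        },
        globbed_inputs: BTreeMap::from([("src/index.ts".to_string(), 1u64)]),
    };

    // Same command, but the matched file now has a different content hash.
    let mut after = before.clone();
    after.globbed_inputs.insert("src/index.ts".to_string(), 2u64);

    println!("key before: {:x}", cache_key(&before));
    println!("key after:  {:x}", cache_key(&after));
}
```

Because the globbed inputs are part of the key, an edit to a matched file can be detected before the task runs, without relying on fspy to have observed the file in a previous run.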

Reviewed changes

Copilot reviewed 137 out of 141 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| crates/vite_task_graph/src/config/user.rs | Adds UserInputEntry enum, UserInputsConfig type, inputs field to EnabledCacheConfig, and unit tests |
| crates/vite_task_graph/src/config/mod.rs | Adds ResolvedInputConfig with from_user_config resolution logic and unit tests |
| crates/vite_task_graph/run-config.ts | TypeScript type definition for the inputs field |
| crates/vite_task_graph/Cargo.toml | Adds bincode dependency for ResolvedInputConfig serialization |
| crates/vite_task_plan/src/cache_metadata.rs | Adds input_config and glob_base to CacheMetadata, removes fingerprint_ignores from SpawnFingerprint |
| crates/vite_task_plan/src/plan.rs | Passes package_path to plan_spawn_execution for glob resolution base |
| crates/vite_task/src/session/execute/glob_inputs.rs | New module for glob-based input file discovery and hashing |
| crates/vite_task/src/session/execute/fingerprint.rs | Updates PostRunFingerprint to use inferred_inputs with negative glob filtering |
| crates/vite_task/src/session/execute/spawn.rs | Splits SpawnTrackResult into TrackedPathAccesses and separate std_outputs |
| crates/vite_task/src/session/execute/mod.rs | Orchestrates globbed input computation, conditional fspy tracking, and updated cache operations |
| crates/vite_task/src/session/cache/mod.rs | Introduces PreRunFingerprint as cache key, bumps DB version to 7, updates cache hit/update logic |
| crates/vite_task/src/session/cache/display.rs | Removes FingerprintIgnoreAdded/Removed change variants |
| crates/vite_task/Cargo.toml | Replaces vite_glob with wax and path-clean, adds tempfile for tests |
| crates/fspy/src/unix/mod.rs | Sets FSPY=1 env when fspy tracking is enabled |
| crates/fspy/src/windows/mod.rs | Sets FSPY=1 env when fspy tracking is enabled |
| packages/tools/src/print-file.ts | Supports reading multiple files for e2e test commands |
| docs/inputs.md | Comprehensive documentation for the inputs configuration |
| crates/vite_task_bin/src/main.rs | Adds inputs: None to default EnabledCacheConfig |
| crates/vite_task_bin/src/lib.rs | Adds inputs: None to synthesized EnabledCacheConfig instances |
| E2e fixture files | Test fixtures for inputs-cache-test, glob-base-test, inputs-negative-glob-subpackage |


branchseer force-pushed the 01-14-feat_explicit_inputs branch 2 times, most recently from a3484e6 to d1320a2, on March 8, 2026 15:19
branchseer and others added 4 commits March 8, 2026 23:59
Add `inputs` field to task configuration supporting:
- Explicit glob patterns: `inputs: ["src/**/*.ts"]`
- Auto-inference from fspy: `inputs: [{ auto: true }]`
- Negative patterns: `inputs: ["src/**", "!**/*.test.ts"]`
- Mixed mode: `inputs: ["package.json", { auto: true }, "!dist/**"]`
- Empty array to disable file tracking: `inputs: []`

Key changes:
- Add `ResolvedInputConfig` to parse and normalize user input config
- Add `glob_inputs.rs` for walking glob patterns and hashing files
- Update `PreRunFingerprint` to include `input_config` and `glob_base`
- Bump cache DB version to 6 for new fingerprint structure
- Add comprehensive e2e tests for all input combinations

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…nings

Drop redundant suffixes from FingerprintMismatch variants (e.g.
SpawnFingerprintMismatch → SpawnFingerprint) to fix enum_variant_names
lint. Also fix if_not_else and doc_markdown warnings.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… framing

The cache key and value are no longer conceptually divided as pre-run
vs post-run fingerprints. Update doc comments to describe what each
struct contains directly, and remove stale `# Arguments` sections.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
branchseer and others added 20 commits March 8, 2026 23:59
Replace the ResolvedNegativeGlob tuple type alias with a proper
AnchoredGlob struct that encapsulates glob partitioning, path cleaning,
and prefix-based matching behind a clean API.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Walk positive globs using wax, rerooting negative globs onto a common
ancestor so wax can prune entire directory subtrees via `.not()`. This
handles all prefix relationship cases (equal, ancestor, descendant,
unrelated) correctly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move path_bridge, rerooted_pattern, common_ancestor, and escape_glob
from walk.rs into anchored.rs, exposed as AnchoredGlob::reroot() and
has_related_prefix() methods. This simplifies the walk module by
encapsulating glob rerooting within the type that owns the data.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move AnchoredGlob and the filesystem walk logic out of vite_glob,
leaving only the core glob matching and error types.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…aph stage

Move glob pattern resolution from execution time to task graph construction,
making all glob patterns workspace-root-relative. This eliminates AnchoredGlob
usage, removes glob_base from CacheMetadata/CacheEntryKey, and simplifies the
execution pipeline.

Key changes:
- Add resolve_glob_to_workspace_relative() in vite_task_graph config
- Remove glob_base field from CacheMetadata and CacheEntryKey
- Update glob_inputs to work with workspace-root-relative patterns
- Update spawn.rs negative glob filtering with path cleaning
- Bump cache DB version from 9 to 10
- Remove vite_glob dependency from vite_task
- Remove plan file

https://claude.ai/code/session_01PR9yhnScRoVoHUcviV47u5
Move path_clean::PathClean normalization into the strip_path_prefix
callback so all fspy-reported paths are clean from the start. This
removes the need for separate cleaning in the negative glob filter
and in PostRunFingerprint::create.

https://claude.ai/code/session_01PR9yhnScRoVoHUcviV47u5
Move .git filtering and negative glob filtering into the
strip_path_prefix callback alongside path cleaning, so rejected
paths return None immediately in one pass.

https://claude.ai/code/session_01PR9yhnScRoVoHUcviV47u5
The cleaned path (with `..` normalized) is used solely for matching
against negative globs. The original stripped path is returned and
stored in the fingerprint, keeping create/validate consistent.

Restore fingerprint.rs to use the path as-is since it already contains
the original fspy-reported relative path.

https://claude.ai/code/session_01PR9yhnScRoVoHUcviV47u5
…irect path_clean usage

Move path_clean dependency into vite_path and expose it through typed
clean() methods on AbsolutePath and RelativePath, documenting the
symlink limitation of purely lexical normalization.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…kspace_relative

RelativePathBuf guarantees valid UTF-8 with forward slashes, so the
manual to_str() check and backslash normalization are no longer needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…x .not() combinator

Replace manual partition()+branch logic with direct glob.walk(workspace_root),
and replace manual is_match negative filtering with wax's FileIterator::not().

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fstat(fd) uses F_GETPATH on macOS which returns canonical paths, while
open(path) preserves raw relative paths. This caused the same file to be
recorded under two different paths, leading to non-deterministic cache
miss messages via rayon's find_map_any.

Also simplify glob_inputs negation handling (always use .not()) and add
e2e test for glob meta characters in package paths (wax::escape).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Verifies that inputs: ["src"] (a directory, not a glob) fingerprints
nothing — file changes inside and folder deletion are both cache hits.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Input patterns like `"src/"` are now treated as `"src/**"`, matching all
files recursively under that directory. This applies to both positive and
negative globs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…-8 paths

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
branchseer force-pushed the 01-14-feat_explicit_inputs branch from bc92cfd to 4311180 on March 8, 2026 16:00
branchseer and others added 2 commits March 9, 2026 00:28
fspy reports paths with `..` components (e.g. `packages/sub-pkg/../shared/src/utils.ts`)
on macOS but normalized paths on Linux/Windows. Always clean `..` before storing to ensure
consistent cache miss messages across platforms.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
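
For readers unfamiliar with lexical path cleaning, the normalization this commit relies on can be approximated with the standard library alone. This is a sketch, not the `path-clean` crate or the `clean()` methods in `vite_path`; `clean_lexically` is a hypothetical helper. It resolves `..` purely by looking at path components, so it shares the symlink limitation noted in an earlier commit: if `sub-pkg` were a symlink, the cleaned path could point somewhere other than where the OS would resolve it.

```rust
use std::path::{Component, Path, PathBuf};

/// Remove `.` and resolve `..` against preceding components without touching the filesystem.
fn clean_lexically(path: &Path) -> PathBuf {
    let mut cleaned = PathBuf::new();
    for component in path.components() {
        match component {
            // `.` never changes the location.
            Component::CurDir => {}
            Component::ParentDir => {
                let last_is_normal =
                    matches!(cleaned.components().next_back(), Some(Component::Normal(_)));
                if last_is_normal {
                    // `a/b/..` collapses to `a`.
                    cleaned.pop();
                } else if !cleaned.has_root() {
                    // A leading `..` in a relative path has nothing to cancel, so keep it.
                    cleaned.push("..");
                }
                // `..` directly under a root is dropped: `/..` is still `/`.
            }
            other => cleaned.push(other.as_os_str()),
        }
    }
    cleaned
}

fn main() {
    // The path shape quoted in the commit message above.
    let reported = Path::new("packages/sub-pkg/../shared/src/utils.ts");
    println!("{}", clean_lexically(reported).display());
}
```

Storing the cleaned form means the same file yields the same recorded path whether or not the platform's tracer reports `..` components.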
branchseer marked this pull request as ready for review on March 8, 2026 21:49
branchseer requested a review from fengmk2 on March 8, 2026 21:49

branchseer commented Mar 9, 2026

Merge activity

  • Mar 9, 1:15 AM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Mar 9, 1:15 AM UTC: @branchseer merged this pull request with Graphite.

branchseer merged commit 09f1343 into main on Mar 9, 2026
7 checks passed
branchseer deleted the 01-14-feat_explicit_inputs branch on March 9, 2026 01:15

4 participants