chore: Add version bump automation script #2519
Hey @smamindl 👋! We use semantic commit messages to streamline the release process. Examples of commit messages with semantic prefixes:
To test your commit locally, please follow our guide on building from source.
Dependency Review
✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found.
Snapshot Warnings
Ensure that dependencies are being submitted on PR branches and consider enabling retry-on-snapshot-warnings. See the documentation for more information and troubleshooting advice.
Scanned Files
None
Pull request overview
Adds a repo-wide version bump automation script (scripts/bump-version.py) intended to safely update SynapseML version strings across the repository while avoiding partial matches and protected paths (e.g., versioned_docs/, package.json).
Changes:
- Introduces a Python CLI tool to search/replace exact version strings using a boundary-aware regex.
- Adds denylist-based directory/file skipping, plus dry-run and verbose reporting modes.
- Implements pre-run “regex safety checks” to validate matching behavior against edge cases.
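A minimal sketch of what a boundary-aware match could look like. It assumes the rule "not preceded by a digit or dot, and not followed by a digit or by a dot plus a digit"; the helper names `boundary_regex` and `bump` are hypothetical, and the actual pattern in scripts/bump-version.py may differ.

```python
import re


def boundary_regex(version: str) -> re.Pattern:
    """Match `version` only when it is not part of a longer version string."""
    return re.compile(
        r"(?<![0-9.])"          # no digit or dot immediately before
        + re.escape(version)    # dots in the version are literal, not wildcards
        + r"(?!\.?[0-9])"       # no digit, and no ".<digit>", immediately after
    )


def bump(text: str, old: str, new: str) -> str:
    """Replace every bounded occurrence of `old` with `new`."""
    # A callable replacement keeps `new` literal (no backslash processing).
    return boundary_regex(old).sub(lambda m: new, text)


if __name__ == "__main__":
    print(bump("synapseml_2.12:1.0.11 and 1.0.115", "1.0.11", "1.0.12"))
```

Under these assumptions a trailing sentence period ("Released 1.0.11.") still matches, while 1.0.115, 11.0.11, and 1.0.11.2 are left alone.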
    try:
        content = file_path.read_text(encoding="utf-8")
    except (UnicodeDecodeError, PermissionError):
        return 0, []

    matches = list(regex.finditer(content))
    if not matches:
        return 0, []

    changes = []
    lines = content.split("\n")
    seen_lines = set()
    for match in matches:
        line_num = content[:match.start()].count("\n") + 1
        if line_num not in seen_lines:
            seen_lines.add(line_num)
            line_text = lines[line_num - 1].strip()
            # Truncate long lines, showing context around the version
            if len(line_text) > 120:
                idx = line_text.find(old_version)
                if idx >= 0:
                    start = max(0, idx - 40)
                    end = min(len(line_text), idx + len(old_version) + 40)
                    line_text = "..." + line_text[start:end] + "..."
            changes.append(f" L{line_num}: {line_text}")

    count = len(matches)

    if not dry_run:
        new_content = regex.sub(new_version, content)
        file_path.write_text(new_content, encoding="utf-8")
Path.read_text() / write_text() in text mode will normalize line endings (e.g., CRLF -> LF) when rewriting files, which can create large noisy diffs unrelated to the version bump. Consider preserving original newlines (e.g., opening with newline='' and writing back with the same newline style, or operating on bytes while keeping the original line endings) so the script only changes the version string.
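One way to follow this suggestion is to operate on bytes, so no newline translation happens on either read or write. The sketch below is an illustration only: `replace_preserving_newlines` is a hypothetical helper, and it uses a plain literal match rather than the script's boundary-aware regex.

```python
import re
from pathlib import Path


def replace_preserving_newlines(path: Path, old: str, new: str) -> int:
    """Replace `old` with `new` in `path` without touching line endings."""
    data = path.read_bytes()  # raw bytes: CRLF stays CRLF
    pattern = re.compile(re.escape(old.encode("utf-8")))
    updated, count = pattern.subn(lambda m: new.encode("utf-8"), data)
    if count:
        path.write_bytes(updated)  # written back byte-for-byte
    return count
```

The text-mode alternative mentioned in the comment is to open the file with `newline=""` for both reading and writing, which also disables newline translation.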
82f5f3d to 6e7a37b
Adds scripts/bump-version.py for safely updating version strings across the repo during releases. Uses word-boundary-aware regex to prevent partial matches (e.g., won't corrupt 1.0.115 when bumping 1.0.11).

Features:
- Dry-run mode to preview changes
- Denylist for versioned_docs/, package.json, etc.
- Regex safety verification before applying
- Summary report of all changes

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
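The denylist named in this commit message could be sketched roughly as follows. Only versioned_docs/ and package.json come from the message; the constant and function names are hypothetical, and the real script may match paths differently.

```python
from pathlib import PurePosixPath

# Entries taken from the commit message; the real denylist is longer ("etc.").
DENY_DIRS = {"versioned_docs"}
DENY_FILES = {"package.json"}


def is_denied(relative_path: str) -> bool:
    """Return True if a repo-relative path must never be rewritten."""
    path = PurePosixPath(relative_path)
    if path.name in DENY_FILES:
        return True
    # Compare directory components only, so a file merely named
    # "versioned_docs.md" is not skipped by accident.
    return any(part in DENY_DIRS for part in path.parts[:-1])


def candidate_files(paths):
    """Keep only the paths the bump is allowed to touch."""
    return [p for p in paths if not is_denied(p)]
```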
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Use read_bytes/write_bytes instead of read_text/write_text to avoid CRLF -> LF normalization that would create noisy diffs unrelated to the version bump.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Redesigned version bump from bare regex to context-anchored substitution. Every replacement is proven SynapseML-related by surrounding text, so it is safe for fully unattended automated releases.

Features:
- 20 context patterns (self/line/file-anchored)
- Auto-detects current version from docusaurus.config.js
- Enforces X.Y.Z format, version must increase
- Two-pass: read all → validate → write all
- Post-condition self-verification (re-reads files after write)
- Broad sweep warns about old version in unscanned files
- Runs docusaurus docs:version when website/docs/ exists
- 175 tests: unit, integration, fuzz, adversarial, round-trip, snapshot regression, and 10 historical bump replays

Usage: python scripts/bump-version.py --to 1.1.3 [--dry-run]

Co-authored-by: smamindl <106691906+smamindl@users.noreply.github.com>
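The two-pass flow, the X.Y.Z check, and the post-write verification described in this commit message might be sketched as below. This is an assumption-laden illustration: `parse_version`, `plan_changes`, and `apply_changes` are hypothetical names, and a plain `bytes.replace` stands in for the script's context patterns.

```python
import re
from pathlib import Path

VERSION_RE = re.compile(r"[0-9]+\.[0-9]+\.[0-9]+")


def parse_version(text: str) -> tuple:
    """Parse a strict X.Y.Z version into a tuple of ints."""
    if not VERSION_RE.fullmatch(text):
        raise ValueError(f"not an X.Y.Z version: {text!r}")
    return tuple(int(part) for part in text.split("."))


def plan_changes(files, old: str, new: str) -> dict:
    """Pass 1: validate, read every file, compute new contents; write nothing."""
    if parse_version(new) <= parse_version(old):
        raise ValueError("new version must be greater than the old one")
    planned = {}
    for path in files:
        data = path.read_bytes()
        # Stand-in for the real context-anchored substitution.
        updated = data.replace(old.encode("utf-8"), new.encode("utf-8"))
        if updated != data:
            planned[path] = updated
    return planned


def apply_changes(planned: dict) -> None:
    """Pass 2: write all planned contents, then re-read each file to verify."""
    for path, data in planned.items():
        path.write_bytes(data)
    for path, data in planned.items():
        if path.read_bytes() != data:
            raise RuntimeError(f"post-condition failed for {path}")
```

Because nothing is written until every file has been read and validated, a failure in pass 1 leaves the working tree untouched.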
6e7a37b to dc8e98c
Summary
Adds scripts/bump-version.py, a release automation tool that safely updates version strings across the repo during version bumps, and scripts/test_bump_version.py, a 175-test suite covering unit, integration, fuzz, adversarial, round-trip, snapshot, and historical replay tests.
Problem
Version bumps currently require manual find-and-replace across the repo, which is error-prone:
- Partial matches (e.g., corrupting 1.0.115 when bumping 1.0.11)
- Accidental changes to versioned_docs/ (historical snapshots)
- Accidental changes to package.json (npm deps)
Solution
A context-anchored substitution system that replaces version strings only when they appear within a known SynapseML-specific context. A bare version like 1.1.0 is never blindly replaced; every replacement must be anchored by surrounding text that proves it refers to SynapseML (e.g., synapseml_2.12:1.1.0, SYNAPSEML_VERSION=1.1.0).
If a version reference has no SynapseML anchor, the script refuses to run rather than guessing.
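A rough sketch of the idea, using only the two anchors quoted above. The function names are hypothetical, the real script reportedly uses 20 patterns, and treating any leftover bare match as a hard error is one possible reading of "refuses to run".

```python
import re

NOT_LONGER = r"(?!\.?[0-9])"  # the version must not continue with more digits


def anchored_patterns(old: str):
    """Build patterns that match `old` only next to a SynapseML anchor."""
    v = re.escape(old)
    return [
        # Maven-style coordinate, e.g. synapseml_2.12:1.1.0
        re.compile(r"(synapseml_[0-9.]+:)" + v + NOT_LONGER),
        # Environment-style assignment, e.g. SYNAPSEML_VERSION=1.1.0
        re.compile(r"(SYNAPSEML_VERSION=)" + v + NOT_LONGER),
    ]


def bump_anchored(text: str, old: str, new: str) -> str:
    """Replace `old` only where an anchor (capture group 1) precedes it."""
    for pattern in anchored_patterns(old):
        text = pattern.sub(lambda m: m.group(1) + new, text)
    return text


def bump_or_refuse(text: str, old: str, new: str) -> str:
    """Bump anchored references; raise if a bare `old` is left behind."""
    result = bump_anchored(text, old, new)
    bare = re.compile(r"(?<![0-9.])" + re.escape(old) + NOT_LONGER)
    if bare.search(result):
        raise ValueError(f"unanchored reference to {old}; refusing to guess")
    return result
```

Keeping the anchor in a capture group and re-emitting it means only the version itself changes, never the surrounding text.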
Usage
Features
- Auto-detects the current version from docusaurus.config.js
- Runs docusaurus docs:version when website/docs/ exists
Test Suite (175 tests)