
WIP: Agentic deploy scripts, Dockerfile.dev, and CRD fixes #1582

Draft
harche wants to merge 26 commits into openshift:main from harche:wt/e2e-testing

harche commented Apr 30, 2026

Summary

  • Add agentic deploy scripts for building and deploying proposal components
  • Integrate agentic controller, console reconciler, and deploy scripts
  • Add Dockerfile.dev for local module builds, fix deploy scripts
  • Fix deploy scripts and CRDs for e2e testing
  • Align deploy scripts and Makefile with agentic API review changes

Status

🚧 Work in progress — not ready for review

🤖 Generated with Claude Code

harche and others added 5 commits April 28, 2026 11:07
…ents

Unified deploy-agentic.sh with --provider=vertex|bedrock flag, plus 5
redeploy scripts for fast iteration on individual components. Shared
library (agentic-lib.sh) handles worktree-aware image tagging, registry
push via skopeo, operator pause/resume, and proposal API chain setup
using the new v1alpha1 CRDs (LLMProvider, Agent, ComponentTools, Workflow).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
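
The `--provider=vertex|bedrock` flag described in this commit could be handled along the following lines. This is only a minimal sketch: the function name `parse_provider` and its exit codes are hypothetical, and the real `deploy-agentic.sh` may parse its arguments differently.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: extract and validate --provider=vertex|bedrock.
# The function name and exit code are illustrative, not taken from the PR.
parse_provider() {
  local provider=""
  local arg
  for arg in "$@"; do
    case "$arg" in
      # Strip the flag prefix, keeping only the value after '='.
      --provider=*) provider="${arg#--provider=}" ;;
    esac
  done
  case "$provider" in
    vertex|bedrock)
      printf '%s\n' "$provider"
      ;;
    "")
      echo "error: --provider=vertex|bedrock is required" >&2
      return 2
      ;;
    *)
      echo "error: unsupported provider '$provider'" >&2
      return 2
      ;;
  esac
}
```

A caller might use it as `PROVIDER="$(parse_provider "$@")" || exit 2`, so that an unknown provider fails before any cluster resources are touched.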
- Import lightspeed-agentic-operator as Go module (replace → harche fork)
- Register agentic scheme + ProposalReconciler in main.go (--enable-agentic)
- Add console reconciler: operator deploys agentic console plugin at startup
  via --agentic-console-image flag (Deployment, Service, ConsolePlugin, activation)
- Generate agentic CRDs via manifests-agentic Makefile target
- Add config/rbac-agentic/ with ClusterRole + ClusterRoleBinding
- Deploy scripts: podman/docker auto-detect, base SandboxTemplate,
  --enable-agentic + --agentic-console-image injection, undeploy script
- Update agent-sandbox controller to v0.4.2
- Fix VERTEX_REGION default (us-east5 → global), mktemp macOS compat,
  build error visibility, Containerfile for agent sandbox

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
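
For the podman/docker auto-detection mentioned above, one common approach is to probe the `PATH` for each candidate in order of preference. The helper below is a sketch under that assumption; `first_available_tool` and `CONTAINER_TOOL` are hypothetical names, not identifiers confirmed by the PR.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: print the first command from the argument list
# that exists on PATH; return non-zero if none is found.
first_available_tool() {
  local tool
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
      return 0
    fi
  done
  return 1
}

# Example use (assumed variable name):
#   CONTAINER_TOOL="$(first_available_tool podman docker)" \
#     || { echo "error: neither podman nor docker found" >&2; exit 1; }
```

Listing podman first makes it the preferred tool when both are installed; swapping the argument order reverses that preference.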
- Dockerfile.dev: builds operator using local lightspeed-agentic-operator
  source instead of fetching from GitHub. No push needed for dev iteration.
- Fix deploy scripts: correct console deployment name, show rollout errors,
  auto-tag images as :latest in worktrees, remove dead chat pod restart,
  fix bash local variable scoping.
- Regenerate CRDs for ProposalTemplate + updated Proposal spec.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Deploy script fixes:
- Use Dockerfile.dev (local agentic-operator) in build_push_operator
- Grant image-puller to console SA for internal registry
- Switch SandboxTemplate to HTTP probes, remove TLS config
- Add networkPolicyManagement: Unmanaged
- Fix redeploy-agentic-agent.sh to use Containerfile
- Replace old ComponentTools/Workflow CRs with ProposalTemplate
- Add Day 0 timeline comments from CRD design doc

CRD updates:
- Add immutability CEL rules on Proposal spec fields
- Add success field to execution/verification step status
- Include component owner and admin RBAC roles

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Makefile: use full module path for agentic API (separate Go module)
- Dockerfile.dev: add replace directive for agentic API submodule
- deploy-agentic.sh: LLMProvider YAML uses discriminated union format
  (GoogleCloudVertex/AWSBedrock with per-provider config blocks)
- agentic-lib.sh: remove ProposalTemplate step (CRD removed), rename
  lightspeed-chat→lightspeed-agent, auto-select Containerfile.dev on ARM
- redeploy-agentic-agent.sh: use Containerfile.dev on ARM, remove stale
  pod restart logic (sandbox pods are ephemeral per-proposal)
- redeploy-agentic-all.sh: same stale pod cleanup removal
- Regenerated CRDs from updated agentic-operator types
- go.mod: add replace for agentic API submodule

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
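
The "auto-select Containerfile.dev on ARM" behavior could look roughly like the sketch below. It assumes the architecture is read from `uname -m`; the function name `select_containerfile` is hypothetical, and the PR's actual detection logic is not shown in this conversation.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: choose the build file based on CPU architecture.
# An explicit architecture may be passed as $1; otherwise uname -m is used.
select_containerfile() {
  local arch="${1:-$(uname -m)}"
  case "$arch" in
    # macOS reports arm64, Linux reports aarch64.
    arm64|aarch64) echo "Containerfile.dev" ;;
    *)             echo "Containerfile" ;;
  esac
}
```

Accepting the architecture as an optional argument keeps the function easy to exercise on any host, for example `select_containerfile aarch64`.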
openshift-ci Bot added the do-not-merge/work-in-progress label ("Indicates that a PR should not merge because it is a work in progress.") Apr 30, 2026
openshift-ci Bot commented Apr 30, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

openshift-ci Bot added the needs-rebase label ("Indicates a PR cannot be merged because it has merge conflicts with HEAD.") Apr 30, 2026
openshift-ci Bot commented Apr 30, 2026

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

openshift-ci Bot commented Apr 30, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign raptorsun for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

harche and others added 20 commits May 2, 2026 17:02
…, cleanup

- Move all agentic scripts from hack/ to hack/agentic/ with cleaner names
- Add hack/agentic/CLAUDE.md for auto-discovery by Claude Code
- Add 3 missing CRDs to kustomization (approvalpolicies, proposalapprovals, proposaltemplates)
- Switch from local docker builds to OpenShift BuildConfigs (on-cluster builds)
- Build agent + skills in parallel via start_build_async + wait_all_builds
- Construct minimal operator build context (only operator + agentic-operator
  repos, not entire workspace root) — reduces upload from ~500MB to ~50MB
- Drop 4 per-profile skills images (design, remediate, escalate, monitor) —
  SkillsSource.paths handles selective mounting from a single full image
- Add configurable repo paths via env vars (AGENTIC_OPERATOR_DIR, AGENT_DIR, etc.)
- Remove 5 dead functions, extract update_crds_and_rbac shared helper
- Fix --help crash on redeploy scripts, update stale path references
- Batch build status polling (1 API call vs N per cycle)
- Replace TLS secret poll loop with oc wait --for=create

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
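
The parallel agent and skills builds rely on `start_build_async` and `wait_all_builds`, whose bodies are not shown here. As a generic illustration of the same start-then-wait pattern, the sketch below uses plain background jobs and the shell's `wait` builtin. The names `start_job_async`, `wait_all_jobs`, and `JOB_PIDS` are hypothetical stand-ins, and the PR's helpers poll on-cluster build status rather than local processes.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a start-async / wait-all pattern using
# background jobs. Not the PR's implementation.
JOB_PIDS=""

# Run the given command in the background and remember its PID.
start_job_async() {
  "$@" &
  JOB_PIDS="$JOB_PIDS $!"
}

# Wait for every recorded job; return 1 if any of them failed.
wait_all_jobs() {
  local pid failed=0
  for pid in $JOB_PIDS; do
    wait "$pid" || failed=1
  done
  JOB_PIDS=""
  return "$failed"
}
```

Waiting on every PID before returning means one failed job does not leave the others running unobserved, while the aggregate exit status still reports the failure.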
- Fix gcloud SA key creation: tolerate org policy warnings, clear stale
  key files before creation
- Fix CEL validation: has() guard for omitempty denied field on
  ProposalApproval (regenerated via make manifests-agentic)
- Add revisionFeedback field to Proposal CRD
- Change default ApprovalPolicy to all-Manual
- Add JVM OOMKill demo with PrometheusRule and Proposal
- Improve deploy.sh --with-demo messaging

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…icationResult

Generated via make manifests-agentic. Adds three new CRD YAML files,
updates Proposal CRD (inline step data removed from status schema),
adds get/list/watch/create RBAC for the new resources, and registers
them in the CRD kustomization.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add EscalationResult CRD to kustomization (was missing, blocking
  proposal controller startup)
- Regenerate CRD manifests via make manifests-agentic
- Add agentic.openshift.io read RBAC to lightspeed-agent-reader
  ClusterRole so escalation agents can read prior step results
- Bump envtest from 1.27.1 to 1.32.0 to support format.dns1123Label()
  CEL validation in CRDs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Regenerate agentic CRDs to reflect result CR status subresource with
conditions, removed outcome/timestamp fields, and ApprovalPolicy
defaultOption with CEL validation. Fix deploy script temp dir cleanup
for read-only envtest binaries.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
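
The temp dir cleanup fix for read-only envtest binaries is not shown in this conversation. A common way to handle read-only files in a temporary directory is to restore write permission before deleting, as in the sketch below. The function name `cleanup_tmpdir` is hypothetical, and this is an assumed approach rather than the PR's actual change.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: remove a temp directory that may contain
# read-only files or subdirectories.
cleanup_tmpdir() {
  local dir="${1:-}"
  # Nothing to do if no directory was given or it no longer exists.
  if [ -z "$dir" ] || [ ! -d "$dir" ]; then
    return 0
  fi
  # Read-only directories block deletion of their contents for
  # non-root users, so make the tree writable first.
  chmod -R u+w "$dir" 2>/dev/null || true
  rm -rf "$dir"
}

# Example use (assumed variable name):
#   WORK_DIR="$(mktemp -d)"
#   trap 'cleanup_tmpdir "$WORK_DIR"' EXIT
```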
- Regenerate CRDs for API type changes (maxAttempts, revision, tools, etc.)
- Move maxAttempts from proposal examples to ApprovalPolicy
- Update RBAC comment for revisionFeedback

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- ProposalApproval spec reverted to optional
- Deploy script inline proposal: mountAs struct, execution agent

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Regenerate Proposal and AnalysisResult CRDs for OutputSchema on
  ProposalSpec and flexible Components on RemediationOption
- JVM OOMKill demo proposal now includes outputSchema requiring
  jvmHeapConfig and containerMemory structured components

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Regenerate Proposal and AnalysisResult CRDs for the OutputSchema to
AnalysisOutput migration. Update JVM demo in deploy script to use
analysisOutput.schema, and add a Minimal mode demo proposal alongside it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Regenerate ApprovalPolicy CRD for maxConcurrentProposals field and
AnalysisResult CRD for RemediationOption CEL validation. Set default
concurrency to 5 in the deploy script's ApprovalPolicy.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…AgenticConfig CRD

Replace the --enable-agentic CLI flag with a LightspeedAgents entry in
OLSConfig.spec.featureGates. The ProposalReconciler is always registered
but gates itself at reconcile time via an Enabled callback.

Rename ApprovalPolicy CRD to AgenticConfig — a cluster-scoped singleton
that consolidates approval policy, concurrency limits, console plugin
image, and sandbox pod config. Remove --base-template-name and
--agentic-console-image CLI flags; console image now comes from
AgenticConfig.spec.console.image, base template name is hardcoded as a
constant.

Requires companion PR in lightspeed-agentic-operator for the CRD type
changes and controller updates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Console and sandbox images are now required fields with format
validation. Approval stages have min/max items constraints.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>