Add /selftest.extension core extension to test other extensions#1758
dhilipkumars wants to merge 17 commits into github:main
Pull request overview
Adds an experimental sandbox under tests/extension-commands/ intended to evaluate whether LLM/agent workflows can discover extension command definitions and “execute” their mapped markdown instructions.
Changes:
- Introduces a mock Python entrypoint (`main.py`) that prints timestamped outputs for `--lint` and `--deploy`.
- Adds a `.specify/` sandbox containing command markdown files (`lint.md`, `deploy.md`) and an `extensions.yml` mapping.
- Adds `TESTING.md` with a copy/paste prompt for running the exercise in an LLM chat.
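As a rough illustration of the mock entrypoint described above, the sketch below shows one way a script could accept `--lint` or `--deploy` and emit a timestamped line. The PR's actual `main.py` is not shown on this page, so the function names (`build_parser`, `run`), the `now` parameter, and the output format are all assumptions made for the example; only the two flags and the use of an argparse mutually exclusive group come from the PR itself.

```python
import argparse
from datetime import datetime


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts exactly one of --lint or --deploy."""
    parser = argparse.ArgumentParser(description="Mock command runner (illustrative only).")
    # A required mutually exclusive group rejects both "no flag" and "both flags".
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--lint", action="store_true", help="Pretend to run the linter.")
    group.add_argument("--deploy", action="store_true", help="Pretend to run a deploy.")
    return parser


def run(argv, now=None):
    """Return the timestamped line for the selected mock action.

    `now` is a hypothetical hook that lets callers pin the timestamp.
    """
    args = build_parser().parse_args(argv)
    stamp = (now or datetime.now()).isoformat(timespec="seconds")
    action = "lint" if args.lint else "deploy"
    return f"[{stamp}] mock {action} completed"
```

Calling `print(run(["--lint"]))` would print a single line stamped with the current time; passing both flags makes argparse exit with a usage error.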
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| tests/extension-commands/main.py | Mock CLI script for lint/deploy command execution output |
| tests/extension-commands/TESTING.md | Manual LLM evaluation instructions and expected terminal-style output |
| tests/extension-commands/.specify/extensions.yml | Declares command mappings for the sandbox |
| tests/extension-commands/.specify/lint.md | Markdown “command file” instructing how to run the mock linter |
| tests/extension-commands/.specify/deploy.md | Markdown “command file” instructing how to run the mock deploy |
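To make the idea of a command mapping concrete, here is a small sketch of resolving a command name to its markdown command file. The schema of `extensions.yml` is not shown on this page, so the `COMMANDS` dictionary below is a hypothetical stand-in for whatever the file declares, and `resolve_command` is a name invented for this example.

```python
from pathlib import Path

# Hypothetical stand-in for the mapping declared in extensions.yml;
# the real schema is not shown in this PR page.
COMMANDS = {"lint": "lint.md", "deploy": "deploy.md"}


def resolve_command(name: str, root: Path = Path(".specify")) -> Path:
    """Return the path of the markdown command file mapped to `name`."""
    try:
        return root / COMMANDS[name]
    except KeyError:
        # Surface unknown commands explicitly instead of guessing a file name.
        raise ValueError(f"unknown command: {name}") from None
```

An agent following this scheme would read the resolved file and carry out the instructions it contains.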
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
@copilot code review
Wild idea, maybe as an extension with its own command /selftest "extension"?
Interesting. Where do you think such a
Probably in the core repo under extensions/selftest?
perfect let me re-work my PR
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
…ructure and argparse mutually exclusive groups
…ual tests sandbox
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
mnriem left a comment
Can you address the Copilot feedback where applicable? If not applicable, please explain why.
@mnriem Sure, on it. I have made some changes to address your previous comment and was testing them locally last night. I should be able to get the changes in tonight or tomorrow.
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
Yes, we do have that already as apart from that I have addressed all of Copilot's review comments and updated
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
@dhilipkumars We probably should remove it from catalog.json then, and rely on the --dev mechanism for the self-test extension? Thoughts?
Happy to move it out if you think that's the right thing to do, but I feel this is a good candidate for the core catalog. If not this, which extensions do you have in mind for the core catalog?
OK, you are right. Let's keep it in the catalog.
Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.
@mnriem looks like this is ready to be merged.
This PR adds an experimental testing sandbox under `tests/extension-commands` to evaluate how LLMs/Agents parse extension specifications and execute their mapped commands.
Sample Output from Gemini LLM
In my Gemini CLI, if I enter the prompt below
it outputs the following.
AI Usage disclosure: Used Antigravity to build this out.
Future Goal: Currently these tests have to be triggered manually, but at some point we can build a workflow for Copilot to run these `/selftests`.