Add base-model completion pipeline behind a default-off flag #495
First step of moving the Open Source (llama) path off instruction-tuned models onto base-model continuation. Adds BaseCompletionPromptRenderer: no task preamble, no standalone labels, persona/style/context folded into a conditioning preface, and the caret prefix emitted last with trailing whitespace trimmed. SuggestionRequestFactory routes to it when the new useBaseCompletionPipeline flag is on and the Open Source engine is selected; otherwise the existing LlamaPromptRenderer is used unchanged. The flag defaults off (read from the cotabbyBaseCompletionPipelineEnabled default, no UI), so default behavior is byte-for-byte unchanged. The base path reuses the already-merged generation-time constraints (single-line masking, mid-word continuation). Regenerated Cotabby.xcodeproj for the two new files.
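The prefix-last assembly described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `renderBasePrompt` and its `prefaceLines` parameter are hypothetical names, and the layout (preface lines joined with newlines, a blank line, then the trimmed prefix) is taken from the review flowchart further down.

```swift
/// Hypothetical stand-in for the renderer's final assembly step (an assumption,
/// not the actual BaseCompletionPromptRenderer implementation).
func renderBasePrompt(prefaceLines: [String], prefixText: String) -> String {
    // Strip trailing whitespace and newlines only, so a mid-word prefix
    // such as "doing my aft" is left exactly as typed.
    var trimmedPrefix = Substring(prefixText)
    while let last = trimmedPrefix.last, last.isWhitespace {
        trimmedPrefix.removeLast()
    }

    // Join whatever conditioning lines were supplied; empty ones are skipped.
    let preface = prefaceLines.filter { !$0.isEmpty }.joined(separator: "\n")

    // With no conditioning text, the prompt is just the prefix.
    guard !preface.isEmpty else { return String(trimmedPrefix) }

    // Preface, blank line, then the prefix as the last bytes of the prompt.
    return preface + "\n\n" + String(trimmedPrefix)
}
```

Keeping the prefix last means the model's next token continues the user's text directly rather than any framing sentence.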
| var sentence = "The following is text" | ||
| if !name.isEmpty { | ||
| sentence += " written by \(name)" | ||
| } | ||
| if !rules.isEmpty { | ||
| sentence += " in a \(rules.joined(separator: ", ")) style" | ||
| } | ||
| sentence += "." | ||
| if !language.isEmpty { | ||
| // `languageInstruction` is already a soft directive sentence; append it verbatim. | ||
| sentence += " \(language)" | ||
| } | ||
| return sentence |
Language-only framing produces a vacuous sentence. When userName and customRules are both empty but languageInstruction is supplied, the sentence becomes "The following is text. Write in English." The first clause carries no conditioning signal, and the imperative "Write in English." looks out of place in what is meant to read as a document description. Either skip the wrapper clause entirely for the language-only case, or return language on its own.
| var sentence = "The following is text" | |
| if !name.isEmpty { | |
| sentence += " written by \(name)" | |
| } | |
| if !rules.isEmpty { | |
| sentence += " in a \(rules.joined(separator: ", ")) style" | |
| } | |
| sentence += "." | |
| if !language.isEmpty { | |
| // `languageInstruction` is already a soft directive sentence; append it verbatim. | |
| sentence += " \(language)" | |
| } | |
| return sentence | |
| // If we only have a language directive and no other framing, return it directly — wrapping | |
| // it in "The following is text." adds nothing and makes the preface read oddly. | |
| if name.isEmpty && rules.isEmpty { | |
| return language.isEmpty ? nil : language | |
| } | |
| var sentence = "The following is text" | |
| if !name.isEmpty { | |
| sentence += " written by \(name)" | |
| } | |
| if !rules.isEmpty { | |
| sentence += " in a \(rules.joined(separator: ", ")) style" | |
| } | |
| sentence += "." | |
| if !language.isEmpty { | |
| // `languageInstruction` is already a soft directive sentence; append it verbatim. | |
| sentence += " \(language)" | |
| } | |
| return sentence |
| func test_trailingWhitespaceTrimmedButMidWordPreserved() { | ||
| XCTAssertEqual( | ||
| BaseCompletionPromptRenderer.prompt(prefixText: "doing my aft", applicationName: "X", userName: nil), | ||
| "doing my aft" | ||
| ) | ||
| XCTAssertEqual( | ||
| BaseCompletionPromptRenderer.prompt(prefixText: "see you \n", applicationName: "X", userName: nil), | ||
| "see you" | ||
| ) | ||
| } |
Missing test for the language-only framing path. The seven existing tests cover userName, customRules, and combinations, but there is no case where languageInstruction is non-nil/non-empty while userName is nil and customRules is empty. That path hits authorFraming with a non-empty language but empty name and rules, producing "The following is text. Write in English.", a distinct output shape. A test for this case would also pin down the fix suggested for the renderer.
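A self-contained sketch of what the missing case might check, assuming the early-return fix from the renderer suggestion is applied. The free-function form of `authorFraming`, its parameter list, and the optional return type are assumptions made for illustration; in the real target this would presumably be an XCTest case against `BaseCompletionPromptRenderer`.

```swift
/// Hypothetical standalone version of `authorFraming` with the suggested
/// language-only early return. Signature and return type are assumed.
func authorFraming(name: String, rules: [String], language: String) -> String? {
    // Language-only (or nothing at all): skip the "The following is text" wrapper.
    if name.isEmpty && rules.isEmpty {
        return language.isEmpty ? nil : language
    }
    var sentence = "The following is text"
    if !name.isEmpty {
        sentence += " written by \(name)"
    }
    if !rules.isEmpty {
        sentence += " in a \(rules.joined(separator: ", ")) style"
    }
    sentence += "."
    if !language.isEmpty {
        // The language directive is already a full sentence; append it verbatim.
        sentence += " \(language)"
    }
    return sentence
}

// The language-only path returns the directive alone, with no wrapper clause.
assert(authorFraming(name: "", rules: [], language: "Write in English.") == "Write in English.")
// With no inputs at all there is no framing line.
assert(authorFraming(name: "", rules: [], language: "") == nil)
```

Without the early return, the first assertion would see "The following is text. Write in English." instead, which is the output shape the review comment flags.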
| persistSelectedWordCountPreset(resolvedWordCountPreset) | ||
| persistClipboardContextEnabled(resolvedClipboardContextEnabled) | ||
| persistFastModeEnabled(resolvedFastModeEnabled) | ||
| userDefaults.set(resolvedBaseCompletionPipelineEnabled, forKey: Self.baseCompletionPipelineEnabledDefaultsKey) |
Persistence for the new flag is written inline in persistSettings() while every other setting in this function delegates to a dedicated persist…() helper (e.g. persistFastModeEnabled). Keeping the same pattern makes the function easier to scan and the flag easier to move to a dedicated resetToDefaults-style call later.
| userDefaults.set(resolvedBaseCompletionPipelineEnabled, forKey: Self.baseCompletionPipelineEnabledDefaultsKey) | |
| persistBaseCompletionPipelineEnabled(resolvedBaseCompletionPipelineEnabled) |
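The helper itself is not shown in the suggestion. A minimal sketch of what it could look like, assuming it mirrors the inline call it replaces: `SettingsPersistenceSketch` and `storedBaseCompletionPipelineEnabled` are hypothetical names, and only the defaults key string comes from the PR description.

```swift
import Foundation

/// Hypothetical, trimmed-down stand-in for the settings model; the real
/// SuggestionSettingsModel has many more settings and helpers.
final class SettingsPersistenceSketch {
    // Key name as given in the PR description.
    static let baseCompletionPipelineEnabledDefaultsKey = "cotabbyBaseCompletionPipelineEnabled"

    private let userDefaults: UserDefaults

    init(userDefaults: UserDefaults) {
        self.userDefaults = userDefaults
    }

    /// Dedicated helper, matching the persist…() pattern used for other settings.
    func persistBaseCompletionPipelineEnabled(_ isEnabled: Bool) {
        userDefaults.set(isEnabled, forKey: Self.baseCompletionPipelineEnabledDefaultsKey)
    }

    /// Reads the stored flag; `bool(forKey:)` yields false when the key is unset,
    /// which matches the default-off behavior.
    var storedBaseCompletionPipelineEnabled: Bool {
        userDefaults.bool(forKey: Self.baseCompletionPipelineEnabledDefaultsKey)
    }
}
```

Injecting `UserDefaults` keeps the helper testable against a throwaway suite instead of the app's real defaults.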
Summary
First step of transitioning the Open Source (llama) path off instruction-tuned models onto base-model continuation. Instruct checkpoints fed Cotabby's instruction-blob prompt tend to leak assistant-style replies and echo scaffolding; a base model continues the user's text natively. This lands the base-model prompt machinery behind a default-off flag so it can be validated against a base model before the default is flipped.
When useBaseCompletionPipeline is on and the Open Source engine is selected, SuggestionRequestFactory renders via the new BaseCompletionPromptRenderer: no task preamble, no standalone Label: lines, persona/style/language/context folded into a short conditioning preface (a base model conditions on description, it does not obey instructions), and the caret prefix emitted last with trailing whitespace trimmed so generation starts at a clean word boundary. The path reuses the already-merged generation-time constraints (single-line masking, mid-word continuation).
Validation
New BaseCompletionPromptRendererTests adds 7 pure-function cases (prefix-last invariant, no-preamble/no-labels, trailing-trim with mid-word preserved, persona conditioning, context-only-when-supplied) and compiles under the test target.
Local test execution is blocked by a Team ID / code-signing mismatch loading the app-hosted CotabbyTests.xctest bundle into the signed host (both CODE_SIGNING_ALLOWED=NO and normal signing hit it). Per the repo's testing note, reporting it and relying on build-for-testing; CI runs the suite with proper signing.
Linked issues
None — experimental, flag-gated.
Risk / rollout notes
useBaseCompletionPipeline defaults false. With it off, SuggestionRequestFactory uses the existing LlamaPromptRenderer and behavior is byte-for-byte unchanged. The new renderer and its tests are inert until the flag is set (defaults write <bundle id> cotabbyBaseCompletionPipelineEnabled -bool YES).
xcodegen generate (additive 8-line diff, no unrelated churn).
cotabbyinference needs nothing new for this.
llm-io.jsonl prefixes, then flip the default.
🤖 Generated with Claude Code
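The routing rule in these notes (flag on and Open Source engine selected, otherwise the existing renderer) could be expressed as a small pure function. This is only a sketch: `Engine`, `PromptRendererChoice`, and `rendererChoice` are hypothetical names, `.other` stands in for whatever other engines the app has, and only `.llamaOpenSource` is taken from the review flowchart.

```swift
/// Hypothetical engine enum; only `.llamaOpenSource` appears in the PR.
enum Engine {
    case llamaOpenSource
    case other
}

/// Hypothetical tag for which renderer the factory would call.
enum PromptRendererChoice {
    case baseCompletion   // BaseCompletionPromptRenderer
    case existingLlama    // LlamaPromptRenderer, the unchanged default path
}

/// Both conditions must hold to take the new path; anything else falls
/// through to the existing renderer, so the default-off flag changes nothing.
func rendererChoice(useBaseCompletionPipeline: Bool, selectedEngine: Engine) -> PromptRendererChoice {
    if useBaseCompletionPipeline && selectedEngine == .llamaOpenSource {
        return .baseCompletion
    }
    return .existingLlama
}
```

Isolating the decision like this would make the "flag off means unchanged behavior" claim checkable without building a full request.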
Greptile Summary
This PR introduces a BaseCompletionPromptRenderer behind a default-off useBaseCompletionPipeline flag, routing the Open Source (llama) engine through a bare-continuation prompt instead of the existing instruction-blob format when the flag is enabled.
BaseCompletionPromptRenderer) omits all instruction scaffolding, folds persona/style/language into a conditioning preface, and guarantees prefixText is the final bytes of the prompt with trailing whitespace trimmed.
SuggestionSettingsModel → SuggestionSettingsSnapshot → SuggestionRequestFactory, and wires it into the CombineLatest4 reactive chain so snapshot emissions reflect live flag state.
cotabbyBaseCompletionPipelineEnabled default.
Confidence Score: 4/5
Safe to merge — the new renderer is completely inert by default and the existing Llama path is untouched.
The change is well-scoped behind a default-off flag and the existing path is provably unchanged. The only rough edges are in the new renderer itself: the language-only framing sentence is weak conditioning, a test case for that path is missing, and the persistence helper breaks the established pattern in the settings model. None of these affect users until the flag is deliberately enabled.
BaseCompletionPromptRenderer.swift — the authorFraming language-only case; SuggestionSettingsModel.swift — inline persistence vs dedicated helper pattern.
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[SuggestionRequestFactory.buildRequest] --> B{useBaseCompletionPipeline\nAND selectedEngine == .llamaOpenSource?}
    B -- Yes --> C[BaseCompletionPromptRenderer.prompt]
    B -- No --> D[LlamaPromptRenderer.prompt]
    C --> E1[authorFraming\npersona / style / language]
    C --> E2[extendedContext\nnotes preface]
    C --> E3[visualContextSummary\nnearby on screen]
    C --> E4[clipboardContext\non the clipboard]
    E1 --> F[preface joined with newlines]
    E2 --> F
    E3 --> F
    E4 --> F
    F --> G[preface + blank line + trimmedPrefix]
    G --> H[SuggestionRequest]
    D --> H
Reviews (1): Last reviewed commit: "Add base-model completion pipeline behin..."