Problem
Once MCP_READ_ONLY is flipped to allow writes, ai-toolkit has no further constraint on what an agent puts into Memgraph. The current write-safety surface is a single regex (is_write_query) + the MCP_READ_ONLY boolean in integrations/mcp-memgraph/src/mcp_memgraph/servers/server.py. That's binary allow/deny on the verb-set.
In the agentic Knowledge-Graph-Augmented Generation (KGAG) case — autonomous agents writing to a graph-backed memory across many sessions, often multiple agents sharing a graph — two failure modes dominate:
- Schema drift. Agents hallucinate labels and property names per call. The same concept ends up as :Person, :person, :User, :User_profile. Required properties get missed. Issue #127 ("every relationship in KG extraction is :DIRECTED") is one symptom of this class.
- Absent provenance. Writes carry no source, no extraction_method, no confidence. Downstream consumers can't distinguish a parsed API response from an LLM-extracted claim from an outright hallucination.
These compound. A graph of 10k nodes from 50 agent sessions becomes unqueryable in weeks — not because the data is wrong, but because nothing constrains how it was written. See Context Rot for the decay pattern this produces.
The toolkit already has schema introspection (get_schema, get_constraint, get_index). The missing piece is schema + provenance enforcement at write time.
Proposal
Add a Write Gate server variant to ai-toolkit's MCP plugin registry. Two invariants, enforced together:
- Schema validation: every write is matched against a registered schema before MERGE
- Computed provenance: every write stamps source, extraction_method, confidence (computed, not declared), and a gate version marker
The gate is neutral machinery. Consumers decide what their schema contains and what their confidence formula is; the gate enforces the shape and computes the outputs.
Why a new server variant, not a flag on the existing server?
Valid question. The argument for a flag: simpler, smaller surface. The argument for a variant:
- Tool surface isolation. The default server exposes run_query, a Cypher pass-through. The Write Gate exposes typed tools (write_node, write_relationship) that don't allow raw Cypher writes. Agents using the gate should not have run_query in their tool list; that defeats the enforcement. Separate servers let the operator hand out the right tool surface per agent.
- Independent configuration. Schema registry source, formula provider, error behavior: these are gate-specific config. Bolting them onto the existing server's env var surface crowds it.
- Backwards compatibility. Existing users running the current server see no change. The gate is opt-in via AVAILABLE_SERVERS.
The plugin registry explicitly anticipates this. In servers/__init__.py:
    AVAILABLE_SERVERS: Dict[str, Dict[str, Any]] = {
        "server": { ... },
        "memgraph-experimental": { ... },
        # Future servers can be added here:
        # "hygm": {
        #     "module": "mcp_memgraph.servers.hygm",
        #     ...
        # },
    }
The Write Gate registers as one more entry in that dict.
Interface spec
Three MCP tools, following the toolkit's snake_case verb-object convention.
write_node
Input:
    label: str                                  # Target node label
    merge_keys: dict[str, str|int|float|bool]   # Identity for MERGE (scalars only)
    properties: dict[str, Any]                  # Data (no protected fields allowed)
    source: str                                 # Provenance: where the claim came from
    extraction_method: str                      # Must be in FormulaProvider.allowed_extraction_methods()
    reliability: float = 0.5                    # Clamped to [0.0, 1.0] before formula
Output (success):
    {
      "status": "written",
      "label": str,                # Canonical (may differ if remapped)
      "merge_keys": dict,
      "confidence": float,         # In [0.0, 1.0]
      "write_gate_version": str,   # Semver: "MAJOR.MINOR.PATCH"
      "remapped_from": str | null  # Present iff schema remap occurred
    }
Output (error):
    {
      "status": "rejected",
      "error_code": str,           # See Error Codes table below
      "message": str,
      "details": dict              # Optional diagnostic info
    }
Emitted Cypher (reference implementation):
    MERGE (n:CanonicalLabel {merge_key1: $v1, merge_key2: $v2})
    SET n += $properties,
        n.confidence = $computed_confidence,
        n.source = $source,
        n.extraction_method = $extraction_method,
        n.write_gate_version = $gate_version,
        n.last_updated = datetime()
    // If remapped:
    SET n._schema_remap_from = $original_label
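The steps leading to that statement can be sketched as a small builder. This is a minimal illustration, not the reference implementation: build_write_node, the mk_ parameter prefix, and the identifier whitelist are all hypothetical names and choices made for the example, and it assumes schema validation and confidence computation have already happened and that merge_keys is non-empty.

```python
import re
from typing import Any, Dict, Tuple

# Gate-computed fields that callers may not set (list taken from the proposal).
PROTECTED_FIELDS = frozenset(
    {"confidence", "write_gate_version", "source", "extraction_method", "last_updated"}
)
# Assumption of this sketch: labels and property names cannot be passed as
# Cypher parameters, so they are interpolated only after a conservative check.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _safe(name: str) -> str:
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def build_write_node(
    label: str,
    merge_keys: Dict[str, Any],
    properties: Dict[str, Any],
    source: str,
    extraction_method: str,
    confidence: float,
    gate_version: str,
) -> Tuple[str, Dict[str, Any]]:
    """Return (cypher, params) for a node write that already passed schema checks."""
    clashes = PROTECTED_FIELDS.intersection(properties)
    if clashes:
        raise ValueError(f"SCHEMA_PROTECTED_FIELD: {sorted(clashes)}")
    # Merge-key values travel as parameters; only the key names are interpolated.
    key_map = ", ".join(f"{_safe(k)}: $mk_{k}" for k in merge_keys)
    cypher = (
        f"MERGE (n:{_safe(label)} {{{key_map}}}) "
        "SET n += $properties, "
        "n.confidence = $computed_confidence, "
        "n.source = $source, "
        "n.extraction_method = $extraction_method, "
        "n.write_gate_version = $gate_version, "
        "n.last_updated = datetime()"
    )
    params: Dict[str, Any] = {f"mk_{k}": v for k, v in merge_keys.items()}
    params.update(
        properties=dict(properties),
        computed_confidence=confidence,
        source=source,
        extraction_method=extraction_method,
        gate_version=gate_version,
    )
    return cypher, params
```

Keeping values in parameters and whitelisting the interpolated names is one way to keep the typed tool from becoming a Cypher injection path; a real implementation may choose a different escaping strategy.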
write_relationship
Input:
    type: str                                   # Relationship type (validated against schema)
    from_label: str
    from_keys: dict[str, str|int|float|bool]    # Scalars only
    to_label: str
    to_keys: dict[str, str|int|float|bool]      # Scalars only
    properties: dict[str, Any] = {}             # No protected fields
    source: str
    extraction_method: str                      # Must be in FormulaProvider.allowed_extraction_methods()
    reliability: float = 0.5                    # Clamped to [0.0, 1.0] before formula
    endpoint_policy: str = "fail_if_missing"    # or "merge_endpoints"
Output: { "status": "written" | "rejected", ... }
Endpoint resolution policy (addresses the :DIRECTED everywhere pattern from #127):
- fail_if_missing (default): If either endpoint node doesn't exist, reject with ENDPOINT_NOT_FOUND. The agent must write the node first. This prevents silent creation of stub nodes.
- merge_endpoints: If either endpoint is missing, MERGE it by its keys with a _stub=true flag. Opt-in for ingestion-style workloads that have ordering constraints.
refresh_schema_cache
No arguments. Reloads the registered schema from its source without restarting the server. Returns count of labels loaded.
Schema registry interface
Pluggable. The gate ships one reference implementation (graph-backed :Schema nodes); consumers can provide others by implementing:
    from dataclasses import dataclass, field
    from typing import Protocol


    @dataclass(frozen=True)
    class SchemaEntry:
        label: str  # Canonical label
        required_properties: list[str] = field(default_factory=list)
        remaps_from: list[str] = field(default_factory=list)


    class SchemaRegistry(Protocol):
        def get_entry(self, label: str) -> SchemaEntry | None: ...
        def all_canonical_labels(self) -> list[str]: ...
        def remap_target(self, label: str) -> str | None: ...
        def fallback_label(self) -> str | None: ...  # Used when policy=remap and no match
Refresh semantics: refresh_schema_cache loads a new snapshot then swaps the in-memory pointer atomically. In-flight writes complete against the pre-refresh snapshot.
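A minimal sketch of those refresh semantics, assuming CPython, where rebinding an attribute is a single reference swap. SchemaCache, the loader callable, and the simplified snapshot type are hypothetical names for the example; the {"loaded": N} return shape follows test row 10.

```python
import threading
from typing import Callable, Dict, List

# Simplified snapshot for the sketch: label -> required property names.
Snapshot = Dict[str, List[str]]


class SchemaCache:
    def __init__(self, loader: Callable[[], Snapshot]) -> None:
        self._loader = loader
        self._refresh_lock = threading.Lock()  # serializes concurrent refreshes
        self._snapshot: Snapshot = loader()

    def snapshot(self) -> Snapshot:
        # A write grabs this reference once and uses it for its whole lifetime,
        # so a refresh that lands mid-write does not change what it validates against.
        return self._snapshot

    def refresh(self) -> dict:
        with self._refresh_lock:
            fresh = self._loader()  # if this raises, the old snapshot stays live
            self._snapshot = fresh  # swap the pointer; never mutate in place
        return {"loaded": len(fresh)}
```

Building the new snapshot fully before the swap is what keeps a failed load (the SCHEMA_SOURCE_UNAVAILABLE case) from leaving the gate with a half-populated schema.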
Unknown-label policy: controlled by WRITE_GATE_UNKNOWN_LABEL_POLICY env var, values remap (default) or reject.
Graph-backed reference implementation reads:
    MATCH (s:Schema {allowed: true})
    RETURN s.name AS label,
           s.required_properties AS required_properties,
           s.absorbs AS remaps_from
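As an illustration of how rows shaped like that query's RETURN columns could back the registry protocol and the unknown-label policy, here is a small in-memory sketch. InMemoryRegistry and resolve_label are hypothetical names, and rejecting in remap-mode when no fallback is configured is an assumption, since the proposal does not specify that case.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Iterable, Mapping


@dataclass(frozen=True)
class SchemaEntry:
    label: str
    required_properties: list[str] = field(default_factory=list)
    remaps_from: list[str] = field(default_factory=list)


class InMemoryRegistry:
    """Registry built from rows with label / required_properties / remaps_from."""

    def __init__(self, rows: Iterable[Mapping[str, Any]], fallback: str | None = None):
        self._entries = {
            r["label"]: SchemaEntry(
                r["label"],
                list(r.get("required_properties") or []),  # property may be null
                list(r.get("remaps_from") or []),
            )
            for r in rows
        }
        # Reverse index: drifted label -> canonical label.
        self._remap = {
            alias: e.label for e in self._entries.values() for alias in e.remaps_from
        }
        self._fallback = fallback

    def get_entry(self, label: str) -> SchemaEntry | None:
        return self._entries.get(label)

    def all_canonical_labels(self) -> list[str]:
        return sorted(self._entries)

    def remap_target(self, label: str) -> str | None:
        return self._remap.get(label)

    def fallback_label(self) -> str | None:
        return self._fallback


def resolve_label(registry, label: str, policy: str = "remap"):
    """Return (canonical_label, remapped_from); (None, None) means SCHEMA_UNKNOWN_LABEL."""
    if registry.get_entry(label) is not None:
        return label, None
    target = registry.remap_target(label)
    if target is not None:
        return target, label  # registered remaps apply in both modes
    if policy == "remap" and registry.fallback_label() is not None:
        return registry.fallback_label(), label
    return None, None
```

The second element of the result is what a caller would store as _schema_remap_from and return as remapped_from.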
Confidence formula provider
Also pluggable. The formula provider is responsible for two things: computing confidence and declaring which extraction_method values it accepts.
    class FormulaProvider(Protocol):
        def allowed_extraction_methods(self) -> set[str]: ...
        def compute(self, reliability: float, extraction_method: str) -> float: ...
The gate ships a deliberately simple default so the primitive doesn't dictate ideology:
    class DefaultFormulaProvider:
        WEIGHTS = {
            "api": 1.0,      # Live API / CLI / deterministic machine output
            "parsed": 0.85,  # Structured doc (YAML, JSON, HCL, Markdown frontmatter)
            "manual": 0.75,  # Human explicitly stated; capped below verified sources
            "llm": 0.60,     # LLM-extracted from unstructured text
        }

        def allowed_extraction_methods(self) -> set[str]:
            return set(self.WEIGHTS.keys())

        def compute(self, reliability: float, extraction_method: str) -> float:
            return reliability * self.WEIGHTS[extraction_method]
Gate-enforced contract around the provider:
- reliability is clamped to [0.0, 1.0] before the provider is called
- extraction_method must be in provider.allowed_extraction_methods() or the gate rejects with INVALID_EXTRACTION_METHOD
- The provider's returned value must be in [0.0, 1.0]; out-of-range returns reject with FORMULA_INVALID_OUTPUT
Consumers who want richer models plug in their own provider. See "Appendix: Example formula providers" for patterns other implementations have used.
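The gate-enforced contract can be sketched as a thin wrapper around any provider. gated_confidence and its (value, error_code) return shape are hypothetical; the default provider is repeated here only so the block runs on its own.

```python
from typing import Optional, Tuple


class DefaultFormulaProvider:
    WEIGHTS = {"api": 1.0, "parsed": 0.85, "manual": 0.75, "llm": 0.60}

    def allowed_extraction_methods(self) -> set:
        return set(self.WEIGHTS)

    def compute(self, reliability: float, extraction_method: str) -> float:
        return reliability * self.WEIGHTS[extraction_method]


def gated_confidence(
    provider, reliability: float, extraction_method: str
) -> Tuple[Optional[float], Optional[str]]:
    """Return (confidence, None) on success or (None, error_code) on rejection."""
    if extraction_method not in provider.allowed_extraction_methods():
        return None, "INVALID_EXTRACTION_METHOD"
    clamped = min(1.0, max(0.0, reliability))  # clamp before the provider sees it
    value = provider.compute(clamped, extraction_method)
    # The chained comparison is also False for NaN, so NaN is rejected too.
    if not (0.0 <= value <= 1.0):
        return None, "FORMULA_INVALID_OUTPUT"
    return value, None
```

Because the range check runs on the provider's output rather than trusting it, a buggy custom provider fails the write instead of stamping an out-of-range confidence onto the node.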
Protected fields — the gate refuses writes where properties contains any of: confidence, write_gate_version, source, extraction_method, last_updated. These are gate-computed; agents declare inputs, gate produces outputs.
Error codes
| Code | Meaning |
| --- | --- |
| SCHEMA_UNKNOWN_LABEL | Label not in registry and no remap target (reject-mode only) |
| SCHEMA_MISSING_REQUIRED_PROPERTY | Required property absent from properties + merge_keys |
| SCHEMA_PROTECTED_FIELD | Agent attempted to set a gate-computed field |
| SCHEMA_SOURCE_UNAVAILABLE | Registry load failed (graph unreachable, file missing, etc.) |
| SCHEMA_TYPE_MISMATCH | Property type doesn't match declared schema type |
| ENDPOINT_NOT_FOUND | write_relationship called with fail_if_missing and endpoint absent |
| INVALID_EXTRACTION_METHOD | Value not in FormulaProvider.allowed_extraction_methods() |
| FORMULA_INVALID_OUTPUT | Provider returned a value outside [0.0, 1.0] |
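One way to keep these codes and the error output shape consistent across tools is a shared enum plus a small builder. ErrorCode and rejected are hypothetical names for this sketch; the codes and response keys are the ones listed in this proposal.

```python
from enum import Enum
from typing import Any, Dict, Optional


class ErrorCode(str, Enum):
    SCHEMA_UNKNOWN_LABEL = "SCHEMA_UNKNOWN_LABEL"
    SCHEMA_MISSING_REQUIRED_PROPERTY = "SCHEMA_MISSING_REQUIRED_PROPERTY"
    SCHEMA_PROTECTED_FIELD = "SCHEMA_PROTECTED_FIELD"
    SCHEMA_SOURCE_UNAVAILABLE = "SCHEMA_SOURCE_UNAVAILABLE"
    SCHEMA_TYPE_MISMATCH = "SCHEMA_TYPE_MISMATCH"
    ENDPOINT_NOT_FOUND = "ENDPOINT_NOT_FOUND"
    INVALID_EXTRACTION_METHOD = "INVALID_EXTRACTION_METHOD"
    FORMULA_INVALID_OUTPUT = "FORMULA_INVALID_OUTPUT"


def rejected(
    code: ErrorCode, message: str, details: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Build the error output; details is optional and omitted when empty."""
    body: Dict[str, Any] = {
        "status": "rejected",
        "error_code": code.value,
        "message": message,
    }
    if details:
        body["details"] = details
    return body
```

For example, the missing-name case from the test matrix would be rejected(ErrorCode.SCHEMA_MISSING_REQUIRED_PROPERTY, "missing required property", {"missing": ["name"]}), where the details key name is an assumption.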
Acceptance criteria
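v1 is complete when:
- An entry in AVAILABLE_SERVERS points to servers/write_gate.py
- write_node, write_relationship, and refresh_schema_cache tools are registered
- A _schema_remap_from breadcrumb is written on remap (remap-mode is the default via WRITE_GATE_UNKNOWN_LABEL_POLICY)
- endpoint_policy=fail_if_missing is the default for write_relationship (covered by test row #8)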
Test matrix
Minimum set, covering the invariants:
| # | Input | Expected |
| --- | --- | --- |
| 1 | write_node(label="Person", merge_keys={"name":"Alice"}, properties={"age": 30}, source="test", extraction_method="manual", reliability=0.9) | status=written, confidence=0.675 (0.9 × 0.75), write_gate_version set |
| 2 | Same, with properties={"confidence": 1.0} | SCHEMA_PROTECTED_FIELD |
| 3 | write_node(label=":person", ...) with schema registering :Person as canonical for :person | status=written, label="Person", remapped_from=":person", _schema_remap_from set on node |
| 4 | write_node(label="ZZZNonexistent", ...) in remap-mode | status=written, remapped to fallback (configurable); _schema_remap_from=":ZZZNonexistent" |
| 5 | Same, in reject-mode | SCHEMA_UNKNOWN_LABEL |
| 6 | write_node(label="Person", merge_keys={}, properties={}) where schema requires name | SCHEMA_MISSING_REQUIRED_PROPERTY, details lists name |
| 7 | write_relationship(type="DIRECTED", ...) where :DIRECTED is not a registered type | SCHEMA_UNKNOWN_LABEL (regression for #127) |
| 8 | write_relationship with missing endpoint, endpoint_policy=fail_if_missing | ENDPOINT_NOT_FOUND |
| 9 | Same, with endpoint_policy=merge_endpoints | status=written, endpoint created with _stub=true |
| 10 | refresh_schema_cache() after adding :Event to registry | {"loaded": <N+1>}, subsequent write_node(label="Event", ...) succeeds |
Follow-up (out of scope for v1)
Conflict detection (comparing existing vs incoming confidence to flag silent overwrites) and dedup / fuzzy entity resolution are the natural Phase 2 and Phase 3 additions; I'm willing to author both as follow-up issues once v1 lands. Cross-label entity identity, temporal decay in the comparator, and a MAGE in-process deployment variant are further-out options that should be discussed if there's community pull.
Related art and standards
- W3C PROV-O: The PROV Ontology — the canonical vocabulary for provenance on the web. Recommended as the reference point for any provenance-field naming choices an implementation makes.
- Cognee — application-layer AI memory with knowledge-engine self-improvement. Complementary: a Memgraph-native Write Gate is the missing substrate for patterns like Cognee's.
- Issue #127 — symptom of schema drift; the Write Gate prevents the class.
Appendix: Example formula providers (informational)
The v1 default is intentionally simple. Domain-specific grading systems can slot in as alternative FormulaProvider implementations without changing the gate's interface:
- Admiralty Code — NATO AJP-2.1 two-dimensional grading (source reliability A-F × information credibility 1-6), widely used in Cyber Threat Intelligence
- Flat declaration — agent declares a 0-1 float, no transformation; simplest option for trusted agent pipelines
- STIX 2.1 Confidence Scales — OASIS-standardized confidence values with mappings across several qualitative scales (DNI, Admiralty, WEP)
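As a concrete instance of the simplest pattern in that list, a flat-declaration provider might look like the sketch below. FlatDeclarationProvider is a hypothetical name, and letting the operator supply the accepted extraction_method values is an assumption of the example.

```python
from typing import Iterable, Set


class FlatDeclarationProvider:
    """Passes the agent-declared reliability through unchanged."""

    def __init__(self, methods: Iterable[str]) -> None:
        self._methods = set(methods)

    def allowed_extraction_methods(self) -> Set[str]:
        return set(self._methods)  # copy, so callers cannot mutate the config

    def compute(self, reliability: float, extraction_method: str) -> float:
        # No weighting: the gate has already clamped reliability to [0.0, 1.0].
        return reliability
```

It satisfies the same two-method interface as the default provider, so swapping it in changes only the computed confidence and leaves the rest of the write path as it was.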
Reference implementation
I have a working implementation of this pattern, from which I'll extract the v1 subset. Happy to share it offline with maintainers if useful during review.