105 changes: 103 additions & 2 deletions ARCHITECTURE.md
@@ -140,6 +140,101 @@ These seams are documented as comments in the relevant `.ttl` files.

---

## Interoperability: Flexo MMS and OpenMBEE

Because knowledgecomplex stores all data as RDF and enforces constraints via standard W3C technologies (OWL, SHACL, SPARQL), it is natively compatible with [Flexo MMS](https://github.com/Open-MBEE/flexo-mms-deployment) — the Model Management System developed by the [OpenMBEE](https://www.openmbee.org/) community.

### Why the fit is natural

Flexo MMS is a version-controlled model repository that speaks RDF natively. A KC instance graph is already a valid RDF dataset, so the integration path is direct:

| KC concept | MMS equivalent | Notes |
|---|---|---|
| `kc:Complex` (instance graph) | MMS model/branch | A KC export is a self-contained RDF graph that can be committed as an MMS model revision |
| `kc:boundedBy`, `kc:hasElement` | MMS element relationships | Topological structure is expressed as standard RDF triples |
| SHACL shapes (`kc_core_shapes.ttl` + user shapes) | MMS validation profiles | Shapes can be registered in MMS to enforce KC constraints on committed models |
| `kc:uri` | MMS element cross-references | Provides traceability from KC elements to external artifacts (files, documents, URIs) |
| JSON-LD export (`dump_graph(format="json-ld")`) | MMS ingest format | JSON-LD is the primary API format for Flexo MMS |

### Integration patterns

**Push to MMS:** Export a KC instance via `kc.export()` or `dump_graph(format="json-ld")`, then commit to a Flexo MMS repository via its REST API. The OWL ontology and SHACL shapes can be committed alongside the instance data, enabling MMS-side validation.

**Pull from MMS:** Retrieve a model revision as JSON-LD from Flexo MMS, then load it into a KC instance via `load_graph(kc, "model.jsonld")`. The KC's SHACL verification (`kc.verify()`) ensures the imported data satisfies all topological and ontological constraints.
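
The push path can be sketched as a pure helper that assembles the HTTP request. The route shape, repository URL, and headers below are assumptions about a typical Flexo MMS deployment, not its documented API — in real use, `kc.dump_graph(format="json-ld")` supplies the payload:

```python
# Sketch only: the MMS route and headers are assumptions, not the documented
# Flexo MMS API. The payload would come from kc.dump_graph(format="json-ld").
def mms_commit_request(repo_url: str, branch: str, jsonld_payload: str) -> dict:
    """Describe an HTTP request that commits one model revision."""
    return {
        "method": "PUT",
        "url": f"{repo_url}/branches/{branch}/graph",  # hypothetical route
        "headers": {"Content-Type": "application/ld+json"},
        "body": jsonld_payload,
    }

req = mms_commit_request(
    "https://mms.example.org/orgs/acme/repos/kc-demo",  # hypothetical repo
    "main",
    '{"@graph": []}',  # stand-in for kc.dump_graph(format="json-ld")
)
```

The pull path is symmetric: fetch the branch graph as JSON-LD, write it to a file, then `load_graph(kc, "model.jsonld")` followed by `kc.verify()`.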

**Version control:** MMS provides branching, diffing, and merge capabilities at the RDF triple level. KC's `ComplexDiff` and `ComplexSequence` classes complement this by providing simplicial-complex-aware diffing (element-level adds/removes rather than triple-level changes).

### What KC adds beyond MMS

Flexo MMS manages RDF models generically — it stores, versions, and queries them but does not enforce simplicial complex structure. KC adds the topological layer: boundary-closure, closed-triangle constraints, typed simplicial hierarchy, and algebraic topology computations (Betti numbers, Hodge decomposition). Together, MMS provides the model management infrastructure and KC provides the mathematical structure.

### Reference

OpenMBEE (Open Model-Based Engineering Environment) is an open-source community developing tools for model-based systems engineering. Flexo MMS is its core model management system. See [openmbee.org](https://www.openmbee.org/) and [github.com/Open-MBEE](https://github.com/Open-MBEE).

---

## Deployment Architecture

The internal design described above (2x2 map, component layers, static resources) is the library's foundation. In practice, a knowledge complex is deployed through a stack of five layers, each building on the one below:

```
┌─────────────────────────────────────────────────────────────┐
│ 5. LLM Tool Integration │
│ Register KC operations as callable tools for a language │
│ model. The complex serves as a deterministic expert │
│ system — the LLM navigates, queries, and analyzes via │
│ tool calls; the KC guarantees topological correctness │
│ and returns structured, verifiable results. │
├─────────────────────────────────────────────────────────────┤
│ 4. MCP Server │
│ Model Context Protocol server exposing KC as tools for │
│ AI assistants (Claude, etc.). Each KC operation becomes │
│ a tool: add_vertex, boundary, betti_numbers, audit, etc. │
├─────────────────────────────────────────────────────────────┤
│ 3. Microservice (REST API) │
│ Python-hosted service exposing KC operations over HTTP. │
│ CRUD for elements, SPARQL query execution, SHACL │
│ verification, algebraic topology analysis, export/import.│
├─────────────────────────────────────────────────────────────┤
│ 2. Concrete Knowledge Complex │
│ An instance using a specific ontology. Typed vertices, │
│ edges, and faces with attributes. SHACL-verified on │
│ every write. Serialized as RDF (Turtle, JSON-LD). │
│ Versioned via Flexo MMS or git. │
├─────────────────────────────────────────────────────────────┤
│ 1. KC-Compatible Ontology │
│ OWL class hierarchy extending kc:Vertex/Edge/Face. │
│ SHACL shapes for attribute constraints. Publicly hosted │
│ at persistent URIs (w3id.org). Dereferenceable — tools │
│ can fetch the ontology and understand the type system. │
└─────────────────────────────────────────────────────────────┘
```

### Layer 1: Ontology

A KC-compatible ontology is an OWL ontology whose classes extend `kc:Vertex`, `kc:Edge`, and `kc:Face`, paired with SHACL shapes for instance-level constraints. Ontologies are authored via `SchemaBuilder` and exported as standard `.ttl` files. For public use, the ontology should be hosted at a persistent URI (e.g. `https://w3id.org/kc/`) so that other systems can dereference the IRI and retrieve the OWL/SHACL definitions. The `knowledgecomplex.ontologies` package ships three reference ontologies (operations, brand, research) as starting points.
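
For illustration, a generated ontology excerpt might look like the following — the `aaa:` namespace URI, class, and shape names are hypothetical; actual output comes from `SchemaBuilder`'s `dump_owl` and `dump_shacl`:

```turtle
@prefix kc:   <https://w3id.org/kc/> .
@prefix aaa:  <https://w3id.org/kc/ns/aaa/> .  # hypothetical user namespace
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

aaa:Actor a owl:Class ;
    rdfs:subClassOf kc:Vertex .

aaa:ActorShape a sh:NodeShape ;
    sh:targetClass aaa:Actor ;
    sh:property [
        sh:path aaa:name ;
        sh:datatype xsd:string ;
        sh:maxCount 1 ;
    ] .
```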

### Layer 2: Concrete Complex

A concrete knowledge complex is an RDF instance graph conforming to a specific ontology. It contains typed elements (vertices, edges, faces) with attributes, linked by `kc:boundedBy` and collected by `kc:hasElement`. SHACL verification enforces topological and ontological constraints on every write. The complex is serializable to Turtle, JSON-LD, or N-Triples and can be versioned via Flexo MMS or committed to a git repository as `.ttl` files.
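
An instance graph then consists of plain triples like these (identifiers and class names are illustrative; the real serialization comes from `kc.dump_graph()`):

```turtle
@prefix kc:  <https://w3id.org/kc/> .
@prefix aaa: <https://w3id.org/kc/ns/aaa/> .  # hypothetical user namespace

aaa:alice a aaa:Actor ;
    aaa:name "Alice" .

aaa:e1 a aaa:Performs ;
    kc:boundedBy aaa:alice, aaa:etl-run .

aaa:complex kc:hasElement aaa:alice, aaa:etl-run, aaa:e1 .
```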

### Layer 3: Microservice

A Python-hosted HTTP service wraps the `KnowledgeComplex` API in a REST interface. Typical endpoints: element CRUD, named SPARQL queries, topological operations (boundary, star, closure), algebraic topology analysis (Betti numbers, Hodge decomposition, edge PageRank), SHACL verification and audit, and schema introspection. The service loads a schema at startup and manages one or more complex instances.

### Layer 4: MCP Server

A [Model Context Protocol](https://modelcontextprotocol.io/) server exposes KC operations as tools that AI assistants can call. Each KC method becomes an MCP tool: `add_vertex`, `boundary`, `find_cliques`, `betti_numbers`, `audit`, etc. The MCP server is a thin adapter over the microservice or the library directly, translating between MCP tool calls and KC Python API calls.
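
The adapter shape can be sketched without any SDK — a real server would use an MCP SDK for transport, and the `KCToolServer` class and fixed incidence data here are illustrative only:

```python
# Illustrative adapter shape only -- a real server would use an MCP SDK.
class KCToolServer:
    def __init__(self, kc):
        self.kc = kc          # any KnowledgeComplex-like instance
        self.tools = {}

    def tool(self, name, description):
        """Register a KC operation as a named, described tool."""
        def register(fn):
            self.tools[name] = {"description": description, "handler": fn}
            return fn
        return register

    def call(self, name, **arguments):
        """Entry point the MCP transport would invoke for a tool call."""
        return self.tools[name]["handler"](**arguments)

server = KCToolServer(kc=None)  # stub: the handler below ignores self.kc

@server.tool("degree", "Number of edges incident to a vertex")
def degree(vertex: str) -> int:
    # Stand-in for kc.degree(vertex); fixed data keeps the sketch runnable.
    incident = {"alice": {"e1", "e4", "e5"}}
    return len(incident.get(vertex, set()))

result = server.call("degree", vertex="alice")
```

Each registered tool carries the name and description the assistant sees; the handler delegates to the corresponding `KnowledgeComplex` method.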

### Layer 5: LLM Tool Integration

The knowledge complex is registered as a set of callable tools for a language model. The LLM uses the complex as a **deterministic expert system** — it navigates the simplicial structure, retrieves typed elements and their attributes, runs topological queries, and performs algebraic topology analysis via tool calls. The KC guarantees that every result is topologically valid and SHACL-verified. The LLM provides natural language understanding and reasoning; the KC provides structured, auditable, mathematically rigorous retrieval.

This separation is key: the LLM handles ambiguity, intent, and synthesis; the KC handles structure, correctness, and computation. Neither replaces the other.

---

## Namespace Conventions

```turtle
@@ -162,5 +257,11 @@ User namespaces are set via `SchemaBuilder(namespace="aaa")`. The URI base `http
| `knowledgecomplex/schema.py` | Python API — schema authoring | `SchemaBuilder` DSL: `add_*_type`, `dump_owl`, `dump_shacl`, `export`, `load` |
| `knowledgecomplex/graph.py` | Python API — instance I/O | `KnowledgeComplex`: `add_vertex`, `add_edge`, `add_face`, `query`, `dump_graph`, `export`, `load` |
| `knowledgecomplex/exceptions.py` | Public exceptions | `ValidationError`, `SchemaError`, `UnknownQueryError` |
| `knowledgecomplex/queries/vertices.sparql` | Framework SPARQL | Return all vertices and their types |
| `knowledgecomplex/queries/coboundary.sparql` | Framework SPARQL | Inverse boundary operator |
| `knowledgecomplex/io.py` | Python API — serialization | `save_graph`, `load_graph`, `dump_graph` — multi-format file I/O (Turtle, JSON-LD, N-Triples) |
| `knowledgecomplex/viz.py` | Python API — visualization | Hasse diagrams (`plot_hasse`), geometric realization (`plot_geometric`), `to_networkx`, `verify_networkx` |
| `knowledgecomplex/analysis.py` | Python API — algebraic topology | `betti_numbers`, `euler_characteristic`, `hodge_laplacian`, `edge_pagerank` (optional: numpy, scipy) |
| `knowledgecomplex/clique.py` | Python API — clique inference | `find_cliques`, `infer_faces`, `fill_cliques` — flagification and typed face inference |
| `knowledgecomplex/filtration.py` | Python API — filtrations | `Filtration` — nested subcomplex sequences, birth tracking, `from_function` |
| `knowledgecomplex/diff.py` | Python API — diffs and sequences | `ComplexDiff`, `ComplexSequence` — time-varying complexes with SPARQL UPDATE export/import |
| `knowledgecomplex/codecs/markdown.py` | Codec — markdown files | `MarkdownCodec` — YAML frontmatter + section-based round-trip; `verify_documents` |
| `knowledgecomplex/queries/*.sparql` | Framework SPARQL | 7 templates: vertices, coboundary, boundary, star, closure, skeleton, degree |
136 changes: 123 additions & 13 deletions README.md
@@ -38,20 +38,27 @@ sb.add_vertex_type("activity", attributes={"name": text()})
sb.add_vertex_type("resource", attributes={"name": text()})
sb.add_edge_type("performs", attributes={"role": vocab("lead", "support")})
sb.add_edge_type("requires", attributes={"mode": vocab("read", "write")})
sb.add_edge_type("produces", attributes={"mode": vocab("read", "write")})
sb.add_edge_type("accesses", attributes={"mode": vocab("read", "write")})
sb.add_edge_type("responsible", attributes={"level": vocab("owner", "steward")})
sb.add_face_type("operation")
sb.add_face_type("production")

# 2. Build an instance
kc = KnowledgeComplex(schema=sb)
kc.add_vertex("alice", type="actor", name="Alice")
kc.add_vertex("etl-run", type="activity", name="Daily ETL")
kc.add_vertex("dataset1", type="resource", name="JSON Records")
kc.add_vertex("dataset2", type="resource", name="Sales DB")

kc.add_edge("e1", type="performs", vertices={"alice", "etl-run"}, role="lead")
kc.add_edge("e2", type="requires", vertices={"etl-run", "dataset1"}, mode="read")
kc.add_edge("e3", type="produces", vertices={"etl-run", "dataset2"}, mode="write")
kc.add_edge("e4", type="accesses", vertices={"alice", "dataset1"}, mode="read")
kc.add_edge("e5", type="responsible", vertices={"alice", "dataset2"}, level="owner")

kc.add_face("op1", type="operation", boundary=["e1", "e2", "e4"])
kc.add_face("prod1", type="production", boundary=["e1", "e3", "e5"])

# 3. Query
df = kc.query("vertices") # built-in SPARQL template
@@ -61,17 +68,120 @@ print(df)
print(kc.dump_graph()) # Turtle string
```

See [`examples/`](examples/) for 10 runnable examples covering all features below.

## Topological queries

Every `KnowledgeComplex` has methods for the standard simplicial complex operations.
All return `set[str]` for natural set algebra:

```python
kc.boundary("face-1") # {e1, e2, e3} — direct boundary
kc.star("alice") # all simplices containing alice
kc.link("alice") # Cl(St) \ St — the horizon around alice
kc.closure({"e1", "e2"}) # smallest subcomplex containing these edges
kc.degree("alice") # number of incident edges

# Set algebra composes naturally
shared = kc.star("alice") & kc.star("bob")
```

All operators accept an optional `type=` argument that filters results with OWL-subclass awareness.

## Clique inference

Discover higher-order structure from the edge graph:

```python
from knowledgecomplex import find_cliques, infer_faces

triangles = find_cliques(kc, k=3) # pure query — what triangles exist?
infer_faces(kc, "operation") # fill in all triangles as typed faces
infer_faces(kc, "team", edge_type="collab") # restrict to specific edge types
```

## Visualization

Two complementary views — Hasse diagrams (all elements as nodes, boundary as directed arrows) and geometric realization (vertices as 3D points, edges as lines, faces as filled triangles):

```python
from knowledgecomplex import plot_hasse, plot_geometric

fig, ax = plot_hasse(kc) # directed boundary graph, type-colored
fig, ax = plot_geometric(kc) # 3D triangulation with matplotlib
```

Export to NetworkX for further analysis:

```python
from knowledgecomplex import to_networkx, verify_networkx

G = to_networkx(kc) # nx.DiGraph with exact degree invariants
verify_networkx(G) # validate cardinality + closed-triangle constraints
```

## Algebraic topology

Betti numbers, Euler characteristic, Hodge Laplacian, and edge PageRank (requires `pip install knowledgecomplex[analysis]`):

```python
from knowledgecomplex import betti_numbers, euler_characteristic, edge_pagerank

betti = betti_numbers(kc) # [beta_0, beta_1, beta_2]
chi = euler_characteristic(kc) # V - E + F
pr = edge_pagerank(kc, "e1") # personalized edge PageRank vector
```

## Filtrations and time-varying complexes

Filtrations model strictly growing subcomplexes. Diffs model arbitrary add/remove sequences:

```python
from knowledgecomplex import Filtration, ComplexDiff, ComplexSequence

filt = Filtration(kc)
filt.append_closure({"v1", "v2", "e12"}) # Q0: founders
filt.append_closure({"v3", "e23", "face1"}) # Q1: first triangle
print(filt.birth("face1")) # 1

diff = ComplexDiff().add_vertex("eve", type="Person").remove("old-edge")
diff.apply(kc) # mutate the complex
print(diff.to_sparql(kc)) # export as SPARQL UPDATE
```

## I/O and codecs

Multi-format serialization and round-trip with external files:

```python
from knowledgecomplex import save_graph, load_graph, MarkdownCodec

save_graph(kc, "data.jsonld", format="json-ld")
load_graph(kc, "data.ttl") # additive loading

codec = MarkdownCodec(frontmatter_attrs=["name"], section_attrs=["notes"])
kc.register_codec("Paper", codec)
kc.element("paper-1").compile() # KC -> markdown file
kc.element("paper-1").decompile() # markdown file -> KC
```

## Constraint escalation

Escalate topological queries to SHACL constraints enforced on every write:

```python
sb.add_topological_constraint(
"requirement", "coboundary",
target_type="verification",
predicate="min_count", min_count=1,
message="Every requirement must have at least one verification edge",
)
```

## The `kc:uri` attribute

Every element can carry an optional `kc:uri` property pointing to its source file.
SHACL enforces at-most-one `kc:uri` per element.

## Architecture

1 change: 1 addition & 0 deletions docs/api/analysis.md
@@ -0,0 +1 @@
::: knowledgecomplex.analysis
1 change: 1 addition & 0 deletions docs/api/clique.md
@@ -0,0 +1 @@
::: knowledgecomplex.clique
1 change: 1 addition & 0 deletions docs/api/codecs.md
@@ -0,0 +1 @@
::: knowledgecomplex.codecs.markdown
1 change: 1 addition & 0 deletions docs/api/diff.md
@@ -0,0 +1 @@
::: knowledgecomplex.diff
1 change: 1 addition & 0 deletions docs/api/filtration.md
@@ -0,0 +1 @@
::: knowledgecomplex.filtration
1 change: 1 addition & 0 deletions docs/api/io.md
@@ -0,0 +1 @@
::: knowledgecomplex.io
1 change: 1 addition & 0 deletions docs/api/viz.md
@@ -0,0 +1 @@
::: knowledgecomplex.viz