1 | 1 | ---
2 | | -title: "TensorDB v0.3.0: A Rust-Native Bitemporal Ledger Database with Full SQL and PostgreSQL Wire Protocol"
| 2 | +title: "I Built a Database That Never Forgets — Here's Why"
3 | 3 | published: true
4 | | -description: "TensorDB v0.3.0 ships 6 phases of features including recursive CTEs, Raft consensus, mTLS, column-level encryption, and 276ns point reads."
| 4 | +description: "Most databases destroy history every time you UPDATE a row. I built one that doesn't. Here's the architecture behind TensorDB, a Rust bitemporal ledger database with 276ns reads."
5 | 5 | tags: [rust, database, sql, opensource]
| 6 | +cover_image: https://raw.githubusercontent.com/tensor-db/TensorDB/main/docs/cover.png
6 | 7 | ---
7 | 8 |
8 | | -# TensorDB v0.3.0: A Rust-Native Bitemporal Ledger Database
| 9 | +Last year, a financial services team I was working with had a nightmare scenario: a regulatory audit required them to prove exactly what their system showed on a specific Tuesday six months ago. Not what the data _currently_ says. What it said _then_.
9 | 10 |
10 | | -If you have ever needed to answer the question *"what did we know, and when did we know it?"* — across financial records, audit trails, medical histories, or regulatory filings — you have probably hacked together a solution with `created_at` and `updated_at` columns, soft-deletes, and a prayer. TensorDB is built to make that question a first-class citizen of your database engine.
| 11 | +Their production Postgres had the current state. Their audit table had some breadcrumbs. Their application logs were partially rotated. Reconstructing the answer took two engineers three weeks of forensic archaeology through backups, WAL archives, and prayer.
11 | 12 |
12 | | -**TensorDB v0.3.0** is out, and it is a significant release. Six phases of work landed: full SQL completeness, advanced types, enterprise security, distributed consensus, WASM/FFI edge deployment, and a learned cost model. This post walks through what TensorDB is, how it works, and what is new.
| 13 | +This is the problem that drove me to build [TensorDB](https://github.com/tensor-db/TensorDB).
13 | 14 |
14 | | -## What Is TensorDB?
15 | | -
16 | | -TensorDB is an **embeddable, bitemporal, append-only ledger database** written entirely in Rust. It supports:
| 15 | +---
17 | 16 |
18 | | -- **Bitemporal data model** — every record carries both *system time* (`commit_ts`) and *business time* (`valid_from` / `valid_to`)
19 | | -- **MVCC with immutable storage** — facts are never overwritten; updates create new facts, deletes create tombstones
20 | | -- **Full SQL engine** — hand-written recursive descent parser, cost-based planner, vectorized execution
21 | | -- **LSM-tree storage** — WAL → Memtable → SSTables with LZ4/Zstd compression and L0–L6 compaction
22 | | -- **PostgreSQL wire protocol** — connect with `psql`, `pgAdmin`, `asyncpg`, or any Postgres client
23 | | -- **Embeddable** — link directly into your Rust binary, or use Python/Node.js bindings
| 17 | +## The Problem With UPDATE
24 | 18 |
25 | | -## The Bitemporal Model in 60 Seconds
| 19 | +Here's what most databases do when you update a row:
26 | 20 |
27 | | -Most databases track *one* timeline. Bitemporal databases track *two*:
| 21 | +```sql
| 22 | +UPDATE accounts SET balance = 5000 WHERE id = 1;
| 23 | +```
28 | 24 |
29 | | -| Dimension | Column | Question answered |
30 | | -|-----------|--------|-------------------|
31 | | -| System time | `commit_ts` | When did the database record this fact? |
32 | | -| Business time | `valid_from` / `valid_to` | When was this fact true in the real world? |
| 25 | +The old value is gone. Destroyed. Overwritten. If you need history, you build it yourself — trigger-based audit tables, event sourcing patterns, CDC pipelines feeding into a data lake. You end up with a Rube Goldberg machine of infrastructure just to answer _"what was this value last week?"_
33 | 26 |
34 | | -```sql
35 | | --- What did our inventory system show last Tuesday?
36 | | -SELECT * FROM inventory AS OF SYSTEM TIME '2026-03-01 09:00:00';
| 27 | +**Bitemporal databases solve this at the storage layer.** Every write is an immutable fact. Nothing is ever overwritten or deleted. The database tracks two independent timelines for every record:
37 | 28 |
38 | | --- What was the contractual price on Jan 1, even if we corrected it later?
39 | | -SELECT * FROM pricing VALID AT DATE '2026-01-01';
| 29 | +| Timeline | What it tracks | Example question |
| 30 | +|----------|---------------|-----------------|
| 31 | +| **System time** | When the database _recorded_ this fact | "What did our system show last Tuesday?" |
| 32 | +| **Business time** | When this fact was _true in the real world_ | "What was the contract price on Jan 1?" |
40 | 33 |
41 | | --- Full SQL:2011 temporal range
42 | | -SELECT * FROM orders
43 | | -FOR SYSTEM_TIME FROM '2026-01-01' TO '2026-03-01';
44 | | -```
| 34 | +The distinction matters more than you'd think. A bank discovers today that a transaction from January had the wrong amount. With a bitemporal model, you correct the business-time record while preserving the system-time history of what you _previously believed_. Both truths coexist. Auditors can see both.
45 | 35 |
46 | | -No extra tables, no shadow schemas, no application-layer bookkeeping.
| 36 | +---
47 | 37 |
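The two-timeline model the new version describes can be sketched in a few lines of plain Python. This is a hypothetical illustration, not TensorDB's API: a record carries both business-time validity and a system-time commit stamp, and a query pins both dimensions.

```python
from dataclasses import dataclass

INF = float("inf")

@dataclass(frozen=True)
class Fact:
    key: str
    value: float
    valid_from: int   # business time: when this became true in the world
    valid_to: float   # exclusive end of validity (INF = still true)
    commit_ts: int    # system time: when the database recorded it

def as_of(facts, key, valid_at, system_at):
    """What did we believe at `system_at` about the value at `valid_at`?"""
    visible = [f for f in facts
               if f.key == key
               and f.commit_ts <= system_at
               and f.valid_from <= valid_at < f.valid_to]
    # The most recently recorded belief wins.
    return max(visible, key=lambda f: f.commit_ts, default=None)

# January: a $100 transaction is recorded (commit_ts=1).
# March: we discover it was really $110 and correct it (commit_ts=3).
facts = [
    Fact("txn:9", 100.0, valid_from=1, valid_to=INF, commit_ts=1),
    Fact("txn:9", 110.0, valid_from=1, valid_to=INF, commit_ts=3),
]

# What we believed in February, before the correction:
assert as_of(facts, "txn:9", valid_at=2, system_at=2).value == 100.0
# What we believe now about that same January transaction:
assert as_of(facts, "txn:9", valid_at=2, system_at=4).value == 110.0
```

Both answers come from the same fact store; the correction never destroyed the earlier belief.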
48 | | -## Quick Start
| 38 | +## See It in 30 Seconds
49 | 39 |
50 | | -### Rust (embedded)
| 40 | +You can have TensorDB running in under a minute:
51 | 41 |
52 | | -```toml
53 | | -[dependencies]
54 | | -tensordb = "0.3"
| 42 | +```bash
| 43 | +pip install tensordb
55 | 44 | ```
56 | 45 |
57 | | -```rust
58 | | -use tensordb::Database;
| 46 | +```python
| 47 | +from tensordb import PyDatabase
59 | 48 |
60 | | -fn main() -> tensordb::Result<()> {
61 | | -    let db = Database::open("./mydb")?;
| 49 | +db = PyDatabase.open("/tmp/demo")
62 | 50 |
63 | | -    db.sql("CREATE TABLE accounts (
64 | | -        id INTEGER PRIMARY KEY, owner TEXT, balance REAL
65 | | -    )")?;
| 51 | +# Create a table and insert data
| 52 | +db.sql("CREATE TABLE accounts (id INT, owner TEXT, balance REAL)")
| 53 | +db.sql("INSERT INTO accounts VALUES (1, 'Alice', 10000)")
66 | 54 |
67 | | -    db.sql("INSERT INTO accounts VALUES (1, 'alice', 10000.00)")?;
| 55 | +# Update the balance
| 56 | +db.sql("UPDATE accounts SET balance = 7500 WHERE id = 1")
68 | 57 |
69 | | -    // Time-travel query
70 | | -    let rows = db.sql(
71 | | -        "SELECT * FROM accounts AS OF SYSTEM TIME '2026-03-01'"
72 | | -    )?;
73 | | -    println!("{:?}", rows);
74 | | -    Ok(())
75 | | -}
| 58 | +# Time-travel: what was Alice's balance BEFORE the update?
| 59 | +rows = db.sql("SELECT * FROM accounts FOR SYSTEM_TIME ALL WHERE id = 1")
| 60 | +print(rows)
| 61 | +# → Both versions: the 10000 AND the 7500, with timestamps
76 | 62 | ```
77 | 63 |
78 | | -### Python
| 64 | +That's it. No configuration. No schema migration for audit columns. No background workers. The history is automatic.
| 65 | +
| 66 | +Or if you prefer Rust:
79 | 67 |
80 | 68 | ```bash
81 | | -pip install tensordb
| 69 | +cargo add tensordb
82 | 70 | ```
83 | 71 |
84 | | -```python
85 | | -from tensordb import PyDatabase
| 72 | +```rust
| 73 | +let db = tensordb::Database::open("./mydb")?;
86 | 74 |
87 | | -db = PyDatabase.open("./mydb")
88 | | -db.sql("CREATE TABLE trades (id INT, symbol TEXT, price REAL)")
89 | | -db.sql("INSERT INTO trades VALUES (1, 'AAPL', 182.50), (2, 'TSLA', 245.00)")
90 | | -rows = db.sql("SELECT * FROM trades WHERE price > 200")
91 | | -print(rows)
| 75 | +db.sql("CREATE TABLE events (id INT PRIMARY KEY, type TEXT, amount REAL)")?;
| 76 | +
| 77 | +db.sql("INSERT INTO events VALUES
| 78 | +    (1, 'deposit', 1000),
| 79 | +    (2, 'withdrawal', 250),
| 80 | +    (3, 'deposit', 500)")?;
| 81 | +
| 82 | +// What did the ledger look like at any point in time?
| 83 | +let snapshot = db.sql(
| 84 | +    "SELECT * FROM events AS OF SYSTEM TIME '2026-03-07 12:00:00'"
| 85 | +)?;
92 | 86 | ```
93 | 87 |
94 | | -### Connect with psql
| 88 | +---
| 89 | +
| 90 | +## Why Should You Care?
| 91 | +
| 92 | +### It's Fast. Really Fast.
| 93 | +
| 94 | +| Operation | TensorDB | SQLite (WAL) | Factor |
| 95 | +|-----------|----------|-------------|--------|
| 96 | +| Point read | **276 ns** | ~400 ns | 1.4x faster |
| 97 | +| Point write | **1.9 µs** | ~15 µs | **8x faster** |
| 98 | +| Batch insert (10k rows) | **18 ms** | ~35 ms | 2x faster |
| 99 | +
| 100 | +These aren't synthetic benchmarks on a tuned cluster. This is single-node, embedded, with full durability guarantees. The write path uses lock-free atomic CAS — no mutexes, no channels, no actor messages on the hot path.
| 101 | +
| 102 | +### It Speaks PostgreSQL
95 | 103 |
96 | 104 | ```bash
97 | | -cargo run -p tensordb-server -- --data-dir ./mydb --port 5433
| 105 | +# Start the server
| 106 | +tensordb-server --data-dir ./mydb --port 5433
| 107 | +
| 108 | +# Connect with literally anything that speaks Postgres
98 | 109 | psql -h localhost -p 5433 -d mydb
99 | 110 | ```
100 | 111 |
101 | | -## Performance
| 112 | +Your existing tools work — psql, pgAdmin, DBeaver, SQLAlchemy, Prisma, any Postgres driver. You get standard SQL plus temporal queries that Postgres doesn't natively support:
| 113 | +
| 114 | +```sql
| 115 | +-- Standard SQL
| 116 | +CREATE TABLE orders (id SERIAL PRIMARY KEY, customer TEXT, total REAL);
| 117 | +INSERT INTO orders (customer, total) VALUES ('acme', 9999) RETURNING id;
| 118 | +
| 119 | +-- Temporal queries (the superpower)
| 120 | +SELECT * FROM orders AS OF SYSTEM TIME '2026-01-15';
| 121 | +SELECT * FROM orders FOR SYSTEM_TIME FROM '2026-01-01' TO '2026-03-01';
| 122 | +SELECT * FROM orders VALID AT DATE '2026-02-15';
| 123 | +```
| 124 | +
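Semantically, `AS OF SYSTEM TIME` reduces to picking, per key, the newest version committed at or before the cutoff, with delete markers hiding a key rather than erasing it. A sketch of that resolution rule (my illustration, not the engine's code):

```python
def snapshot(versions, ts):
    """Resolve an AS-OF query over an append-only version log.

    `versions` is a list of (key, commit_ts, kind, value) tuples,
    where kind is "put" or "tombstone" (a delete marker).
    """
    latest = {}
    for key, commit_ts, kind, value in versions:
        if commit_ts <= ts:
            prev = latest.get(key)
            if prev is None or commit_ts > prev[0]:
                latest[key] = (commit_ts, kind, value)
    # Tombstones hide the key from the snapshot; nothing was destroyed.
    return {k: v for k, (_, kind, v) in latest.items() if kind == "put"}

log = [
    ("orders/1", 10, "put", {"customer": "acme", "total": 9999}),
    ("orders/2", 20, "put", {"customer": "globex", "total": 150}),
    ("orders/1", 30, "put", {"customer": "acme", "total": 8888}),  # UPDATE
    ("orders/2", 40, "tombstone", None),                           # DELETE
]

assert snapshot(log, 25)["orders/1"]["total"] == 9999   # before the update
assert snapshot(log, 50)["orders/1"]["total"] == 8888   # after it
assert "orders/2" not in snapshot(log, 50)              # deleted...
assert "orders/2" in snapshot(log, 25)                  # ...but history remains
```

Every cutoff timestamp yields a complete, consistent snapshot from the same log.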
| 125 | +### It Embeds in Your Binary
| 126 | +
| 127 | +No daemon process. No Docker container. No ops overhead. One function call:
| 128 | +
| 129 | +```rust
| 130 | +let db = Database::open("./path")?;
| 131 | +```
| 132 | +
| 133 | +Ship the database _inside_ your application. Ideal for edge deployments, CLI tools, desktop apps, or anywhere a full Postgres deployment is overkill.
102 | 134 |
103 | | -| Operation | TensorDB | SQLite (WAL mode) |
104 | | -|-----------|----------|-------------------|
105 | | -| Point read | **276 ns** | ~400 ns |
106 | | -| Point write | **1.9 µs** | ~15 µs |
107 | | -| Batch insert (10k rows) | ~18 ms | ~35 ms |
| 135 | +### The SQL Surface Is Complete
108 | 136 |
109 | | -The fast write path uses atomic CAS for lock-free writes. Direct reads bypass shard actors via `ShardReadHandle` with `parking_lot::RwLock`.
| 137 | +This isn't a toy query language. It's a full SQL engine with a hand-written recursive descent parser, cost-based query planner, and vectorized execution:
110 | 138 |
111 | | -## What Is New in v0.3.0
| 139 | +- **DDL/DML:** `CREATE TABLE`, `ALTER TABLE`, `INSERT ... ON CONFLICT` (upsert), `UPDATE ... RETURNING`, `DELETE ... RETURNING`
| 140 | +- **Queries:** JOINs (inner, left, right, full outer, cross), subqueries, CTEs (including `WITH RECURSIVE`), window functions, `GROUP BY`/`HAVING`, `UNION`/`INTERSECT`/`EXCEPT`
| 141 | +- **Types:** `INTEGER`, `REAL`, `TEXT`, `BOOLEAN`, `DATE`, `TIMESTAMP`, `INTERVAL`, `JSON`
| 142 | +- **Functions:** 50+ built-in (string, numeric, date/time, aggregate, window)
| 143 | +- **Advanced:** foreign keys, materialized views, triggers, user-defined functions, generated columns, JSON operators (`->`, `->>`, `@>`)
112 | 144 |
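`WITH RECURSIVE` is the feature people most often assume a small engine lacks. Its semantics are standard SQL, so they can be demonstrated with Python's bundled sqlite3 as a stand-in for TensorDB's dialect (same query shape, different engine):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
    INSERT INTO employees VALUES
        (1, 'Dana', NULL),   -- CEO
        (2, 'Lee',  1),
        (3, 'Sam',  2),
        (4, 'Kim',  2);
""")

# Walk the reporting chain from the root, carrying the depth along.
rows = conn.execute("""
    WITH RECURSIVE chain(id, name, depth) AS (
        SELECT id, name, 0 FROM employees WHERE manager_id IS NULL
        UNION ALL
        SELECT e.id, e.name, c.depth + 1
        FROM employees e JOIN chain c ON e.manager_id = c.id
    )
    SELECT name, depth FROM chain ORDER BY depth, name
""").fetchall()

print(rows)
# → [('Dana', 0), ('Lee', 1), ('Kim', 2), ('Sam', 2)]
```

The base case seeds the working table, the recursive member joins back into it, and iteration stops when a pass produces no new rows.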
113 | | -### SQL Completeness
114 | | -OFFSET, IF EXISTS, multi-value INSERT, FULL OUTER JOIN, RETURNING on UPDATE/DELETE, subqueries (IN, EXISTS, scalar), upsert (ON CONFLICT), persistent sessions.
| 145 | +And the error messages are actually helpful:
115 | 146 |
116 | | -### Advanced SQL
117 | | -Native date/time types, JSON operators (`->`, `->>`, `@>`), generated columns, recursive CTEs, foreign keys, materialized views, triggers, user-defined functions.
| 147 | +```
| 148 | +ERROR T2001: Table "ordres" not found. Did you mean "orders"?
| 149 | +```
| 150 | +
| 151 | +---
118 | 152 |
|
119 | | -### Performance |
120 | | -Zstd compression policies, batch write optimization, external merge sort, expression compilation, query parallelism with rayon. |
| 153 | +## How It Works Under the Hood |
121 | 154 |
|
122 | | -### Enterprise Security |
123 | | -Audit log tamper detection (SHA-256 hash chains), mTLS, encryption key rotation, column-level encryption (AES-256-GCM). |
| 155 | +For those who like to understand the machinery. |
124 | 156 |
|
125 | | -### Distributed |
126 | | -Raft consensus via gRPC, S3 storage backend, WAL replication, WASM/FFI edge deployment. |
| 157 | +### Immutable Key Encoding |
127 | 158 |
|
128 | | -### Category Differentiation |
129 | | -Learned cost model, anomaly detection, graph queries, in-database ML (linear/logistic regression). |
| 159 | +Every record gets this internal key: |
130 | 160 |
|
131 | | -## Links |
| 161 | +``` |
| 162 | +user_key || 0x00 || commit_ts (8B big-endian) || kind (1B) |
| 163 | +``` |
| 164 | + |
| 165 | +The `user_key` prefix means prefix scans retrieve all versions. Big-endian timestamps give chronological ordering for free. The `kind` byte distinguishes puts from tombstones. Updates don't modify anything — they append new facts with higher timestamps. |
| 166 | + |
| 167 | +### LSM Storage Stack |
| 168 | + |
| 169 | +``` |
| 170 | +Write ─→ WAL (CRC-framed) ─→ Memtable (BTreeMap) |
| 171 | + │ flush |
| 172 | + ▼ |
| 173 | + L0 SSTables (sorted) |
| 174 | + │ compaction |
| 175 | + ▼ |
| 176 | + L1 → L2 → ... → L6 |
| 177 | + (LZ4 for L0-L2, Zstd for L3+) |
| 178 | + (bloom filters, block cache) |
| 179 | +``` |
| 180 | + |
| 181 | +**Lock-free writes:** `AtomicU64::compare_exchange` claims a commit timestamp, then writes directly to memtable. No locks on the hot path. |
| 182 | + |
| 183 | +**Direct reads:** `ShardReadHandle` with `parking_lot::RwLock` bypasses shard actors entirely. This is how reads hit 276ns. |
| 184 | + |
| 185 | +**Batched durability:** A `DurabilityThread` coalesces WAL fsyncs across shards on a 1ms interval. Individual writes don't pay fsync cost. |
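The write-path idea — claim a timestamp with compare-exchange, then append — can be mimicked in Python. The real engine uses a hardware-backed `AtomicU64`; the lock-backed stand-in below only emulates that interface so the CAS retry loop is visible:

```python
import threading

class AtomicU64:
    """Lock-backed stand-in for Rust's AtomicU64 (Python has no raw CAS)."""
    def __init__(self, value=0):
        self._value, self._lock = value, threading.Lock()

    def compare_exchange(self, expected, new):
        with self._lock:
            if self._value == expected:
                self._value = new
                return True, expected
            return False, self._value

clock = AtomicU64(0)
memtable = []  # append-only; a BTreeMap in the real engine

def write(key, value):
    # Spin until our CAS wins; the claimed timestamp is then ours alone.
    current = 0
    while True:
        ok, observed = clock.compare_exchange(current, current + 1)
        if ok:
            ts = current + 1
            break
        current = observed  # lost the race: retry from the observed value
    memtable.append((key, ts, value))
    return ts

threads = [threading.Thread(target=write, args=(f"k{i}", i)) for i in range(100)]
for t in threads: t.start()
for t in threads: t.join()

# Every concurrent writer claimed a unique commit timestamp, no mutex held
# while writing the entry itself.
assert sorted(ts for _, ts, _ in memtable) == list(range(1, 101))
```

The losing writer never blocks behind the winner's memtable write; it only retries the single-word exchange.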
| 186 | +
| 187 | +### Cost-Based Query Planner
| 188 | +
| 189 | +The planner evaluates plan variants — `PointLookup`, `IndexScan`, `FullScan`, `HashJoin` — using table statistics. A learned cost model tracks actual vs. estimated cardinalities and adjusts its estimates from observed query performance.
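The "learned" part can be as simple as a per-table correction factor fed back from execution. This is a toy version of that feedback loop — an assumption on my part, the real model is surely richer:

```python
from collections import defaultdict

class CardinalityFeedback:
    """Multiplicative correction learned from estimated-vs-actual row counts."""
    def __init__(self, smoothing=0.2):
        self.factor = defaultdict(lambda: 1.0)
        self.smoothing = smoothing

    def estimate(self, table, base_estimate):
        # Scale the statistics-based estimate by the learned correction.
        return base_estimate * self.factor[table]

    def observe(self, table, estimated, actual):
        # Nudge the correction factor toward the observed error ratio.
        ratio = actual / max(estimated, 1)
        f = self.factor[table]
        self.factor[table] = (1 - self.smoothing) * f + self.smoothing * ratio

model = CardinalityFeedback()

# Statistics say ~100 rows, but at runtime the filter keeps matching ~1000.
for _ in range(50):
    model.observe("orders", estimated=100, actual=1000)

# Future plans for this table are costed against a realistic cardinality.
assert model.estimate("orders", 100) > 500
```

With corrected cardinalities, a plan like `HashJoin` can win over a nested lookup it would otherwise have beaten on stale statistics alone.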
| 190 | +
| 191 | +---
| 192 | +
| 193 | +## Production-Ready Features
| 194 | +
| 195 | +Things you'll need when you go beyond prototyping:
| 196 | +
| 197 | +**Security:** RBAC with users/roles/permissions, row-level security policies, mTLS on pgwire, column-level AES-256-GCM encryption, encryption key rotation without downtime.
| 198 | +
| 199 | +**Audit:** SHA-256 hash-chained audit log. Every DDL and DML event is recorded in a tamper-evident chain. Run `VERIFY AUDIT LOG` to cryptographically verify integrity.
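A hash-chained log is simple to picture: each entry's hash covers its payload plus the previous hash, so editing any historical entry breaks every hash after it. A minimal sketch of the scheme (the field layout is illustrative, not TensorDB's actual record format):

```python
import hashlib, json

def append(chain, event):
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    payload = json.dumps(event, sort_keys=True)
    entry_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
    chain.append({"event": event, "prev": prev_hash, "hash": entry_hash})

def verify(chain):
    # Recompute every link; any edit anywhere breaks the chain downstream.
    prev_hash = "0" * 64
    for entry in chain:
        payload = json.dumps(entry["event"], sort_keys=True)
        expected = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True

log = []
append(log, {"stmt": "CREATE TABLE accounts ..."})
append(log, {"stmt": "UPDATE accounts SET balance = 7500"})
assert verify(log)

log[0]["event"]["stmt"] = "nothing to see here"  # tamper with history
assert not verify(log)
```

Verification needs only the log itself, which is why a single SQL command can attest the whole history.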
| 200 | +
| 201 | +**GDPR:** `FORGET KEY 'user:42'` creates a cryptographic tombstone across all versions — satisfying right-to-erasure while preserving audit log structure.
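One way to square "append-only forever" with right-to-erasure is crypto-shredding: encrypt each subject's values under a per-key data key, and implement FORGET by destroying that key. I don't know TensorDB's exact mechanism; the sketch below uses a toy SHA-256 keystream purely to show the shape (a real system would use AES-256-GCM with unique nonces):

```python
import hashlib, os

keystore = {}  # user_key -> data key; the ONLY place the key lives

def _keystream(key, n):
    # Toy counter-mode stream from SHA-256; stands in for a real cipher.
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def put(user_key, plaintext: bytes) -> bytes:
    key = keystore.setdefault(user_key, os.urandom(32))
    return bytes(a ^ b for a, b in zip(plaintext, _keystream(key, len(plaintext))))

def get(user_key, ciphertext: bytes) -> bytes:
    key = keystore[user_key]
    return bytes(a ^ b for a, b in zip(ciphertext, _keystream(key, len(ciphertext))))

def forget(user_key):
    del keystore[user_key]  # ciphertext stays in the ledger, unrecoverable

stored = put("user:42", b"alice@example.com")
assert get("user:42", stored) == b"alice@example.com"

forget("user:42")
# The immutable fact is still in the ledger — but it can never be read again.
assert stored != b"alice@example.com"
```

The ledger's structure and hash chain stay intact; only the ability to decrypt one subject's data is destroyed.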
| 202 | +
| 203 | +**Observability:** 8 diagnostic SQL commands — `SHOW STATS`, `SHOW SLOW QUERIES`, `SHOW ACTIVE QUERIES`, `SHOW STORAGE`, `SHOW COMPACTION STATUS`, `SHOW WAL STATUS`, `SHOW AUDIT LOG`, `SHOW PLAN GUIDES`. Plus a health HTTP endpoint.
| 204 | +
| 205 | +**Specialized engines:** Full-text search (BM25), time-series (bucketing, gap fill, LOCF, interpolation), vector search (HNSW + IVF-PQ), event sourcing, graph queries.
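Of those time-series primitives, gap fill with LOCF (last observation carried forward) is the easiest to pin down: bucket the samples, and fill empty buckets with the most recent earlier value. A sketch of the semantics, not the engine's implementation:

```python
def gap_fill_locf(samples, start, stop, step):
    """samples: list of (timestamp, value) pairs, sorted by timestamp."""
    filled, last, i = [], None, 0
    for bucket in range(start, stop, step):
        # Consume every sample that falls in or before this bucket.
        while i < len(samples) and samples[i][0] < bucket + step:
            last = samples[i][1]
            i += 1
        filled.append((bucket, last))  # stays None until the first observation
    return filled

readings = [(0, 20.0), (10, 21.5), (45, 19.0)]  # sensor dropped out from 10 to 45
print(gap_fill_locf(readings, start=0, stop=60, step=10))
# → [(0, 20.0), (10, 21.5), (20, 21.5), (30, 21.5), (40, 19.0), (50, 19.0)]
```

The empty buckets at 20 and 30 carry the last reading forward instead of reporting NULL, which is usually what dashboards and downsampling jobs want.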
| 206 | +
| 207 | +---
| 208 | +
| 209 | +## The Honest Comparison
| 210 | +
| 211 | +**Use Postgres** if you need a battle-tested, general-purpose OLTP database with 30 years of production hardening and a massive extension ecosystem.
| 212 | +
| 213 | +**Try TensorDB** if:
| 214 | +- Bitemporality is your _primary requirement_, not an afterthought bolted on with triggers
| 215 | +- You want to embed the database directly in your application
| 216 | +- You need structurally append-only storage for compliance (not just "we log changes")
| 217 | +- Sub-microsecond embedded reads matter to you
| 218 | +
| 219 | +TensorDB is younger software. It doesn't have Postgres's ecosystem depth. But for the specific problem it solves — immutable, bitemporal, embedded storage with full SQL — it's purpose-built.
| 220 | +
| 221 | +---
| 222 | +
| 223 | +## Get Started in 60 Seconds
| 224 | +
| 225 | +Pick your language:
| 226 | +
| 227 | +```bash
| 228 | +# Rust — embed in your binary
| 229 | +cargo add tensordb
| 230 | +
| 231 | +# Python — pip install and go
| 232 | +pip install tensordb
| 233 | +
| 234 | +# Any language — connect via PostgreSQL protocol
| 235 | +cargo install tensordb-server
| 236 | +tensordb-server --data-dir ./mydb --port 5433
| 237 | +# Then: psql -h localhost -p 5433
| 238 | +```
| 239 | +
| 240 | +**Links:**
| 241 | +- [GitHub](https://github.com/tensor-db/TensorDB) — star it if you find it useful
| 242 | +- [Documentation](https://tensor-db.github.io/TensorDB/) — quickstart, SQL reference, architecture guide
| 243 | +- [PyPI](https://pypi.org/project/tensordb/) — `pip install tensordb`
| 244 | +- [crates.io](https://crates.io/crates/tensordb) — `cargo add tensordb`
| 245 | +
| 246 | +---
132 | 247 |
133 | | -- **GitHub**: [tensor-db/TensorDB](https://github.com/tensor-db/TensorDB)
134 | | -- **Docs**: [tensor-db.github.io/TensorDB](https://tensor-db.github.io/TensorDB/)
135 | | -- **crates.io**: [tensordb](https://crates.io/crates/tensordb)
136 | | -- **PyPI**: [tensordb](https://pypi.org/project/tensordb/)
137 | | -- **npm**: [tensordb](https://www.npmjs.com/package/tensordb)
138 | 249 |
139 | | -Contributions, benchmarks, and feedback are welcome. Open an issue or discussion on GitHub.
| 250 | +And if it breaks, file a bug. That's how it gets better.