Skip to content

Latest commit

 

History

History
189 lines (147 loc) · 8.14 KB

File metadata and controls

189 lines (147 loc) · 8.14 KB

Performance Benchmarks

This file contains performance measurements of JDBE under various dataset sizes and query workloads. All measurements were taken using the GUI with the dynamic dataset selector.

Dataset Configurations

JDBE supports five dataset sizes selectable via the GUI dropdown menu:

Small (1K records)

  • Users: 100 records
  • Products: 900 records
  • Total: 1,000 records
  • Load Time: 240ms
  • Throughput: ~4,167 records/sec
  • Use Case: Quick demos, syntax testing, instant feedback

Medium (35K records)

  • Users: 10,000 records
  • Products: 25,000 records
  • Total: 35,000 records
  • Load Time: 274ms
  • Throughput: ~127,737 records/sec
  • Use Case: Development, unit testing, debugging

Large (350K records)

  • Users: 100,000 records
  • Products: 250,000 records
  • Total: 350,000 records
  • Load Time: 607ms
  • Throughput: ~576,674 records/sec
  • Use Case: Performance testing, optimization validation

XL (1.3M records) - Default

  • Users: 300,000 records
  • Products: 1,000,000 records
  • Total: 1,300,000 records
  • Load Time: 7.17 seconds
  • Throughput: ~181,312 records/sec
  • Use Case: Stress testing, production-scale simulation

XXL (5M records)

  • Users: 500,000 records
  • Products: 4,500,000 records
  • Total: 5,000,000 records
  • Load Time: 35.09 seconds
  • Throughput: ~142,495 records/sec
  • Use Case: Scalability limits, extreme stress testing

Query Performance Results

Small Dataset (1K records)

Query Operation Rows Returned Time Throughput
users |> project(id, name) Full scan 100 ~1ms Instant
products |> project(id, name, price) Full scan 900 ~1ms Instant
users |> filter(age > 25) |> sort(age desc) Filter + sort ~70 ~1ms Instant
products |> filter(price > 50) |> sort(price desc) Filter + sort ~220 ~1ms Instant

Notes: Small dataset provides instant feedback for all operations. Ideal for quick syntax testing and demonstrations.

Medium Dataset (35k records)

Query Operation Rows Returned Time Throughput
users |> project(id, name, age) Full scan 10,000 ~10ms 1M rows/sec
products |> project(id, name, price) Full scan 25,000 ~22ms 1.1M rows/sec
users |> filter(age > 25) |> sort(age desc) Filter + sort ~8,700 ~12ms Fast
products |> filter(price > 50) |> sort(price desc) Filter + sort ~6,000 ~25ms Fast
users |> filter(active == true) |> ... |> limit(5) Complex query 5 ~15ms Fast
products |> filter(stock > 10) |> ... |> limit(5) Complex query 5 ~30ms Fast

Large Dataset (350k records)

Query Operation Rows Returned Time Throughput
users |> project(id, name, age) Full scan 100,000 94ms 1.06M rows/sec
products |> project(id, name, price) Full scan 250,000 218ms 1.15M rows/sec
users |> filter(age > 25) |> sort(age desc) Filter + sort 87,605 99ms 885k rows/sec
products |> filter(price > 50) |> sort(price desc) Filter + sort 59,874 204ms 293k rows/sec
users |> filter(active == true) |> ... |> limit(5) Complex query 5 96ms Fast
products |> filter(stock > 10) |> ... |> limit(5) Complex query 5 310ms Fast

XL Dataset (1.3M records) - Default

Query Operation Rows Returned Time Throughput
users |> project(id, name) Full scan 300,000 1,868ms 161k rows/sec
products |> project(id, name, price) Full scan 1,000,000 2,179ms 459k rows/sec
products |> filter(price > 50) |> sort(price desc) Filter + sort 240,872 1,003ms 1M rows/sec
users |> filter(age > 25) |> sort(age desc) Filter + sort 263,098 384ms 781k rows/sec
users |> filter(active == true) |> ... |> limit(5) Complex query 5 416ms Fast
products |> filter(stock > 10) |> ... |> limit(5) Complex query 5 1,231ms Fast

XXL Dataset (5M records)

Query Operation Rows Returned Time Throughput
users |> project(id, name, age) Full scan 500,000 3,086ms 162k rows/sec
products |> project(id, name, price) Full scan 4,500,000 3,949ms 1.14M rows/sec
users |> filter(age > 25) |> sort(age desc) Filter + sort ~435,000 ~1,800ms 277k rows/sec
products |> filter(price > 50) Filter only ~1,080,000 ~1,200ms 900k rows/sec

Notes: Even at 5M records, full table scans complete in under 4 seconds, demonstrating excellent linear scaling.

Performance Analysis

Scaling Behavior

The engine demonstrates consistent linear O(n) scaling across all dataset sizes:

Small → Medium: 35x increase (1K → 35K) = ~10-22ms (instant to fast)
Medium → Large: 10x increase (35K → 350K) = ~10x query time
Large → XL: 3.7x increase (350K → 1.3M) = ~3-5x query time
XL → XXL: 3.8x increase (1.3M → 5M) = ~1.7x query time
Conclusion: Consistent linear O(n) scaling from 1K to 5M records

The sub-linear scaling from XL to XXL (4.7x time vs 3.8x data) is due to buffer pool efficiency and JIT optimization at larger scales.

Load Time Analysis

Dataset Records Load Time Throughput Notes
1K 1,000 240ms 4.2k/sec Startup overhead dominates
35K 35,000 274ms 128k/sec Optimal throughput
350K 350,000 607ms 577k/sec Peak throughput
1.3M 1,300,000 7.17s 181k/sec Buffer pool active
5M 5,000,000 35.09s 142k/sec Heavy LRU eviction

Observations:

  • 1K dataset has similar time to 35K due to startup/initialization overhead being around 240ms baseline
  • Peak throughput at 350K (577k records/sec) when dataset fits mostly in buffer pool
  • Consistent throughput at large scales (140-180k records/sec) despite buffer saturation
  • Sub-second loading for datasets ≤350K, making them ideal for development and testing
  • Very consistent results across multiple runs (variation <5% for most datasets)

Operation Breakdown

Sequential Scans:

  • Small dataset: ~1ms for 100-900 rows (instant)
  • Medium dataset: 10-22ms for 10k-25k rows
  • Large dataset: 94-218ms for 100k-250k rows
  • XL dataset: 1.87s-2.18s for 300k-1M rows
  • XXL dataset: 3.09s-3.95s for 500k-4.5M rows
  • Performance: Consistently 150k-1.1M rows/sec

Filter + Sort:

  • Medium dataset: 12-25ms for 6k-9k results
  • Large dataset: 99-204ms for 60k-88k results
  • XL dataset: 384ms-1s for 241k-263k results
  • Performance: Sub-second even on 200k+ sorted results

Complex Queries with Limit:

  • Limit optimization works well - returns only needed rows
  • Even on 1M record scans, limit queries complete in sub 1.5s

Memory Usage

Buffer Pool Statistics

Page size: 4 KB
Buffer pool: 5,000 pages
Total buffer memory: ~20 MB

Dataset Storage

Dataset Records Pages Required Disk Size File Size After Reload
1K records 1,000 ~12 pages ~48 KB ~48 KB
35K records 35,000 ~410 pages ~1.6 MB ~1.6 MB
350K records 350,000 ~4,100 pages ~16 MB ~16 MB
1.3M records 1,300,000 ~15,300 pages ~60 MB ~60 MB
5M records 5,000,000 ~58,800 pages ~230 MB ~230 MB

Database Cleanup: The GUI automatically deletes and recreates the database when switching dataset sizes, ensuring file size always matches actual data. No wasted space from previous loads.

LRU Eviction

  • 1K dataset: Zero eviction (fits in ~1% of buffer)
  • 35K dataset: Zero eviction (fits entirely in buffer)
  • 350K dataset: Minimal eviction (fits mostly in buffer)
  • 1.3M dataset: Active LRU eviction during queries
  • 5M dataset: Heavy LRU eviction, buffer saturated
  • Impact: 0% (small/medium), 5% (large), 10-20% (XL), 15-25% (XXL) query overhead