A comprehensive framework for Distributional Conformal Prediction (DCP) using Conditional Kernel Mean Embedding (CKME) methods.
## Table of Contents

- Overview
- Project Structure
- Installation
- Quick Start
- Module Documentation
- Usage Examples
- Experiments
- Dependencies
## Overview

This project implements CKME-based conformal prediction methods and compares them with benchmark DCP methods. The framework supports:
- DCP-CKME: Distributional Conformal Prediction using CKME
- CKME-CKME: CKME-native distributional score method
- Benchmark Methods: DCP-QR, DCP-QR*, DCP-DR, HetGP (implemented in R)
- Unified data generation for multiple examples (MG1, Exp1)
- Parameter optimization using K-Fold Cross-Validation with MMD loss
- Support for multiple macro-replications
- Automatic result evaluation (coverage, width, interval score)
- Comparison plotting and summary tables
- Consistent data sharing between Python and R code
## Project Structure

```
CKME_CP_111625/
├── core/                      # Core modules
│   ├── models/                # CKME model and weight computation
│   │   └── ckme_weights.py
│   ├── loss/                  # Loss functions
│   │   └── mmd_loss.py        # Maximum Mean Discrepancy loss
│   ├── optimizers/            # Parameter optimization
│   │   └── optimizer.py       # K-Fold CV optimizer
│   └── predictors/            # Prediction methods
│       └── dcp_predictor.py   # DCP predictor
│
├── data_generators/           # Data generation module
│   ├── generator.py           # Unified data generator
│   ├── example_usage.py       # Usage examples
│   └── README.md              # Data generator documentation
│
├── examples/                  # Example definitions
│   ├── mg1.py                 # MG1 queue system example
│   └── exp1.py                # Exp1 sinusoidal example
│
├── evaluation/                # Performance evaluation
│   └── evaluator.py           # Coverage, width, interval score
│
├── utils/                     # Utility functions
│   └── kernels.py             # Kernel functions (RBF, Silverman bandwidth)
│
├── experiments/               # Experiment scripts
│   └── exp1/                  # Exp1 experiment
│       ├── generate_data.py   # Data generation script
│       ├── runner.py          # Experiment runner
│       ├── run_dcp_ckme.py    # DCP-CKME experiment
│       ├── run_dcp_methods.R  # Benchmark DCP methods (R)
│       └── plot_comparison.py # Result comparison plots
│
├── benchmark/                 # Benchmark methods
│   └── dcp_r/                 # R implementation of DCP methods
│       └── functions_final.R
│
├── data/                      # Generated data (timestamped folders)
├── results/                   # Experiment results
│
├── run_examples.py            # Run data generator examples
└── README.md                  # This file
```
## Installation

Requirements:

- Python 3.7+
- R 4.0+ (for benchmark methods)

Python packages:

```bash
pip install numpy pandas matplotlib seaborn scikit-learn
```

R packages (for the benchmark methods):

```r
install.packages(c("quantreg", "hetGP", "dplyr"))
```

## Quick Start

### 1. Generate Data

First, generate data for your experiment:
```bash
# Using the command line
python experiments/exp1/generate_data.py \
    --example exp1 \
    --n_train_points 100 \
    --n_reps 10 \
    --n_test_points 100 \
    --n_test_reps 1 \
    --n_macrorep 50 \
    --base_seed 42
```

Or in Python:
```python
from experiments.exp1.generate_data import generate_data

data_dir = generate_data(
    example='exp1',
    n_train_points=100,
    n_reps=10,
    n_test_points=100,
    n_test_reps=1,
    n_macrorep=50
)
# Returns: 'data/data_YYYYMMDD_HHMMSS'
```

### 2. Run DCP-CKME (Python)

```python
from experiments.exp1.run_dcp_ckme import run_dcp_ckme_experiment

results_df, macrorep_df, data_dir = run_dcp_ckme_experiment(
    example='exp1',
    data_dir='data/data_20250101_120000'  # Use generated data
)
```

### 3. Run Benchmark Methods (R)

Edit `experiments/exp1/run_dcp_methods.R` and update `data_dir`:
data_dir <- "data/data_20250101_120000"Then run:
```bash
Rscript experiments/exp1/run_dcp_methods.R
```

### 4. Generate Comparison Plots

```python
from experiments.exp1.plot_comparison import main

main()
```

## Module Documentation

### CKMEModel (`core/models/ckme_weights.py`)

Computes conditional kernel mean embedding weights.
Key Methods:

- `compute_weights(x_star)`: Compute weights for a single test point
- `compute_weights_batch(X_star)`: Batch computation for multiple test points
- `simplex_weights(x_star)`: Normalized weights (sum to 1)

Parameters:

- `ell_x`: Length scale for the X-space kernel
- `lam`: Regularization parameter
- `sigma_y`: Bandwidth for the Y-space kernel
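For intuition, the weights can be sketched with the standard ridge-regularized CKME estimator w(x*) = (K_X + nλI)⁻¹ k_X(x*). This is a minimal sketch assuming 1-D inputs and an RBF kernel; the function name `ckme_weights` is illustrative, not the actual `CKMEModel` API.

```python
import numpy as np

def ckme_weights(X_train, x_star, ell_x, lam):
    """Illustrative CKME weights: w(x*) = (K_X + n*lam*I)^{-1} k_X(x*)."""
    def rbf(a, b):
        # RBF kernel with length scale ell_x (1-D inputs)
        return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2.0 * ell_x ** 2))

    n = len(X_train)
    K = rbf(X_train, X_train)                     # Gram matrix over training inputs
    k_star = rbf(X_train, np.atleast_1d(x_star))  # cross-kernel to the test point
    return np.linalg.solve(K + n * lam * np.eye(n), k_star).ravel()
```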
### MMDLoss (`core/loss/mmd_loss.py`)

Maximum Mean Discrepancy (MMD) loss for parameter optimization. Formula:
MMD² = (1/n) Σᵢ [k_Y(yᵢ, yᵢ) - 2 Σⱼ wⱼ(xᵢ) k_Y(yⱼ, yᵢ) + Σⱼ Σₖ wⱼ(xᵢ) wₖ(xᵢ) k_Y(yⱼ, yₖ)]
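Reading the formula directly into NumPy, with `K_y` the Gram matrix k_Y(yᵢ, yⱼ) and `W[i, j] = wⱼ(xᵢ)`; this is a minimal sketch, and `mmd_squared` is an illustrative name rather than the `MMDLoss` API.

```python
import numpy as np

def mmd_squared(K_y, W):
    """Average MMD² between each point mass at y_i and the CKME estimate at x_i."""
    term1 = np.diag(K_y)                          # k_Y(y_i, y_i)
    term2 = -2.0 * np.sum(W * K_y.T, axis=1)      # -2 sum_j w_j(x_i) k_Y(y_j, y_i)
    term3 = np.einsum('ij,jk,ik->i', W, K_y, W)   # sum_{j,k} w_j(x_i) w_k(x_i) k_Y(y_j, y_k)
    return float(np.mean(term1 + term2 + term3))
```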
### ParameterOptimizer (`core/optimizers/optimizer.py`)

K-Fold Cross-Validation parameter optimizer.

Features:

- Uses `GroupKFold` to split by unique x values (sites); see the sketch below the parameter grids
- Grid search over `ell_x`, `lam`, and `sigma_y` multipliers
- Computes the MMD loss on validation sets
- Returns the best parameters based on average validation loss

Parameter Grids:

- `ell_x`: [10⁻², 10⁻¹·⁵, 10⁻¹, 10⁻⁰·⁵, 10⁰]
- `lam`: [1e-3, 1e-1, 10.0]
- `sigma_y_multipliers`: [0.5, 1.0, 2.0] × Silverman bandwidth
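The grouping behavior can be demonstrated with scikit-learn's `GroupKFold` directly; the toy data below (20 sites with 5 replicates each) is made up for illustration.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

# Toy data: 20 sites, 5 replicates per site
X = np.repeat(np.linspace(0.1, 0.9, 20), 5)
# Group label = index of the unique x value, so all replicates of a
# site land in the same fold and no site leaks across folds.
groups = np.unique(X, return_inverse=True)[1]

for train_idx, val_idx in GroupKFold(n_splits=5).split(X, groups=groups):
    assert set(X[train_idx]).isdisjoint(X[val_idx])
```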
### DCPPredictor (`core/predictors/dcp_predictor.py`)

Distributional Conformal Prediction predictor.

Key Methods:

- `compute_ecdf(x_star, y_values)`: Compute the weighted empirical CDF
- `compute_calibration_scores(X_cal, Y_cal)`: Compute DCP calibration scores
- `predict_interval(x_star, calibration_scores, alpha)`: Predict a conformal interval

A sketch of the underlying mechanics follows.
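This is a minimal sketch assuming the conditional-rank score |F̂(y|x) − 1/2| from the distributional conformal prediction literature; the helper names (`weighted_ecdf`, `dcp_interval`, `weights_fn`) are illustrative, and the actual `DCPPredictor` API may differ.

```python
import numpy as np

def weighted_ecdf(y, Y_train, w):
    # Weighted empirical CDF: F_hat(y | x) = sum_j w_j * 1{Y_j <= y}
    return float(np.sum(w * (Y_train <= y)))

def dcp_interval(Y_train, weights_fn, X_cal, Y_cal, x_star, alpha=0.1):
    # Calibration scores: how far each calibration response sits from
    # the conditional median rank 1/2.
    scores = np.array([abs(weighted_ecdf(y, Y_train, weights_fn(x)) - 0.5)
                       for x, y in zip(X_cal, Y_cal)])
    n = len(scores)
    q = np.quantile(scores, min(1.0, np.ceil((1 - alpha) * (n + 1)) / n))
    # Invert the weighted ECDF at x*: keep responses whose conditional
    # rank is within q of 1/2, scanning the training responses as a grid.
    w = weights_fn(x_star)
    grid = np.sort(Y_train)
    F = np.array([weighted_ecdf(y, Y_train, w) for y in grid])
    inside = grid[np.abs(F - 0.5) <= q]  # assumes some grid point has rank near 1/2
    return inside.min(), inside.max()
```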
### DataGenerator (`data_generators/generator.py`)

Unified data generator for the MG1 and Exp1 examples.

Features:

- Generates training data with replicates
- Splits training data by replicates (not at random)
- Generates independent test data
- Supports multiple macro-replications
- Auto-saves to timestamped folders with metadata

See `data_generators/README.md` for detailed documentation.
### Examples (`examples/`)

MG1 queue system example (`examples/mg1.py`):
- True function: ζ(x) = 1.5x²/(1-x)
- Noise variance: r(x) = x(20 + 121x - 116x² + 29x³) / (4(1-x)⁴ · 2500)
- Default X range: (0.1, 0.9)
Exp1 sinusoidal example (`examples/exp1.py`):
- True function: ζ(x) = exp(x/10) · sin(x)
- Noise variance: r(x) = (0.01 + 0.2(x - π)²)²
- Default X range: (0, 2π)
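Both examples follow the heteroscedastic template Y = ζ(X) + √r(X) · ε. Below is a minimal sketch of sampling Exp1 under the assumption of standard Gaussian noise ε; the helper `exp1_sample` is illustrative, and `examples/exp1.py` defines the actual distributions.

```python
import numpy as np

def exp1_sample(x, rng):
    zeta = np.exp(x / 10.0) * np.sin(x)        # true function
    r = (0.01 + 0.2 * (x - np.pi) ** 2) ** 2   # noise variance
    return zeta + np.sqrt(r) * rng.standard_normal(np.shape(x))

rng = np.random.default_rng(42)
x = rng.uniform(0.0, 2.0 * np.pi, size=100)
y = exp1_sample(x, rng)
```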
### PerformanceEvaluator (`evaluation/evaluator.py`)

Performance metrics for conformal prediction.
Metrics:
- Coverage: Whether true value falls within interval (0 or 1)
- Width: Interval width (upper - lower)
- Interval Score: width + (2/α) · max(0, lower - y) + (2/α) · max(0, y - upper)
- Half Width: 1.96 · std / √n (95% CI half width)
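A direct translation of these definitions, as a sketch; the function `interval_metrics` is illustrative, and `evaluation/evaluator.py` holds the actual implementation.

```python
def interval_metrics(lower, upper, y, alpha=0.1):
    coverage = float(lower <= y <= upper)  # 1 if y falls in [lower, upper]
    width = upper - lower
    score = (width
             + (2.0 / alpha) * max(0.0, lower - y)    # penalty for undershooting
             + (2.0 / alpha) * max(0.0, y - upper))   # penalty for overshooting
    return coverage, width, score
```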
Aggregation:
- Coverage/Width: Average across macro-replications for each test point
- Score: Average across all points within each macro-rep, then across macro-reps
### Kernels (`utils/kernels.py`)

Kernel functions:

- `rbf_kernel(X, Y, sigma)`: RBF (Gaussian) kernel
- `silverman_bandwidth(Y)`: Silverman's rule of thumb for the bandwidth
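Minimal sketches of the two utilities, assuming 1-D inputs; the signatures in `utils/kernels.py` may differ.

```python
import numpy as np

def rbf_kernel(X, Y, sigma):
    d2 = (np.asarray(X)[:, None] - np.asarray(Y)[None, :]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

def silverman_bandwidth(Y):
    # Silverman's rule of thumb: 0.9 * min(std, IQR / 1.34) * n^(-1/5)
    Y = np.asarray(Y)
    iqr = np.subtract(*np.percentile(Y, [75, 25]))
    return 0.9 * min(Y.std(ddof=1), iqr / 1.34) * len(Y) ** (-0.2)
```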
## Usage Examples

### Generating Data

```python
from data_generators import DataGenerator
import numpy as np

gen = DataGenerator(
    example='exp1',
    x_range=(0, 2 * np.pi),
    n_train_points=100,
    n_reps=10,
    n_test_points=100,
    n_test_reps=1,
    seed=42
)

# Generate and split data
X_train_full, Y_train_full = gen.generate_training_data()
X_train, Y_train, X_cal, Y_cal = gen.split_training_data(X_train_full, Y_train_full)
X_test, Y_test = gen.generate_test_data()
```

### Running a Complete Experiment

```python
from experiments.exp1.run_dcp_ckme import run_dcp_ckme_experiment
results_df, macrorep_df, data_dir = run_dcp_ckme_experiment(
    example='exp1',
    n_train_points=200,
    n_reps=10,
    n_test_points=100,
    n_test_reps=1,
    n_macrorep=50,
    alpha=0.1,
    optimize_params=True,
    base_seed=42,
    data_dir=None  # Will generate new data
)
```

### Reusing Previously Generated Data

```python
# Use previously generated data
results_df, macrorep_df, _ = run_dcp_ckme_experiment(
    example='exp1',
    data_dir='data/data_20250101_120000'  # Use existing data
)
```

### Parameter Optimization

```python
from core import CKMEModel, ParameterOptimizer
from data_generators import DataGenerator
import numpy as np
# Generate data
gen = DataGenerator(example='exp1', x_range=(0, 2 * np.pi),
                    n_train_points=100, n_reps=10, seed=42)
X_train, Y_train, _, _ = gen.split_training_data(*gen.generate_training_data())
# Optimize parameters with 5-fold CV
class Config:
    optimize_params = True

optimizer = ParameterOptimizer(Config(), k=5)

# Simple container for one (ell_x, lam, sigma_y) parameter triple
class Params:
    def __init__(self, ell_x, lam, sigma_y):
        self.ell_x = ell_x
        self.lam = lam
        self.sigma_y = sigma_y

best_params = optimizer.optimize(X_train, Y_train, Params)
print(f"Best ell_x: {best_params.ell_x}, lam: {best_params.lam}, sigma_y: {best_params.sigma_y}")The Exp1 experiment compares DCP-CKME with benchmark DCP methods on the Exp1 example.
Workflow:

1. Generate data:

   ```bash
   python experiments/exp1/generate_data.py --example exp1 --n_macrorep 50
   ```

2. Run DCP-CKME (Python):

   ```python
   from experiments.exp1.run_dcp_ckme import run_dcp_ckme_experiment

   run_dcp_ckme_experiment(example='exp1', data_dir='data/data_YYYYMMDD_HHMMSS')
   ```

3. Run the benchmark methods (R). Edit `run_dcp_methods.R` to set `data_dir <- "data/data_YYYYMMDD_HHMMSS"`, then:

   ```bash
   Rscript experiments/exp1/run_dcp_methods.R
   ```

4. Generate comparison plots:

   ```python
   from experiments.exp1.plot_comparison import main

   main()
   ```
Output:
- Coverage comparison plot
- Width comparison plot
- Summary table with mean ± 95% CI half width
### MG1 Experiment

The workflow is the same; use `example='mg1'` in all scripts.
## Data Sharing Between Python and R

To ensure the Python and R code use the same data:

- Always generate data first using `generate_data.py`
- Use the same `data_dir` in both the Python and R scripts
- Check `metadata.txt` to verify the data configuration
The data generation module automatically:

- Creates timestamped folders: `data/data_YYYYMMDD_HHMMSS/`
- Saves a `metadata.txt` with the configuration
- Uses fixed random seeds for reproducibility
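The folder naming convention can be reproduced with the standard library; this snippet only illustrates the convention, it is not the generator's actual code.

```python
from datetime import datetime

# e.g. 'data/data_20250101_120000'
data_dir = f"data/data_{datetime.now():%Y%m%d_%H%M%S}"
```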
## Dependencies

Python:

- `numpy`: Numerical computations
- `pandas`: Data processing and CSV operations
- `matplotlib`: Plotting
- `seaborn`: Statistical visualization
- `scikit-learn`: Machine learning utilities (for `GroupKFold`)

R:

- `quantreg`: Quantile regression (for the DCP-QR methods)
- `hetGP`: Heteroscedastic Gaussian processes (for HetGP)
- `dplyr`: Data manipulation
## File Naming Conventions

- Data files: `train_{macrorep_id}.csv`, `cal_{macrorep_id}.csv`, `test_{macrorep_id}.csv`
- Result files: `{example}_dcp_ckme_{config}_{timestamp}.csv`
- Data directories: `data/data_{timestamp}/`
- Result directories: `results/{example}_dcp_ckme_{config}_{timestamp}/`
## Notes

- Data Splitting: training data are always split by replicates, so each location appears in both the training and calibration sets
- Parameter Optimization: K-Fold CV with `GroupKFold` prevents information leakage between folds
- Score Aggregation: the interval score is averaged across all points within each macro-replication, then across macro-replications
- Coverage/Width Aggregation: averaged across macro-replications for each test point
- Random Seeds: fixed seeds ensure reproducibility; each macro-replication uses `base_seed + macrorep_id`, as in the snippet below
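A short illustration of that seeding scheme (illustrative only, not the project's code):

```python
import numpy as np

base_seed, n_macrorep = 42, 50
# One independent, reproducible generator per macro-replication
rngs = [np.random.default_rng(base_seed + macrorep_id)
        for macrorep_id in range(n_macrorep)]
```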