Automated PCB defect detection using YOLOv8. Train models, run evaluations, and deploy to production: everything you need for computer vision in manufacturing.
Built for a university ML project, but it actually works. Runs on NVIDIA/AMD GPUs, Apple Silicon, and CPU-only systems.
Most PCB detection tutorials stop at "congrats, you trained a model!" This project goes further:
- Cross-platform - Works on whatever hardware you have (tested on NVIDIA, Apple M1, CPU)
- Actually deploys - REST API, Docker containers, performance monitoring
- Production features - ONNX optimization, INT8 quantization, robustness testing
- Research-friendly - Cross-validation, hyperparameter tuning, statistical analysis
- Honest logging - Everything auto-logs to logs/ so you can write reports
I needed something that would work for both my coursework AND potentially in real manufacturing. So here we are.
Fastest path (5 minutes):
# 1. Install
pip install -e ".[dev]"
# 2. Generate a tiny test dataset
python samples/generate_sample_dataset.py
# 3. Train (takes ~2 minutes on GPU, ~10 on CPU)
export PCB_CONFIG=$(pwd)/samples/sample_config.yaml
pcb-dd train-baseline --epochs 10
# 4. See results
ls logs/
Full workflow (30-45 minutes):
# Runs everything: download dataset, train, evaluate, generate reports
python start.py
Just want to test the API?
# Use a pretrained model or train one first
pcb-dd train-baseline
pcb-dd api
# In another terminal
curl -X POST "http://localhost:8000/detect" \
-F "file=@path/to/pcb_image.jpg"
- Training - YOLOv8n baseline (fast) or YOLOv8s production (accurate)
- Evaluation - Confusion matrices, per-class metrics, auto-generated reports
- Deployment - FastAPI server, ONNX export, Docker containers
- Multi-platform - Auto-detects your hardware (CUDA, ROCm, MPS, CPU)
- Failure analysis - Visualize what the model gets wrong
- Interpretability - Attention maps showing where the model looks
- Cross-validation - Statistical confidence intervals (takes hours)
- Hyperparameter tuning - Optuna-based search (also takes hours)
- Robustness testing - How well does it handle noise, blur, etc.
- INT8 quantization - Roughly 3x faster ONNX inference (~36ms down to ~12ms in the results table below)
The advanced stuff is there because I wanted to learn it. Your mileage may vary.
pcb-defect-detection/
├── src/ # Main codebase
│ ├── training/ # Baseline, production, transfer learning
│ ├── evaluation/ # Metrics, confusion, interpretability
│ ├── deployment/ # API, ONNX export, quantization
│ ├── analysis/ # Dataset stats, failure cases
│ └── cli.py # Command-line interface
│
├── tests/ # Unit tests (basic coverage)
├── samples/ # Sample dataset generator
├── docker/ # Docker configs
├── docs/ # Detailed guides
│
├── start.py # One-command full workflow
├── auto_analyze.py # Generate all reports
└── requirements.txt # Dependencies
After training, check these directories:
- runs/train/ - Model checkpoints and training curves
- logs/ - All reports, metrics, visualizations
- outputs/ - Final models and deployment artifacts
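If you train several times, it can be handy to grab the newest checkpoint without typing the run name. Here is a minimal sketch, assuming the `runs/train/<run>/weights/best.pt` layout that the API command in this README points at. The helper name `latest_checkpoint` is made up for illustration and is not part of the `pcb-dd` CLI.

```python
from pathlib import Path
from typing import Optional


def latest_checkpoint(runs_dir: str = "runs/train") -> Optional[Path]:
    """Return the most recently modified weights/best.pt under runs_dir, or None."""
    # Assumed layout: <runs_dir>/<run_name>/weights/best.pt
    candidates = list(Path(runs_dir).glob("*/weights/best.pt"))
    if not candidates:
        return None
    # The file with the newest modification time wins.
    return max(candidates, key=lambda p: p.stat().st_mtime)


checkpoint = latest_checkpoint()
print(checkpoint if checkpoint else "No checkpoint found; train a model first.")
```

The returned path can then be passed to `pcb-dd api --model ...` in place of the wildcard.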
Recommended (virtual environment):
# Create environment
python -m venv .venv
source .venv/bin/activate # or `.venv\Scripts\activate` on Windows
# Install project
pip install -e ".[dev]"
# Platform-specific packages (PyTorch, ONNX Runtime)
python install.py
The install.py script auto-detects your GPU and installs the right PyTorch/ONNX builds. If it fails, check docs/troubleshooting.md.
Requirements:
- Python 3.10 or 3.11 (3.12 works but 3.11 recommended)
- 8GB+ RAM
- GPU recommended but not required
# Fast baseline (good enough for most cases)
pcb-dd train-baseline --epochs 50
# Higher accuracy production model
pcb-dd train-production --epochs 100
# Fine-tune on your own data
pcb-dd transfer-learning --data path/to/your/data.yaml
Training auto-logs everything to logs/training/.
# Basic metrics
pcb-dd evaluate
# Confusion matrix with per-class breakdown
pcb-dd confusion
# Run everything (dataset analysis, failures, interpretability)
pcb-dd quick-analysis
Check logs/ for generated reports and visualizations.
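To make the per-class breakdown concrete, here is a rough sketch of how precision and recall fall out of a confusion matrix. It is not the project's evaluation code. The class names and counts are invented, and the matrix is assumed to have true classes on rows and predicted classes on columns. Check the orientation of whatever matrix you load, since tools differ, and note that detection confusion matrices often carry an extra background row and column that this sketch leaves out.

```python
from typing import Dict, List


def per_class_metrics(matrix: List[List[int]], names: List[str]) -> Dict[str, Dict[str, float]]:
    """Compute precision and recall per class.

    Assumes matrix[i][j] counts samples with true class i and predicted class j.
    """
    metrics = {}
    for k, name in enumerate(names):
        true_positives = matrix[k][k]
        predicted_as_k = sum(row[k] for row in matrix)  # column sum
        actually_k = sum(matrix[k])                     # row sum
        metrics[name] = {
            "precision": true_positives / predicted_as_k if predicted_as_k else 0.0,
            "recall": true_positives / actually_k if actually_k else 0.0,
        }
    return metrics


# Hypothetical class names and counts, for illustration only.
names = ["missing_hole", "open_circuit", "short"]
matrix = [
    [48, 1, 1],
    [2, 45, 3],
    [0, 4, 46],
]
for name, m in per_class_metrics(matrix, names).items():
    print(f"{name}: precision={m['precision']:.2f} recall={m['recall']:.2f}")
```

A class with high precision but low recall is being missed rather than confused with something else, which is a useful thing to call out in a report.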
# Export to ONNX (faster inference)
pcb-dd export-onnx
# Quantize to INT8 (even faster)
pcb-dd quantize
# Start REST API
pcb-dd api --model runs/train/baseline_yolov8n*/weights/best.pt
# Test it
curl -X POST "http://localhost:8000/detect" \
-F "file=@test_image.jpg" \
-F "conf_threshold=0.25"
API docs at http://localhost:8000/docs
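The same request can be sent from Python using only the standard library. This is a sketch that mirrors the curl call above: the `file` and `conf_threshold` field names come from that command, while the helper names and the assumption that the endpoint answers with JSON are mine. Check the docs page at /docs for the real response schema.

```python
import json
import mimetypes
import uuid
from pathlib import Path
from urllib import request


def build_multipart(fields: dict, file_field: str, filename: str, data: bytes):
    """Encode plain form fields plus one file as multipart/form-data."""
    boundary = uuid.uuid4().hex
    file_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode()
        )
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\nContent-Type: {file_type}\r\n\r\n'.encode()
        + data
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def detect(image_path: str, url: str = "http://localhost:8000/detect",
           conf_threshold: float = 0.25):
    """POST an image to the detection endpoint and return the parsed reply."""
    path = Path(image_path)
    body, content_type = build_multipart(
        {"conf_threshold": conf_threshold}, "file", path.name, path.read_bytes()
    )
    req = request.Request(url, data=body, method="POST",
                          headers={"Content-Type": content_type})
    with request.urlopen(req, timeout=30) as resp:
        # Assumption: the server replies with a JSON body.
        return json.loads(resp.read().decode("utf-8"))
```

With the API running in another terminal, `detect("test_image.jpg")` should return whatever the endpoint sends back, parsed into Python objects.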
Cross-validation (statistical confidence):
# Quick 3-fold CV (~2 hours)
python -m training.cross_validation --quick
# Full 5-fold CV (~6 hours)
python -m training.cross_validation --folds 5
Gives you proper error bars: "mAP@0.5 = 94.8% ± 2.1%"
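For a sense of where numbers like that come from, here is a small sketch that turns per-fold scores into a mean with error bars. The fold values are hypothetical. This README does not say whether the project reports a standard deviation or a confidence interval after the ±, so the sketch prints both. The t critical values cover only the 3-fold and 5-fold modes mentioned above.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t, keyed by degrees of freedom
# (2 for 3-fold CV, 4 for 5-fold CV).
T_CRIT_95 = {2: 4.303, 4: 2.776}


def summarize_folds(scores):
    """Return (mean, sample std, 95% CI half-width) for per-fold scores."""
    n = len(scores)
    mean = statistics.mean(scores)
    std = statistics.stdev(scores)  # sample standard deviation, divides by n - 1
    half_width = T_CRIT_95[n - 1] * std / math.sqrt(n)
    return mean, std, half_width


# Hypothetical per-fold mAP@0.5 values, not results from this project.
fold_map50 = [0.93, 0.95, 0.96, 0.94, 0.97]
mean, std, half_width = summarize_folds(fold_map50)
print(f"mAP@0.5 = {mean:.1%} ± {std:.1%} (std), 95% CI ± {half_width:.1%}")
```

With only 3 or 5 folds the t-based interval comes out noticeably wider than the raw standard error would suggest, so say which one you are quoting.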
Hyperparameter tuning:
# Quick search (10 trials, ~5 hours)
python -m training.hyperparameter --quick
# Thorough search (50 trials, ~2 days)
python -m training.hyperparameter --trials 50
Results saved to logs/best_hyperparameters_*.yaml
Robustness testing:
# Test against noise, blur, occlusions, etc.
pcb-dd robustness --samples 20
Failure analysis:
# See what the model gets wrong
pcb-dd failure-analysis --top 30
Typical results on the included PCB dataset (your results will vary):
| Model | mAP@0.5 | Inference (ONNX) | Size |
|---|---|---|---|
| YOLOv8n (baseline) | ~99.5% | ~36ms | 6MB |
| YOLOv8n (INT8) | ~99.3% | ~12ms | 3MB |
| YOLOv8s (production) | ~99.7% | ~48ms | 22MB |
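The trade-offs in the table are easier to compare as ratios. The snippet below only does arithmetic on the approximate figures above. The images-per-second value is a naive 1000 / latency estimate that assumes one image at a time and ignores any pre- or post-processing.

```python
# Approximate figures copied from the results table above.
RESULTS = {
    "YOLOv8n (baseline)": {"map50": 99.5, "latency_ms": 36, "size_mb": 6},
    "YOLOv8n (INT8)": {"map50": 99.3, "latency_ms": 12, "size_mb": 3},
    "YOLOv8s (production)": {"map50": 99.7, "latency_ms": 48, "size_mb": 22},
}


def compare(baseline: str, candidate: str) -> dict:
    """Express a candidate model's numbers relative to a baseline model."""
    b, c = RESULTS[baseline], RESULTS[candidate]
    return {
        "speedup": b["latency_ms"] / c["latency_ms"],
        "size_ratio": b["size_mb"] / c["size_mb"],
        "map_change_points": round(c["map50"] - b["map50"], 2),
        "images_per_second": 1000 / c["latency_ms"],
    }


for name in ("YOLOv8n (INT8)", "YOLOv8s (production)"):
    print(name, compare("YOLOv8n (baseline)", name))
```

On these figures, INT8 runs about 3x faster at half the size for a 0.2 point drop in mAP@0.5, while the production model gains 0.2 points but is slower and larger.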
Tested on:
- NVIDIA RTX 3080 (primary testing)
- Apple M1 Pro (works, slower)
- CPU-only (works, very slow)
- AMD ROCm (untested but should work)
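Hardware auto-detection along these lines can be sketched with PyTorch's own availability checks. This is an illustration of the idea, not the project's actual detection code, and the function names are hypothetical. ROCm builds of PyTorch report through the `torch.cuda` interface, which is why there is no separate ROCm branch.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Choose a device string, preferring CUDA/ROCm, then Apple MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Query PyTorch for available backends; fall back to CPU if torch is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # Older PyTorch releases have no torch.backends.mps attribute.
    mps = getattr(torch.backends, "mps", None)
    mps_available = bool(mps is not None and mps.is_available())
    return pick_device(torch.cuda.is_available(), mps_available)


print(f"Selected device: {detect_device()}")
```

Keeping the decision in a pure function like `pick_device` makes the priority order easy to test on a machine that has none of the hardware.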
# Build images
cd docker
docker compose build
# Run training
docker compose run --rm trainer pcb-dd train-baseline
# Run API
docker compose up api
See docker/README.md for GPU setup.
Project config lives in src/pcb_defect_detection/config/default.yaml. Override it:
# Option 1: Environment variable
export PCB_CONFIG=/path/to/my_config.yaml
# Option 2: Specify dataset directly
export PCB_DATASET=/path/to/data.yaml
See docs/configuration.md for details.
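One plausible way these overrides could be resolved is sketched below. The variable names and default path come from this section, but the precedence shown (environment variable first, then the bundled default) is an assumption about how the project behaves, and `resolve_paths` is a hypothetical helper. docs/configuration.md is the authority here.

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Tuple

DEFAULT_CONFIG = Path("src/pcb_defect_detection/config/default.yaml")


def resolve_paths(env: Mapping[str, str] = os.environ) -> Tuple[Path, Optional[Path]]:
    """Return (config_path, dataset_override) from environment variables.

    Assumed behavior: PCB_CONFIG replaces the default config file, and
    PCB_DATASET, when set, overrides only the dataset path.
    """
    config = Path(env["PCB_CONFIG"]) if env.get("PCB_CONFIG") else DEFAULT_CONFIG
    dataset = Path(env["PCB_DATASET"]) if env.get("PCB_DATASET") else None
    return config, dataset


config_path, dataset_path = resolve_paths()
print(f"config: {config_path}")
print(f"dataset override: {dataset_path if dataset_path else 'none'}")
```

Passing the environment in as a mapping lets you try the logic with a plain dict instead of exporting variables in your shell.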
What to include in your report:
# 1. Train and analyze
python start.py
# 2. Run comprehensive analysis
pcb-dd quick-analysis
# 3. Optional: statistical analysis
python -m training.cross_validation --quick
Then collect:
- All files from the logs/ directory
- Confusion matrices from logs/confusion_matrix_*.png
- Training curves from runs/train/*/results.png
- Model metrics from logs/evaluation_*.txt
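Gathering those files by hand gets tedious, so here is a small sketch that copies the named patterns into one folder. The glob patterns come from the list above. The destination folder `report_artifacts` and the function name are made up. It copies only the three specific patterns, so grab the rest of logs/ separately if you want everything.

```python
import shutil
from pathlib import Path

# Patterns taken from the list above, relative to the project root.
PATTERNS = [
    "logs/confusion_matrix_*.png",
    "runs/train/*/results.png",
    "logs/evaluation_*.txt",
]


def collect_artifacts(project_root: str = ".", dest: str = "report_artifacts"):
    """Copy report files matching PATTERNS into dest and return the new paths."""
    root = Path(project_root)
    out = Path(dest)
    out.mkdir(parents=True, exist_ok=True)
    copied = []
    for pattern in PATTERNS:
        for src in sorted(root.glob(pattern)):
            # Prefix with the parent folder name so results.png files
            # from different training runs do not overwrite each other.
            target = out / f"{src.parent.name}_{src.name}"
            shutil.copy2(src, target)
            copied.append(target)
    return copied


for path in collect_artifacts():
    print(f"copied {path}")
```

Run it from the project root after training and analysis have finished. If nothing matches yet it just creates an empty folder.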
Report structure suggestion:
- Dataset analysis (class distribution, samples per class)
- Training approach (architecture choice, hyperparameters)
- Results (mAP, precision, recall with confidence intervals if you ran CV)
- Failure analysis (what the model struggles with)
- Deployment considerations (ONNX speedup, production requirements)
Installation issues:
# If install.py fails
pip install --upgrade pip
pip cache purge
python install.py --force-cpu # Try CPU-only first
# Check what's wrong
python pre_flight_check.py
Training crashes:
# Reduce batch size
pcb-dd train-baseline --batch 4
# Disable mixed precision (for AMD/MPS)
# Edit training script, set amp=False
Import errors:
# Make sure you installed editable
pip install -e ".[dev]"
# Check installation
pcb-dd --help
Dataset not found:
# Generate sample dataset
python samples/generate_sample_dataset.py
export PCB_CONFIG=$(pwd)/samples/sample_config.yaml
# Or set your dataset path
export PCB_DATASET=/path/to/your/data.yaml
More help: docs/troubleshooting.md
- AMD ROCm support - Theoretically works but untested. You'll probably need to tweak AMP settings.
- Cross-validation - Takes hours. Start with --quick mode.
- Hyperparameter tuning - Can take days. Use --quick for testing.
- W&B tracking - Disabled by default. Enable in config if you want it.
See CONTRIBUTING.md for coding standards.
MIT License - see LICENSE file.
Free to use for academic or commercial projects. Attribution appreciated but not required.
- YOLOv8 by Ultralytics - the actual detection engine
- Roboflow - dataset hosting and augmentation
- PCB Defects Dataset - various contributors on Roboflow Universe
- My university ML course for forcing me to actually finish this
Status: Working and tested for coursework. Production use at your own risk (but it should be fine).
Last updated: November 2025