Skip to content

philsalm/cibmtr-classifier

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

CIBMTR Classifier

A Databricks App for clinical data classification using a Databricks ML serving endpoint. The app provides a web interface for classifying clinical data according to CIBMTR (Center for International Blood and Marrow Transplant Research) standards.

Architecture

  • Backend: Python FastAPI application that communicates with a Databricks serving endpoint and SQL warehouse
  • Frontend: React/TypeScript UI built with Vite
  • Notebooks: Setup notebooks for data, agent creation, and guidelines pipeline

Deploying to a New Databricks Workspace

Prerequisites

  • Databricks CLI installed and configured
  • A Databricks workspace with access to create apps, serving endpoints, and SQL warehouses

Steps

  1. Create a SQL warehouse (or use an existing one). Note the warehouse ID.

  2. Create a serving endpoint named cibmtr-classifier (or update app.yaml with your endpoint name).

  3. Run the setup notebooks in order:

    • notebooks/01_setup_data_and_functions.py - Sets up required data and SQL functions
    • notebooks/02_create_agent.py - Creates the ML agent for classification
    • notebooks/03_guidelines_pipeline.py - Configures the guidelines processing pipeline
  4. Create the Databricks App:

    databricks apps create cibmtr-classifier
  5. Add app resources (the app needs access to a SQL warehouse and serving endpoint):

    databricks apps add-resource cibmtr-classifier \
      --resource-name sql-warehouse \
      --resource-type sql_warehouse \
      --resource-id <WAREHOUSE_ID> \
      --permission CAN_USE
    
    databricks apps add-resource cibmtr-classifier \
      --resource-name serving-endpoint \
      --resource-type serving_endpoint \
      --resource-id cibmtr-classifier \
      --permission CAN_QUERY
  6. Upload the app code:

    databricks workspace import-dir . /Workspace/Users/<your-email>/apps/cibmtr-classifier

    Note: The app code lives at the repo root (app.yaml, backend/, frontend/, requirements.txt, run.py).

  7. Deploy the app:

    databricks apps deploy cibmtr-classifier \
      --source-code-path /Workspace/Users/<your-email>/apps/cibmtr-classifier

App Resources

The app requires two resources configured via databricks apps add-resource:

Resource Name Type Permission
sql-warehouse sql_warehouse CAN_USE
serving-endpoint serving_endpoint CAN_QUERY

These are also declared in app.yaml. Update the warehouse ID and endpoint name in app.yaml to match your environment.

About

CIBMTR Classifier - Databricks App for clinical data classification using ML serving endpoint

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors