A Databricks App for clinical data classification using a Databricks ML serving endpoint. The app provides a web interface for classifying clinical data according to CIBMTR (Center for International Blood and Marrow Transplant Research) standards.
- Backend: Python FastAPI application that communicates with a Databricks serving endpoint and SQL warehouse
- Frontend: React/TypeScript UI built with Vite
- Notebooks: Setup notebooks for data, agent creation, and guidelines pipeline
- Databricks CLI installed and configured
- A Databricks workspace with access to create apps, serving endpoints, and SQL warehouses
-
Create a SQL warehouse (or use an existing one). Note the warehouse ID.
-
Create a serving endpoint named
cibmtr-classifier(or updateapp.yamlwith your endpoint name). -
Run the setup notebooks in order:
notebooks/01_setup_data_and_functions.py- Sets up required data and SQL functionsnotebooks/02_create_agent.py- Creates the ML agent for classificationnotebooks/03_guidelines_pipeline.py- Configures the guidelines processing pipeline
-
Create the Databricks App:
databricks apps create cibmtr-classifier
-
Add app resources (the app needs access to a SQL warehouse and serving endpoint):
databricks apps add-resource cibmtr-classifier \ --resource-name sql-warehouse \ --resource-type sql_warehouse \ --resource-id <WAREHOUSE_ID> \ --permission CAN_USE databricks apps add-resource cibmtr-classifier \ --resource-name serving-endpoint \ --resource-type serving_endpoint \ --resource-id cibmtr-classifier \ --permission CAN_QUERY
-
Upload the app code:
databricks workspace import-dir . /Workspace/Users/<your-email>/apps/cibmtr-classifier
Note: The app code lives at the repo root (
app.yaml,backend/,frontend/,requirements.txt,run.py). -
Deploy the app:
databricks apps deploy cibmtr-classifier \ --source-code-path /Workspace/Users/<your-email>/apps/cibmtr-classifier
The app requires two resources configured via databricks apps add-resource:
| Resource Name | Type | Permission |
|---|---|---|
sql-warehouse |
sql_warehouse |
CAN_USE |
serving-endpoint |
serving_endpoint |
CAN_QUERY |
These are also declared in app.yaml. Update the warehouse ID and endpoint name in app.yaml to match your environment.