We currently power our ingestion layer with a set of cron jobs running on a single machine. This worked at the start, but it is becoming difficult to manage, and as the number of use cases grows this approach offers neither ease of use, scalability, nor a good user experience.
MVP Requirements:
- Tasks to be defined as Python code/packages
- Allows a cron-like timed schedule to be created
- A REST API to trigger jobs
- Allows one-off jobs triggered manually on request
- A web-based UI
- Container-based deployment
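
To make the first few requirements concrete, here is a minimal sketch of what "tasks as Python code" with an optional cron-like schedule and a manual trigger could look like. It does not use the API of any tool listed in this note; `Task`, `REGISTRY`, `task`, `trigger`, and the two example job names are hypothetical and exist only for illustration. The cron string is stored but never evaluated here, since scheduling would be the job of whichever tool is chosen.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Task:
    """A unit of work defined as plain Python code (hypothetical type)."""
    name: str
    func: Callable[[], object]
    cron: Optional[str] = None  # None means the task only runs on request


# Hypothetical in-memory registry; a real scheduler would persist this.
REGISTRY: Dict[str, Task] = {}


def task(name: str, cron: Optional[str] = None):
    """Register a function as a task, optionally with a cron-like schedule."""
    def decorator(func: Callable[[], object]) -> Callable[[], object]:
        REGISTRY[name] = Task(name=name, func=func, cron=cron)
        return func
    return decorator


def trigger(name: str) -> object:
    """Run a registered task immediately (a one-off, manually requested run)."""
    return REGISTRY[name].func()


@task("ingest_orders", cron="0 * * * *")  # assumed schedule: top of every hour
def ingest_orders() -> str:
    return "orders ingested"


@task("backfill_orders")  # no schedule: manual runs only
def backfill_orders() -> str:
    return "backfill complete"
```

A REST endpoint or web UI button for triggering jobs would, under these assumptions, be a thin wrapper that calls `trigger(name)` with the job name from the request.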
Nice to haves:
Some options to consider:
- Apache Airflow
- Dagster
- Prefect
- Kestra
- Luigi