Goal
Lift the OSM-specific pipeline out of cmd/ingestion/main.go into a Source abstraction so that adding a new external source (Wheelmap, AXSMap, government open data, etc.) is mechanical and doesn't require touching the main loop.
Tracking under #10.
Current state
cmd/ingestion/main.go hardcodes the OSM path:
osm.StreamNodes → osm.Evaluate → osm.TransformNode → place.Repository.UpsertBatch.
Adding a second source today would mean duplicating the main loop or branching on flags — both bad.
Proposed shape (to validate)
// internal/sources/source.go
type Source interface {
	Name() string // "osm", "wheelmap", ...
	Stream(ctx context.Context, sink func(models.Place) error) error
}
- Each source lives under internal/sources/<name>/. OSM moves there from internal/osm.
- cmd/ingestion selects a source by subcommand or flag and runs it through a shared batcher into UpsertBatch.
- Sources are responsible for emitting models.Place records; they do not see the DB.
Open questions
- Does Source also own filtering (today: osm.Evaluate) or is that a separate stage in the pipeline?
- Where does category derivation (osm.DeriveRank) live: in the source, or in a normalization stage between source and batcher?
- How do sources signal progress / errors uniformly (slog fields)?
- Does the interface need a Mode (full vs. diff) or do those become two methods?
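To make the first two questions concrete, here is one possible shape if filtering and category derivation end up as separate stages rather than inside the source. This is a sketch of one option, not a decision: stages are modeled as wrappers around the sink, and `Sink`, `Stage`, `Filter`, `Map`, `Chain`, and the `Category` field are all hypothetical names that do not exist in the codebase today.

```go
package main

import "fmt"

// Place is a stand-in for models.Place; Category is a hypothetical field.
type Place struct {
	ID       string
	Category string
}

// Sink is the callback a source writes places into.
type Sink func(Place) error

// Stage wraps a sink with extra behavior and returns the wrapped sink.
type Stage func(next Sink) Sink

// Filter drops places for which keep returns false.
func Filter(keep func(Place) bool) Stage {
	return func(next Sink) Sink {
		return func(p Place) error {
			if !keep(p) {
				return nil
			}
			return next(p)
		}
	}
}

// Map rewrites each place before passing it on.
func Map(fn func(Place) Place) Stage {
	return func(next Sink) Sink {
		return func(p Place) error {
			return next(fn(p))
		}
	}
}

// Chain composes stages so that the first stage listed sees each place first.
func Chain(final Sink, stages ...Stage) Sink {
	for i := len(stages) - 1; i >= 0; i-- {
		final = stages[i](final)
	}
	return final
}

// demo pushes three places through a filter and a normalization stage.
func demo() []Place {
	var got []Place
	sink := Chain(
		func(p Place) error { got = append(got, p); return nil },
		Filter(func(p Place) bool { return p.ID != "" }),
		Map(func(p Place) Place {
			if p.Category == "" {
				p.Category = "unknown"
			}
			return p
		}),
	)
	for _, p := range []Place{{ID: "n1"}, {}, {ID: "n2", Category: "cafe"}} {
		if err := sink(p); err != nil {
			panic(err)
		}
	}
	return got
}

func main() {
	fmt.Printf("%+v\n", demo())
}
```

The trade-off is that source-specific logic such as osm.Evaluate operates on raw OSM nodes, not on models.Place, so it may not fit a generic stage without first deciding what a stage is allowed to see.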
Out of scope
- Implementing a second source. This issue is the abstraction only.
- Identity resolution across sources — tracked separately.
Acceptance
- internal/sources/source.go exists with the Source interface.
- OSM is migrated under internal/sources/osm/ and implements Source.
- cmd/ingestion/main.go selects sources by name and contains no OSM-specific logic.
- Existing OSM unit tests still pass; integration test (separate issue) still works.
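As a rough illustration of the "selects sources by name" criterion, the sketch below uses a name-to-constructor map. `Registry`, `Lookup`, `Names`, and `stubSource` are hypothetical names used only for this example, and `Place` stands in for `models.Place`.

```go
package main

import (
	"context"
	"fmt"
	"sort"
)

// Place is a stand-in for models.Place.
type Place struct{ ID string }

// Source mirrors the interface proposed in this issue.
type Source interface {
	Name() string
	Stream(ctx context.Context, sink func(Place) error) error
}

// Registry maps a source name to a constructor for that source.
type Registry map[string]func() Source

// Names returns the registered source names in sorted order.
func (r Registry) Names() []string {
	names := make([]string, 0, len(r))
	for name := range r {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// Lookup builds the source registered under name, or reports which names exist.
func (r Registry) Lookup(name string) (Source, error) {
	newSource, ok := r[name]
	if !ok {
		return nil, fmt.Errorf("unknown source %q (available: %v)", name, r.Names())
	}
	return newSource(), nil
}

// stubSource is a placeholder that emits nothing.
type stubSource struct{ name string }

func (s stubSource) Name() string { return s.name }

func (s stubSource) Stream(ctx context.Context, sink func(Place) error) error {
	return ctx.Err()
}

func main() {
	reg := Registry{
		"osm": func() Source { return stubSource{name: "osm"} },
	}

	src, err := reg.Lookup("osm")
	if err != nil {
		panic(err)
	}
	fmt.Println(src.Name()) // prints osm

	_, err = reg.Lookup("wheelmap")
	fmt.Println(err) // prints unknown source "wheelmap" (available: [osm])
}
```

With something along these lines, main.go would only need the registry and the name from a flag or subcommand; adding a source becomes one new map entry.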