
feat(video): translate a shared video's captions to the learner's language (v1.5)#647

Open
mircealungu wants to merge 2 commits into master from feat/translated-subtitles


@mircealungu
Member

v1.5 of share-to-video: when a shared YouTube video's captions are in a language other than the one the learner is studying, offer to translate them in place at the learner's CEFR level. Per-segment translation preserves the original time_start/time_end, so the existing interactive reader (tap-to-translate, bookmarks, time-synced highlight) keeps working unchanged. Audio is unaffected; only the reading surface changes. This is the video analogue of the article share-flow's translate-and-adapt option.

Pairs with the feat/translated-subtitles branch of zeeguu/web, which adds the banner, progress indicator, and Original-Translated switcher to VideoPlayer.js. Independently revertible from the upstream v1 PR #635.

Changes

Commit 1 — data model + migration

  • caption_translation_set (UNIQUE(video_id, target_language_id, cefr_level)) holds the async job's status; caption_translation (one row per original Caption) holds the translated NewText. Timings stay on the parent Caption so we don't duplicate them.
  • Mirrors the DailyAudioLesson ↔ DailyAudioLessonSegment pattern already in the codebase.
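
To make the role of the unique key concrete, here is a minimal sketch of the two tables and a find-or-create lookup. It uses an in-memory SQLite database purely so the example runs on its own; the actual migration may use a different SQL dialect. Only the table names, the status values, and the two UNIQUE keys are taken from this PR. Every other column name and type, and the helper `find_or_create_set`, are illustrative assumptions.

```python
import sqlite3

# Hypothetical schema: table names, status values and UNIQUE keys follow the PR text;
# all other columns and types are assumptions made for this sketch.
SCHEMA = """
CREATE TABLE caption_translation_set (
    id INTEGER PRIMARY KEY,
    video_id INTEGER NOT NULL,
    target_language_id INTEGER NOT NULL,
    cefr_level TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending',  -- pending / translating / ready / error
    error_message TEXT,
    UNIQUE (video_id, target_language_id, cefr_level)
);
CREATE TABLE caption_translation (
    id INTEGER PRIMARY KEY,
    set_id INTEGER NOT NULL REFERENCES caption_translation_set(id),
    caption_id INTEGER NOT NULL,  -- timings stay on the parent caption row
    text_id INTEGER NOT NULL,     -- stands in for the NewText foreign key
    UNIQUE (set_id, caption_id)
);
"""


def find_or_create_set(conn, video_id, language_id, cefr_level):
    """Return the id of the set for this target, creating the row only if missing."""
    # The UNIQUE key turns a repeated insert for the same target into a no-op.
    conn.execute(
        "INSERT OR IGNORE INTO caption_translation_set "
        "(video_id, target_language_id, cefr_level) VALUES (?, ?, ?)",
        (video_id, language_id, cefr_level),
    )
    row = conn.execute(
        "SELECT id FROM caption_translation_set "
        "WHERE video_id = ? AND target_language_id = ? AND cefr_level = ?",
        (video_id, language_id, cefr_level),
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
first = find_or_create_set(conn, 42, 7, "B1")
second = find_or_create_set(conn, 42, 7, "B1")  # same target: reuses the existing row
other = find_or_create_set(conn, 42, 7, "A2")   # different CEFR level: new row
```

A second request for the same (video, language, CEFR level) resolves to the existing set instead of starting another translation, which is the deduplication the unique key is there to provide.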

Commit 2 — service + endpoints + /user_video extension

  • core/llm_services/caption_translation_service.translate_set(set_id): batches ~30 captions per Haiku call with structured JSON output (numeric markers). When a batch's parsing or alignment fails, it falls back to per-caption translation, so partial LLM failures degrade gracefully: untranslated lines fall back to the original text in the reader instead of zeroing the whole set. Reuses the existing haiku_client.
  • POST /video/<id>/translate_captions — idempotent find_or_create + run_in_background(translate_set, ...), returns 202 + set dict.
  • GET /video/<id>/translate_captions/status?set_id= — for the reader's polling loop.
  • Extended GET /user_video to accept optional caption_set_id. When the set is ready and belongs to the requested video, Video.video_info substitutes translated text and retokenises in the target language. context_identifier still references the original caption id so bookmark anchoring is stable across track switches. If the set isn't ready, we silently serve the original captions — the reader's status poll drives the eventual refetch (no 4xx during a known-async wait).
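
The batching-with-fallback behaviour of translate_set described above can be sketched roughly as follows. This is not the service's code: `parse_batch`, `translate_captions`, the `call_llm` callable, and the exact JSON shape (an object keyed by "1", "2", ...) are assumptions chosen to illustrate numeric markers and alignment checking. The real service talks to Haiku through haiku_client, which is replaced here by a stub.

```python
import json
from typing import Callable, Dict, List, Optional

BATCH_SIZE = 30  # the PR batches roughly 30 captions per call


def parse_batch(reply: str, expected: int) -> Optional[List[str]]:
    """Return translations in marker order, or None if the reply is unparseable or misaligned."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return None
    wanted = [str(i) for i in range(1, expected + 1)]
    if not isinstance(data, dict) or set(data) != set(wanted):
        return None
    if not all(isinstance(data[key], str) for key in wanted):
        return None
    return [data[key] for key in wanted]


def translate_captions(
    captions: List[str], call_llm: Callable[[Dict[str, str]], str]
) -> List[Optional[str]]:
    """Translate in batches; retry a failed batch one caption at a time.

    A None entry means that caption stayed untranslated, so the reader
    can show the original text for it.
    """
    results: List[Optional[str]] = []
    for start in range(0, len(captions), BATCH_SIZE):
        batch = captions[start:start + BATCH_SIZE]
        numbered = {str(i): text for i, text in enumerate(batch, 1)}
        parsed: Optional[List[Optional[str]]] = parse_batch(call_llm(numbered), len(batch))
        if parsed is None:
            # Batch failed parsing or alignment: fall back to one caption per call.
            parsed = []
            for text in batch:
                single = parse_batch(call_llm({"1": text}), 1)
                parsed.append(single[0] if single else None)
        results.extend(parsed)
    return results


def fake_llm(numbered: Dict[str, str]) -> str:
    """Stub model: misaligned reply for multi-caption batches, garbage for 'bad'."""
    if len(numbered) > 1:
        return json.dumps({"1": "only one line came back"})
    if numbered["1"] == "bad":
        return "not json"
    return json.dumps({key: text.upper() for key, text in numbered.items()})


result = translate_captions(["hello", "bad", "world"], fake_llm)
```

With this stub the batch reply is misaligned, so each caption is retried alone; `result` ends up as `["HELLO", None, "WORLD"]`, with only the one failing caption left untranslated.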

Migration

tools/migrations/26-05-31-a--add_caption_translation.sql — creates both tables with the right FKs and unique keys.

Out of scope (captured for later)

A more speculative idea — generating TTS audio in the learner's language over a muted YouTube embed — was analysed and deferred: see docs/future-work/dubbed-audio-from-shared-video.md for the full feasibility + copyright write-up. Translated subtitles alone avoid the derivative-work question entirely and capture most of the UX win.

Testing

  • Compiles cleanly; models register via from zeeguu.core.model import CaptionTranslationSet, CaptionTranslation.
  • Not yet exercised end-to-end on a real video (no client wired in this PR). The companion web PR + a local run will close the loop.

🤖 Generated with Claude Code

mircealungu and others added 2 commits May 31, 2026 21:08
Tables to hold per-(video, target_language, target_cefr) translated subtitles for a shared
video. Per-segment translation preserves the original Caption.time_start/time_end so the
reader's timing/sync logic is unchanged — only the rendered text is in the learner's language.

- caption_translation_set: the bundle, with status (pending/translating/ready/error) for the
  async job, error_message, and a UNIQUE(video_id, target_language_id, cefr_level) so a
  second request for the same target deduplicates instead of re-translating.
- caption_translation: one row per original Caption inside a set, pointing at a NewText row
  for the translated content. UNIQUE(set_id, caption_id) so retried jobs resume cleanly.

Mirrors the DailyAudioLesson ↔ DailyAudioLessonSegment shape already in the codebase.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…guage

Per the v1.5 plan: when a learner shares a YouTube video whose captions are in a different
language, offer to translate the captions to the learner's language at their CEFR level,
preserving the original per-segment timing so the existing interactive reader (tap-to-translate,
bookmarks, time-synced highlight) keeps working unchanged. Audio is unaffected; only the reading
surface changes.

- New service core/llm_services/caption_translation_service.translate_set(set_id):
  batches ~30 captions per Haiku call with structured JSON output (numeric markers), falls
  back to per-caption translation when a batch's parsing or alignment fails so partial LLM
  failures degrade gracefully instead of zeroing the set. Reuses the existing haiku_client.
- New endpoints in api/endpoints/caption_translation.py:
  - POST /video/<id>/translate_captions  — find_or_create the set, kick off the background
    job via run_in_background, return 202 + set dict. Idempotent.
  - GET  /video/<id>/translate_captions/status?set_id=  — for the reader's polling loop.
- Extended /user_video to accept optional caption_set_id; when the set is ready and belongs
  to the requested video, Video.video_info substitutes translated text + retokenises in the
  target language. context_identifier still references the original caption id so bookmark
  anchoring is stable across track switches. If the set isn't ready, we silently serve the
  original captions — the reader's separate status poll drives the eventual refetch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
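
The /user_video behaviour described in this commit (substitute translated text only when the set is ready and belongs to the video, otherwise serve the originals) can be sketched as below. The `Caption` and `TranslationSet` dataclasses, `captions_for_reader`, and the shape of the returned dictionaries are hypothetical stand-ins, not the actual Video.video_info code, and retokenisation in the target language is left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Caption:  # hypothetical stand-in for the Caption model
    id: int
    time_start: int
    time_end: int
    text: str


@dataclass
class TranslationSet:  # hypothetical stand-in for CaptionTranslationSet
    video_id: int
    status: str
    translations: Dict[int, str]  # caption id -> translated text


def captions_for_reader(
    video_id: int,
    captions: List[Caption],
    translation_set: Optional[TranslationSet] = None,
) -> List[dict]:
    """Build the reader payload, swapping in translated text when it is safe to."""
    use_translation = (
        translation_set is not None
        and translation_set.status == "ready"
        and translation_set.video_id == video_id
    )
    rendered = []
    for caption in captions:
        text = caption.text
        if use_translation:
            # A caption missing from the set keeps its original text.
            text = translation_set.translations.get(caption.id, caption.text)
        rendered.append({
            # Always the original caption id, so bookmarks survive track switches.
            "context_identifier": caption.id,
            "time_start": caption.time_start,
            "time_end": caption.time_end,
            "text": text,
        })
    return rendered


captions = [Caption(1, 0, 2000, "Bonjour"), Caption(2, 2000, 4000, "Merci")]
ready = TranslationSet(video_id=9, status="ready", translations={1: "Hello"})
pending = TranslationSet(video_id=9, status="translating", translations={})

translated = captions_for_reader(9, captions, ready)
original = captions_for_reader(9, captions, pending)
```

In this sketch a set that is not ready, or that belongs to another video, yields the same payload as passing no set at all, so the caller never sees an error while the background job is still running.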
@github-actions

ArchLens detected architectural changes in three views (diff images not reproduced here).
