
ENG-1602: Add PDF extraction API route #937

Open
sid597 wants to merge 6 commits into main from eng-1602-api-to-send-uploaded-pdf-to-chosen-llms-to-parse-from-fr1

Conversation

@sid597
Collaborator

@sid597 sid597 commented Apr 2, 2026

https://www.loom.com/share/b6fc57a6040c41dabf4155f61bcb2df0



Summary by CodeRabbit

Release Notes

New Features

  • Added AI-powered PDF extraction API supporting Anthropic, OpenAI, and Gemini providers
  • Supports custom system prompts and research-focused document analysis with structured JSON output
  • Includes request validation and comprehensive error handling for robust operation

Multi-provider (Anthropic, OpenAI, Gemini) endpoint for extracting
discourse graph nodes from uploaded PDFs.
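A minimal sketch of how such a multi-provider endpoint might validate a request and map the chosen provider to its API key (names and env vars are illustrative assumptions, not the PR's actual code):

```typescript
// Hypothetical sketch: validate an incoming extraction request and pick
// the env var for the chosen provider. Names are assumptions.
type ProviderId = "anthropic" | "openai" | "gemini";

interface ExtractionRequest {
  provider: ProviderId;
  model: string;
  pdfBase64: string;
  researchQuestion?: string;
}

// Assumed env-var names, one per provider.
const API_KEY_ENV: Record<ProviderId, string> = {
  anthropic: "ANTHROPIC_API_KEY",
  openai: "OPENAI_API_KEY",
  gemini: "GEMINI_API_KEY",
};

function validateExtractionRequest(body: unknown): ExtractionRequest {
  const b = body as Partial<ExtractionRequest>;
  if (
    !b ||
    typeof b.pdfBase64 !== "string" ||
    typeof b.model !== "string" ||
    !b.provider ||
    !(b.provider in API_KEY_ENV)
  ) {
    throw new Error("Invalid extraction request");
  }
  return b as ExtractionRequest;
}
```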
@linear

linear bot commented Apr 2, 2026

@supabase

supabase bot commented Apr 2, 2026

This pull request has been ignored for the connected project zytfjzqyijgagqxrzbmz because there are no changes detected in packages/database/supabase directory. You can change this behaviour in Project Integrations Settings ↗︎.



Contributor

@devin-ai-integration devin-ai-integration bot left a comment


✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 5 additional findings.


Widen Message.content to support multimodal content blocks and add
systemPrompt/responseMimeType to Settings. Each provider's
formatRequestBody now handles both text-only chat and PDF extraction,
eliminating the parallel PROVIDERS block in the extraction route.

OpenAI extraction switches from Responses API to Chat Completions
(now supports PDF). Gemini field casing fixed to match REST API docs.
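An illustrative sketch of the widened types this commit describes; exact names in the PR may differ. A plain string stays valid, but content can now also be a list of multimodal blocks (text or a base64-encoded PDF document):

```typescript
// Illustrative sketch of the widened Message type. Block shapes follow the
// common Anthropic/OpenAI conventions; field names here are assumptions.
type TextBlock = { type: "text"; text: string };
type DocumentBlock = {
  type: "document";
  source: { type: "base64"; media_type: "application/pdf"; data: string };
};
type ContentBlock = TextBlock | DocumentBlock;

interface Message {
  role: "system" | "user" | "assistant";
  content: string | ContentBlock[];
}

// Normalizing to blocks lets each provider's formatRequestBody handle
// text-only chat and PDF extraction through one code path.
function toContentBlocks(content: Message["content"]): ContentBlock[] {
  return typeof content === "string" ? [{ type: "text", text: content }] : content;
}
```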

sid597 added 2 commits April 3, 2026 11:09
Add structured output enforcement via each provider's native mechanism:
Anthropic output_config, OpenAI response_format with strict mode,
Gemini responseJsonSchema. Removes prompt-based JSON instructions and
response cleanup parsing since constrained decoding guarantees valid JSON.
Per AGENTS.md: functions with more than 2 parameters use named
parameters via object destructuring.
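The per-provider mechanisms named in this commit might produce payload fragments like the following sketch. The OpenAI and Gemini field names follow their public REST docs; the Anthropic output_config shape is only asserted by the commit message, so it is omitted here:

```typescript
// Sketch of provider-specific structured-output payload fragments.
// Field names follow the public OpenAI and Gemini REST docs; treat the
// function names themselves as illustrative.
type JsonSchema = Record<string, unknown>;

// OpenAI Chat Completions: response_format with a strict json_schema.
function openAiStructuredOutput(schema: JsonSchema) {
  return {
    response_format: {
      type: "json_schema",
      json_schema: { name: "extraction_result", strict: true, schema },
    },
  };
}

// Gemini: responseMimeType plus responseJsonSchema inside generationConfig.
function geminiStructuredOutput(schema: JsonSchema) {
  return {
    generationConfig: {
      responseMimeType: "application/json",
      responseJsonSchema: schema,
    },
  };
}
```

With constrained decoding enforced server-side, the route no longer needs prompt-based "respond in JSON" instructions or cleanup parsing, which is exactly what the commit removes.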

@sid597
Collaborator Author

sid597 commented Apr 3, 2026

@coderabbitai full review

@coderabbitai
Contributor

coderabbitai bot commented Apr 3, 2026

✅ Actions performed

Full review triggered.

Inline disables for response_format, json_schema (OpenAI), and
output_config (Anthropic) — external API contract names.
@coderabbitai
Contributor

coderabbitai bot commented Apr 3, 2026

📝 Walkthrough

Walkthrough

This PR introduces a PDF-to-AI extraction feature with a new POST API endpoint that accepts PDF data, selects LLM providers (Anthropic, OpenAI, Gemini), and extracts structured discourse graph nodes. It includes type definitions, prompt templates, response parsing logic, and updates to provider implementations to support system prompts and JSON output schemas.

Changes

  • AI Extraction Route — apps/website/app/api/ai/extract/route.ts
    New POST handler (203 lines) implementing PDF extraction with provider-specific message formatting, timeout enforcement, comprehensive error handling with distinct status codes (502/500), and response parsing via parseExtractionResponse.
  • Type Definitions & Schemas — apps/website/app/types/extraction.ts, apps/website/app/types/llm.ts
    Added ExtractionRequest, ExtractionResult, ExtractedNode schemas and types; introduced ProviderId union; extended Message.content to support ContentBlock[]; extended Settings with optional systemPrompt and outputSchema.
  • Prompts & Parsing — apps/website/app/prompts/extraction.ts, apps/website/app/utils/ai/parseExtractionResponse.ts
    Added DEFAULT_EXTRACTION_PROMPT with node-type instructions and constraints; buildUserPrompt() helper for optional research question injection; parseExtractionResponse() utility for JSON parsing and schema validation.
  • LLM Provider Updates — apps/website/app/utils/llm/providers.ts
    Modified OpenAI, Gemini, and Anthropic config logic to: prepend system messages when systemPrompt provided; support structured JSON output via outputSchema; transform message content handling for non-string blocks.
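The parsing utility described above might be sketched as follows. The real parseExtractionResponse validates against a Zod schema; this hand-rolled check and the node fields are illustrative stand-ins:

```typescript
// Illustrative sketch: parse the model's text output as JSON and verify the
// minimal shape before returning it. The PR validates with Zod instead.
interface ExtractedNode { type: string; text: string }
interface ExtractionResult { nodes: ExtractedNode[] }

function parseExtractionResponse(raw: string): ExtractionResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    throw new Error("Model output is not valid JSON");
  }
  const candidate = parsed as Partial<ExtractionResult>;
  if (!Array.isArray(candidate.nodes)) {
    throw new Error("Model output is missing a nodes array");
  }
  return candidate as ExtractionResult;
}
```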

Sequence Diagram

sequenceDiagram
    participant Client
    participant APIRoute as API Route<br/>(extract/route.ts)
    participant ProviderConfig as Provider Config<br/>(providers.ts)
    participant LLMProvider as LLM Provider<br/>(Anthropic/OpenAI/Gemini)
    participant ResponseParser as Parser<br/>(parseExtractionResponse)

    Client->>APIRoute: POST with PDF, provider, model
    APIRoute->>APIRoute: Validate request<br/>against schema
    APIRoute->>APIRoute: Read provider API key<br/>from env
    APIRoute->>ProviderConfig: Build messages &<br/>settings with<br/>systemPrompt/outputSchema
    ProviderConfig->>ProviderConfig: Format provider-specific<br/>request payload
    ProviderConfig-->>APIRoute: Return formatted<br/>request config
    APIRoute->>LLMProvider: POST with timeout<br/>signal
    LLMProvider-->>APIRoute: Response text<br/>(or error)
    APIRoute->>ResponseParser: Parse extracted<br/>content JSON
    ResponseParser->>ResponseParser: Validate against<br/>ExtractionResultSchema
    ResponseParser-->>APIRoute: Typed ExtractionResult
    APIRoute-->>Client: { success, data }<br/>or { success, error }

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • Move llm-api endpoints to vercel serverless #102 — Both PRs directly modify provider implementations in providers.ts; the retrieved PR establishes initial provider scaffolding while this PR extends it with system prompts and output schema handling.
🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)
  • Docstring Coverage ✅ Passed — No functions found in the changed files to evaluate docstring coverage; check skipped.
  • Title Check ✅ Passed — The title directly matches the main change: adding a PDF extraction API route at apps/website/app/api/ai/extract/route.ts.
  • Description Check ✅ Passed — Check skipped; CodeRabbit's high-level summary is enabled.




Contributor

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (2)
apps/website/app/utils/llm/providers.ts (1)

20-29: Add eslint-disable comments for API-required snake_case properties.

The response_format and json_schema properties are mandated by the OpenAI API. Consider adding inline eslint-disable comments to silence the warnings and document why.

🔧 Proposed fix
     ...(settings.outputSchema && {
+      // Required by the OpenAI API (snake_case contract names).
+      // eslint-disable-next-line @typescript-eslint/naming-convention
       response_format: {
         type: "json_schema",
+        // eslint-disable-next-line @typescript-eslint/naming-convention
         json_schema: {
           name: "extraction_result",
           strict: true,
           schema: settings.outputSchema,
         },
       },
     }),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/website/app/utils/llm/providers.ts` around lines 20 - 29, The OpenAI API
requires snake_case keys `response_format` and `json_schema` inside the object
created when `settings.outputSchema` is present; add inline eslint-disable
comments (e.g., // eslint-disable-next-line
`@typescript-eslint/naming-convention`) immediately above or inline with those
properties to suppress naming-convention errors and include a short comment
noting “required by OpenAI API” to document why the rule is disabled; update the
object around `settings.outputSchema`, `response_format`, `json_schema`, and the
`name: "extraction_result"` entry accordingly.
apps/website/app/types/extraction.ts (1)

35-55: Consider generating JSON schema from Zod to prevent drift.

The EXTRACTION_RESULT_JSON_SCHEMA manually mirrors ExtractionResultSchema. If these diverge, the LLM's structured output may not match runtime validation. Libraries like zod-to-json-schema can generate one from the other.

#!/bin/bash
# Check if zod-to-json-schema is already a dependency
rg -l "zod-to-json-schema" apps/website/package.json
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/website/app/types/extraction.ts` around lines 35 - 55, The JSON schema
constant EXTRACTION_RESULT_JSON_SCHEMA is manually duplicated and can drift from
the Zod schema (ExtractionResultSchema); replace the hand-written
EXTRACTION_RESULT_JSON_SCHEMA with a generated schema by using
zod-to-json-schema (or equivalent) to convert ExtractionResultSchema into JSON
Schema at build/runtime, update imports so ExtractionResultSchema is the sole
source of truth, and export the generated schema where
EXTRACTION_RESULT_JSON_SCHEMA was used so consumers (LLM structured output
validation) always get the schema derived from the Zod definition.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@apps/website/app/types/extraction.ts`:
- Around line 35-55: The JSON schema constant EXTRACTION_RESULT_JSON_SCHEMA is
manually duplicated and can drift from the Zod schema (ExtractionResultSchema);
replace the hand-written EXTRACTION_RESULT_JSON_SCHEMA with a generated schema
by using zod-to-json-schema (or equivalent) to convert ExtractionResultSchema
into JSON Schema at build/runtime, update imports so ExtractionResultSchema is
the sole source of truth, and export the generated schema where
EXTRACTION_RESULT_JSON_SCHEMA was used so consumers (LLM structured output
validation) always get the schema derived from the Zod definition.

In `@apps/website/app/utils/llm/providers.ts`:
- Around line 20-29: The OpenAI API requires snake_case keys `response_format`
and `json_schema` inside the object created when `settings.outputSchema` is
present; add inline eslint-disable comments (e.g., // eslint-disable-next-line
`@typescript-eslint/naming-convention`) immediately above or inline with those
properties to suppress naming-convention errors and include a short comment
noting “required by OpenAI API” to document why the rule is disabled; update the
object around `settings.outputSchema`, `response_format`, `json_schema`, and the
`name: "extraction_result"` entry accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 0eb45979-82e8-48e9-a803-029f780a6309

📥 Commits

Reviewing files that changed from the base of the PR and between 700a7ab and f7c7871.

📒 Files selected for processing (6)
  • apps/website/app/api/ai/extract/route.ts
  • apps/website/app/prompts/extraction.ts
  • apps/website/app/types/extraction.ts
  • apps/website/app/types/llm.ts
  • apps/website/app/utils/ai/parseExtractionResponse.ts
  • apps/website/app/utils/llm/providers.ts

Gemini parts use { text } not { type: "text", text }. The shared
textBlock was using the Anthropic/OpenAI format which Gemini rejects.
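The fix above can be sketched as a provider-aware text block helper (the helper name is taken from the commit message; the branch logic is an assumption about how the fix might look):

```typescript
// Gemini REST parts are { text }; Anthropic/OpenAI content blocks carry an
// explicit type tag. A shared helper therefore has to branch on the provider.
type ProviderId = "anthropic" | "openai" | "gemini";

function textBlock(provider: ProviderId, text: string): Record<string, string> {
  return provider === "gemini" ? { text } : { type: "text", text };
}
```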
@sid597 sid597 changed the title from "fENG-1602: Add PDF extraction API route" to "ENG-1602: Add PDF extraction API route" on Apr 3, 2026
@sid597 sid597 requested a review from mdroidian April 3, 2026 07:13
