Dual-Stream Architecture

To reconcile messy, unstructured input with safe, data-dense visualization, Corvus employs a Dual-Stream Ingestion Engine. This design separates narrative reasoning (which allows for some "fuzziness" and interpretation) from critical quantitative metrics (which must be exact and hallucination-free).

The Resilience Cascade

Document ingestion runs through a Resilience Cascade that prioritizes, in order:

  1. Layout-aware parsing for clinical notes to preserve spatial context.
  2. Scholarly structure extraction for academic PDFs to capture section hierarchies.
  3. OCR fallback, used only for flattened assets.

Parsing in this order preserves semantic boundaries (tables, headers) that plain text extraction typically loses.
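
One way to picture the cascade is as an ordered list of parsers where the first stage that yields usable text wins. The sketch below is a minimal illustration under that assumption and is not the Corvus implementation: `run_cascade`, `ParsedDocument`, and the three stand-in parsers (with their `NOTE:` and `PAPER:` byte prefixes) are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# A parser returns extracted text, or None when it cannot handle the input.
Parser = Callable[[bytes], Optional[str]]


@dataclass
class ParsedDocument:
    text: str
    parser: str  # name of the stage that produced the text


def run_cascade(raw: bytes, stages: list[tuple[str, Parser]]) -> ParsedDocument:
    """Try each stage in order and return the first non-empty result."""
    errors = []
    for name, parse in stages:
        try:
            text = parse(raw)
        except Exception as exc:  # a failing stage must not stop the cascade
            errors.append(f"{name}: {exc}")
            continue
        if text and text.strip():
            return ParsedDocument(text=text, parser=name)
        errors.append(f"{name}: no text extracted")
    raise ValueError("all stages failed: " + "; ".join(errors))


# Hypothetical stand-ins for real parsers, keyed on a toy byte prefix.
def layout_aware_parse(raw: bytes) -> Optional[str]:
    return raw.decode("utf-8") if raw.startswith(b"NOTE:") else None


def scholarly_parse(raw: bytes) -> Optional[str]:
    return raw.decode("utf-8") if raw.startswith(b"PAPER:") else None


def ocr_fallback(raw: bytes) -> Optional[str]:
    # Last resort: accept whatever can be recovered from the bytes.
    return raw.decode("utf-8", errors="ignore")


STAGES = [
    ("layout", layout_aware_parse),
    ("scholarly", scholarly_parse),
    ("ocr", ocr_fallback),
]
```

Recording which stage produced the text lets downstream consumers treat OCR-derived content with more caution than layout-aware output.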

Stream A: Narrative Synthesis

Goal: Transform unstructured narrative text into coherent summaries and plan suggestions.

  • Mechanism: Uses structured clinical templates to produce consistent drafts.
  • RAG Integration: When a query needs evidence, the system uses hybrid retrieval (keyword + semantic) to find high-quality sources and return citations.
  • Agents: Driven by a clinical reasoning role and a research role.
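
The hybrid retrieval step can be sketched as a fusion of two ranked lists, one from a keyword retriever and one from a semantic retriever. The source does not say how Corvus merges them; the example below assumes reciprocal rank fusion, a common technique for this, and the function name, the constant `k=60`, and the document IDs are all illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) per list it appears in, so
    documents ranked well by both retrievers rise to the top.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; break ties on ID for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs of the two retrievers for the same query.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
semantic_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
```

Rank-based fusion avoids having to calibrate keyword scores against embedding similarities, which live on different scales.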

Stream B: Structured Clinical Signals

Goal: Extract high-yield quantitative markers (labs, vitals) with a 0% hallucination rate.

  • Mechanism: Deterministic extractors and schema-constrained pipelines.
  • Type Safety: Enforces strict schema compliance so only validated key-value pairs are accepted (e.g., numeric labs/vitals), rejecting malformed outputs.
  • Outputs:
    • Normalized labs/vitals.
    • Sentinel flags (e.g., rising lactate).
    • Checklist-ready plan items.
    • Conflict flags (contradictions between narrative and labs).
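
The type-safety and sentinel-flag ideas above can be sketched with a strict validator that rejects anything off-schema instead of coercing it. Everything named here is an assumption made for illustration: `LAB_SCHEMA`, its ranges and units, `LabValue`, `validate`, and the `min_delta` threshold in `is_rising` are hypothetical, and none of the numbers are clinical guidance.

```python
import math
from dataclasses import dataclass

# Hypothetical schema: allowed markers with their unit and a plausibility range.
LAB_SCHEMA = {
    "lactate": {"unit": "mmol/L", "min": 0.0, "max": 30.0},
    "heart_rate": {"unit": "bpm", "min": 0.0, "max": 300.0},
}


@dataclass(frozen=True)
class LabValue:
    name: str
    value: float
    unit: str


def validate(raw: dict) -> LabValue:
    """Return a LabValue or raise ValueError; never coerce or guess."""
    name = raw.get("name")
    if not isinstance(name, str) or name not in LAB_SCHEMA:
        raise ValueError(f"unknown marker: {name!r}")
    spec = LAB_SCHEMA[name]
    value = raw.get("value")
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError(f"{name}: value must be numeric, got {value!r}")
    if not math.isfinite(value) or not spec["min"] <= value <= spec["max"]:
        raise ValueError(f"{name}: value {value!r} outside plausible range")
    if raw.get("unit") != spec["unit"]:
        raise ValueError(f"{name}: expected unit {spec['unit']!r}")
    return LabValue(name=name, value=float(value), unit=spec["unit"])


def is_rising(series: list[LabValue], min_delta: float = 0.5) -> bool:
    """Sentinel flag: the latest value exceeds the previous one by min_delta.

    The 0.5 default is an arbitrary illustrative threshold.
    """
    return len(series) >= 2 and series[-1].value - series[-2].value >= min_delta
```

Rejecting a string such as "2.1" outright, rather than parsing it, keeps the extractor deterministic: a value either matches the schema exactly or never reaches the output.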

Visualizing the Dual Streams

This dual approach allows Corvus to provide the "best of both worlds": the flexible reasoning of LLMs for text, and the rigid safety of traditional software for numbers.