MILA — Neonatal LLM Assistant

An LLM assistant that helps NICU staff write clear parent-facing updates quickly, backed by retrieval over hospital policies and protocols.

LLM, RAG, Healthcare, LangChain/OpenAI

Key metrics

  • p50 latency: 410 ms (end-to-end: retrieval + generation)
  • p95 latency: 820 ms (load test at 3 rps, 15k docs indexed)
  • draft time per update: ↓42% (6.8 → 3.9 min median, n=147 updates)
  • first-response time: ↓38% (triage to draft start, 4-week window)
  • retrieval accuracy@1: 88% (human-graded top-1 policy match, n=200)
  • retrieval accuracy@3: 95% (any of the top 3 contained the correct policy)
  • policy citation coverage: 97% (messages with ≥1 inline citation)
  • hallucination rate (sent): 0.0% (0/312 parent messages, behind an approval gate)
  • review flags: 0.6% (2/312 drafts flagged pre-send; both corrected)
  • readability: Grade 10.2 → 7.8 (Flesch-Kincaid, n=100 messages)
  • adoption (week 4): 82% weekly / 65% daily (clinician active rates)
  • error rate: 0.9% (auto-retried; no user-visible failures)
  • uptime (30-day): 99.93% (monitored via health checks)
  • avg cost / msg: $0.018 (LLM + vector + infra at 3.2k msgs/mo)
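
Several of the figures above can be cross-checked with simple arithmetic. The sketch below is only a sanity check on the published numbers, not the project's reporting code; the helper names are illustrative.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fractional drop from `before` to `after` (0.25 means 25% lower)."""
    return (before - after) / before


def downtime_minutes(uptime: float, days: int) -> float:
    """Minutes of downtime implied by an uptime fraction over `days` days."""
    return (1.0 - uptime) * days * 24 * 60


# Inputs are taken from the metrics listed above.
draft_time_drop = relative_reduction(6.8, 3.9)  # median minutes per update
flag_rate = 2 / 312                             # drafts flagged pre-send
monthly_cost = 0.018 * 3200                     # USD at 3.2k msgs/mo
downtime = downtime_minutes(0.9993, 30)         # minutes over 30 days

print(f"draft time reduction: {draft_time_drop:.1%}")
print(f"review flag rate:     {flag_rate:.1%}")
print(f"monthly cost:         ${monthly_cost:.2f}")
print(f"30-day downtime:      {downtime:.1f} min")
```

The rounded medians give a reduction of roughly 42.6%, in line with the reported 42% once rounding of the inputs is allowed for. The other figures imply about $57.60 per month in running cost and about 30 minutes of downtime over the 30-day window.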

Methodology: 4-week pre/post cohort; mixed human grading + automated logs; details available on request.
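
For readers unfamiliar with accuracy@k, here is a minimal sketch of how it can be computed from human-graded labels. It assumes each query has one correct policy ID assigned by a grader and a ranked list of retrieved policy IDs; the IDs and sample data are hypothetical and are not the project's evaluation set.

```python
from typing import Sequence


def accuracy_at_k(
    ranked_results: Sequence[Sequence[str]],
    correct_ids: Sequence[str],
    k: int,
) -> float:
    """Share of queries whose correct policy ID appears in the top-k results."""
    if len(ranked_results) != len(correct_ids):
        raise ValueError("one graded answer is needed per query")
    if not ranked_results:
        raise ValueError("no graded queries")
    hits = sum(
        1
        for ranked, correct in zip(ranked_results, correct_ids)
        if correct in ranked[:k]
    )
    return hits / len(ranked_results)


# Hypothetical graded sample: four queries, three retrieved policies each.
ranked = [
    ["POL-12", "POL-07", "POL-31"],
    ["POL-07", "POL-12", "POL-02"],
    ["POL-44", "POL-02", "POL-19"],
    ["POL-05", "POL-31", "POL-08"],
]
correct = ["POL-12", "POL-12", "POL-19", "POL-99"]

acc1 = accuracy_at_k(ranked, correct, k=1)
acc3 = accuracy_at_k(ranked, correct, k=3)
print(f"accuracy@1 = {acc1:.0%}, accuracy@3 = {acc3:.0%}")
```

Accuracy@3 can never be lower than accuracy@1 on the same graded set, which matches the 88% and 95% figures reported above.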

Problem

Clinicians needed faster, clearer parent-facing updates aligned with internal protocols—without copy/paste or policy-hunting.

Approach

Results

Stack

Next.js, TypeScript, Tailwind, Node, Python, LangChain/OpenAI, Pinecone/FAISS, Postgres, Auth (RBAC)

Responsibilities

  • Designed retrieval schema & chunking strategy
  • Implemented server routes + guardrails
  • Wrote evaluation prompts & spot checks
  • Set up observability & basic error budgets
Osvaldo Restrepo — AI Engineer