healthcare, ai, web

Medical Form Designer

Schema-driven, AI-native form authoring system for a clinical EMR. Non-developers upload a PDF of any medical form; an LLM-vision pipeline emits a structured, editable schema that renders pixel-stable on A4 print. Replaces dozens of hand-coded form components and moves form creation out of the engineering backlog. A visual mm-native canvas lets product and clinical staff author and edit forms directly, with the AI handling first-pass extraction and natural-language region edits.

2026 | 5 min read
Figure: AI Form Designer canvas showing a medical form authored on an A4 grid with the AI extraction panel and property editor

The Challenge

A clinical EMR accumulates dozens of paper-derived forms — consent, anamnesis, referral, regulatory filings, lab orders. In the legacy system each one is a hand-coded form component, so every new or changed form goes through the engineering queue: a developer writes code, it gets reviewed, built, and deployed. Non-developers can't author forms, competitors already ship visual designers, and the hand-coded components are brittle and hard to maintain.

Medical forms are printed and filed, so output must be 1:1 on A4 with legally-mandated layouts that don't tolerate drift. Authoring has to be accessible to product and clinical staff, not just engineers. A dedicated operator was re-drawing forms from PDF by hand at roughly 8-10 forms per day, with complex forms taking about an hour each. The goal: let non-developers author forms by uploading a PDF or sketching on a visual canvas, producing a schema that renders everywhere in the EMR.

Key Constraints

  • Print fidelity: 1:1 A4 output, DPI-independent, legally-mandated layouts for regulatory forms
  • Authoring by non-developers: no JSON editing, and a canvas that is precise without being intimidating
  • LLM vision is semantically strong but spatially weak on dense tabular layouts
  • Dozens of legacy hand-coded form components targeted for replacement
  • One schema must drive view / input / print contexts everywhere in the EMR
  • Library-first design for cross-product reuse (EMR today, ERP next)

Our Approach

We built a schema-driven, library-first authoring package where a single JSON schema is the source of truth for a visual editor, a passive renderer, and the AI layer. A non-developer drops a PDF; an LLM-vision pipeline emits structured tool calls against a typed block catalog, and each call is validated before composition. A 2D bin-packing composer assembles the blocks and fits them to the page. Output is mm-native for pixel-stable A4 printing, and forms persist as versioned schemas in the database rather than as code.
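The validation step described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the block names (`heading`, `text_field`), the `FieldSpec` shape, and `validateToolCall` are all hypothetical, and a small hand-rolled checker stands in for the per-block JSON Schema validation the case study mentions.

```typescript
// Hypothetical block catalog: names and fields are illustrative only.
type FieldSpec = { type: "string" | "number"; required: boolean };
type BlockSpec = Record<string, FieldSpec>;

const catalog: Record<string, BlockSpec> = {
  heading: { text: { type: "string", required: true } },
  text_field: {
    label: { type: "string", required: true },
    widthRatio: { type: "number", required: false },
  },
};

interface ToolCall {
  name: string;
  input: Record<string, unknown>;
}

const has = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

// Returns a list of problems; an empty list means the call may be composed.
function validateToolCall(call: ToolCall): string[] {
  if (!has(catalog, call.name)) return [`unknown block "${call.name}"`];
  const spec = catalog[call.name];
  const errors: string[] = [];

  // Every declared field must be present if required, and correctly typed.
  for (const key of Object.keys(spec)) {
    const field = spec[key];
    const value = call.input[key];
    if (value === undefined) {
      if (field.required) errors.push(`${call.name}.${key} is required`);
    } else if (typeof value !== field.type) {
      errors.push(`${call.name}.${key} must be a ${field.type}`);
    }
  }

  // Reject fields the catalog does not declare, so the model cannot invent structure.
  for (const key of Object.keys(call.input)) {
    if (!has(spec, key)) errors.push(`${call.name}.${key} is not allowed`);
  }
  return errors;
}
```

In a setup like this, a call that returns any errors would be dropped or sent back to the model before it reaches the composer, which is one way to read "validated before composition".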

Key Technical Decisions

  • Tool-use over free-form JSON - the model calls typed block functions instead of emitting a full schema, so it can't invent invalid structures and every call is validated before anything reaches the renderer
  • AI emits content + relative ratios; code resolves absolute millimeters - sidesteps the LLM's spatial-precision ceiling, so a ratio error degrades to sub-millimeter padding instead of a half-page overflow
  • Two extraction modes - Mirror reproduces a regulatory layout 1:1, Adapted re-authors a form into the design system; same pipeline, different prompt and fit target
  • AI for structure, OCR-overlay for pixel-perfect regulatory reproduction - picking the tool by goal rather than hype
  • Deleted a multi-layer post-fix pipeline (~3k LoC) once the layering itself proved to be the problem - replaced with typed block functions, and output quality went up the same day
  • mm-native layout, not px - browser printing is mm-stable, giving DPI-independent 1:1 A4 without media-query forking
  • DOM, not Canvas - real inputs keep native accessibility, keyboard navigation, input methods, and print for free
  • Schema versioned in the database, not the codebase - adding a form is a data write, not a deploy, enabling rollback and per-clinic variation
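The "relative ratios in, absolute millimeters out" decision might look something like the sketch below. It assumes a single row of cells, a 10 mm page margin, and a 1 mm grid; `RatioCell`, `resolveRow`, and the margin value are illustrative names and numbers, not taken from the project. The only hard fact used is the A4 width of 210 mm.

```typescript
// A4 is 210 mm wide (ISO 216). The margin is an assumed value for illustration.
const A4_WIDTH_MM = 210;
const MARGIN_MM = 10;

interface RatioCell {
  id: string;
  widthRatio: number; // relative share as emitted by the model; may not sum to 1
}

interface ResolvedCell {
  id: string;
  xMm: number;
  widthMm: number;
}

function resolveRow(cells: RatioCell[], gridMm = 1): ResolvedCell[] {
  const usableMm = A4_WIDTH_MM - 2 * MARGIN_MM;
  const total = cells.reduce((sum, c) => sum + c.widthRatio, 0);
  if (cells.length === 0 || total <= 0) return [];

  let x = MARGIN_MM;
  return cells.map((cell, i) => {
    let widthMm: number;
    if (i === cells.length - 1) {
      // The last cell absorbs the rounding remainder so the row ends at the margin.
      widthMm = MARGIN_MM + usableMm - x;
    } else {
      // Normalize by the total so sloppy ratios still fill exactly one row,
      // then snap to the grid.
      const rawMm = (cell.widthRatio / total) * usableMm;
      widthMm = Math.round(rawMm / gridMm) * gridMm;
    }
    const resolved: ResolvedCell = { id: cell.id, xMm: x, widthMm };
    x += widthMm;
    return resolved;
  });
}
```

With ratios of 0.5, 0.3, and 0.3 (which wrongly sum to 1.1), this sketch still yields widths of 86, 52, and 52 mm, exactly filling the 190 mm usable width. The model's error shifts cell boundaries slightly instead of pushing content off the page.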

Timeline: ~3 months from POC to MVP, in this order: a visual builder POC, a pivot to an mm-grid canvas with direct-DOM drag, multi-page support and undo/redo, then the AI extraction layer, the two extraction modes, and comment-driven region patches.

Implementation

Phase 1: Visual Designer Foundation

The foundation is an mm-native grid canvas on A4 with 1 mm precision, drag/resize that bypasses React mid-gesture for smooth interaction, multi-select with group operations, undo/redo with a capped history, and a table primitive with browser-measured row heights. The JSON schema is the single source of truth feeding the editor, the renderer, and the AI tool-spec.
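As an illustration of undo/redo with a capped history, here is a minimal sketch. The `History` class and its default limit of 100 are assumptions for the example; the case study does not state the actual cap or data structure.

```typescript
// Snapshot-based history with a bounded undo stack. Names are illustrative.
class History<T> {
  private past: T[] = [];
  private future: T[] = [];
  private present: T;
  private readonly limit: number;

  constructor(initial: T, limit = 100) {
    this.present = initial;
    this.limit = limit;
  }

  get current(): T {
    return this.present;
  }

  // Record a new state, e.g. once per completed drag gesture.
  commit(next: T): void {
    this.past.push(this.present);
    if (this.past.length > this.limit) this.past.shift(); // drop the oldest entry
    this.present = next;
    this.future = []; // a fresh edit invalidates the redo stack
  }

  undo(): boolean {
    if (this.past.length === 0) return false;
    this.future.push(this.present);
    this.present = this.past.pop() as T;
    return true;
  }

  redo(): boolean {
    if (this.future.length === 0) return false;
    this.past.push(this.present);
    this.present = this.future.pop() as T;
    return true;
  }
}
```

If drag and resize bypass React mid-gesture, one plausible pairing is to call `commit` only when the gesture ends, so that a whole drag costs a single history entry.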

~4-5 weeks

Phase 2: AI Extraction Layer

A typed block catalog is exposed to LLM vision as a tool list; the model emits structured tool calls that are validated per-block against JSON Schema before composition. A 2D bin-packing composer assembles the blocks, and a slack-distribution post-process fits the result to the page by distributing the gap between growable blocks and inside table cells.
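The slack-distribution step can be pictured with a simplified sketch: blocks stacked vertically on one page, with leftover height shared equally among the blocks marked growable. The `StackedBlock` shape, the equal-share policy, and the function name are assumptions; the real post-process also distributes space inside table cells, which is not shown here.

```typescript
interface StackedBlock {
  id: string;
  heightMm: number;
  growable: boolean; // hypothetical flag: may this block absorb extra height?
}

// Returns new blocks whose heights fill the page when there is slack to share.
function distributeSlack(blocks: StackedBlock[], pageHeightMm: number): StackedBlock[] {
  const usedMm = blocks.reduce((sum, b) => sum + b.heightMm, 0);
  const slackMm = pageHeightMm - usedMm;
  const growableCount = blocks.filter((b) => b.growable).length;

  // No gap to fill, or nothing allowed to grow: return copies unchanged.
  if (slackMm <= 0 || growableCount === 0) return blocks.map((b) => ({ ...b }));

  const shareMm = slackMm / growableCount;
  return blocks.map((b) =>
    b.growable ? { ...b, heightMm: b.heightMm + shareMm } : { ...b }
  );
}
```

For a 277 mm usable page height (A4's 297 mm minus an assumed 10 mm margin top and bottom) and blocks of 20 mm (fixed), 60 mm, and 100 mm (both growable), the 97 mm gap is split into two shares of 48.5 mm, so the stack ends exactly at the bottom margin.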

~4-6 weeks

Phase 3: Modes & Comment-Driven Patches

Mirror and Adapted extraction modes are driven by the system prompt and fit target. Comment-driven patches add a third editing path beyond mouse-edit and PDF-extract: the user selects a region, writes a natural-language instruction, and the AI re-emits only that region while the rest of the schema stays byte-identical. The whole patch can be rolled back as a single undo step.
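A region patch of this kind could be applied along the lines of the sketch below. The flat `blocks` array, the `Block` and `FormSchema` shapes, and `applyRegionPatch` are hypothetical simplifications; the real schema is likely richer. The point is that untouched blocks are reused as-is, so they serialize identically before and after.

```typescript
interface Block {
  id: string;
  type: string;
  props: Record<string, unknown>;
}

interface FormSchema {
  version: number;
  blocks: Block[];
}

// Swap the selected blocks for the AI's re-emitted ones; leave everything else alone.
function applyRegionPatch(
  schema: FormSchema,
  selectedIds: string[],
  replacement: Block[]
): FormSchema {
  const selected = new Set(selectedIds);
  const firstIndex = schema.blocks.findIndex((b) => selected.has(b.id));
  if (firstIndex === -1) return schema; // nothing selected, nothing changes

  const blocks: Block[] = [];
  schema.blocks.forEach((block, i) => {
    // Insert the replacement where the selection started.
    if (i === firstIndex) blocks.push(...replacement);
    // Untouched blocks are carried over by reference, not rebuilt.
    if (!selected.has(block.id)) blocks.push(block);
  });
  return { ...schema, blocks };
}
```

Because the function returns a new schema object and never mutates the old one, the previous schema can be kept as one snapshot, which fits the single-undo-step behavior described above.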

~3-4 weeks

Technology Stack

  • React 18
  • TypeScript
  • mm-native grid canvas
  • Zero runtime deps
  • Claude (vision)
  • GPT-5 (vision)
  • Tool-use / structured output
  • Per-block JSON Schema validation
  • Schema-driven renderer
  • 2D bin-packing composer
  • PostgreSQL JSONB (versioned schemas)

Results & Impact

4-6x Throughput Lift

Form authoring from ~1 hour (manual PDF re-draw) to 10-15 minutes (AI extract + polish) on complex forms

Same day New-Form Lead Time

Down from anywhere between several days and a week (developer hand-codes, reviews, deploys) to same-day authoring with no deploy

~70 Components Replaced

Legacy hand-coded form components targeted for replacement by a single schema-driven renderer

~75% First-Pass Fidelity

Visual fidelity of AI extraction on dense regulatory forms, reviewed and polished from there

8 blocks Tool Catalog

Typed block functions form the AI's structured-output surface, each with a tight JSON Schema input

~$0.04 Cost per Extract

Per-form extraction cost on a fast vision model, low enough to iterate freely

  • Productivity: 4-6x faster form authoring, with the role shifting from manual transcription to schema review
  • Process: forms move out of the engineering backlog and into the product team's own hands
  • Maintainability: forms become versioned data, not code — rollback and per-clinic variation are a write, not a deploy
  • Reuse: library-first design enables cross-product reuse (EMR to ERP) with one engine and one schema
  • Accessibility: non-developers author and edit forms without touching code
  • Print fidelity: DPI-independent 1:1 A4 output for legally-mandated layouts

What We Learned

  • The third fix in the same layer is an architectural signal, not a bug - deleting the post-fix pipeline improved output the same day it died.
  • Tool-use beats free-form JSON for structured output - typed functions give a validation point and a smaller, sharper concept surface than a 30-type schema in the prompt.
  • AI is strong at structure, weak at exact position - emit relative ratios and resolve them to absolute millimeters in code.
  • When an LLM follows a prompt rule literally and produces broken output, the schema is missing a primitive - extend the data shape instead of piling on guard-rails.
  • Pick AI vs OCR-overlay by goal - editable, data-bound forms want AI extraction; pixel-perfect regulatory reproduction wants OCR-overlay.
  • Library-first pays off later - a schema-only contract with zero host-specific code lets the same engine lift productivity across new products without re-implementation.
