The Challenge
A US telemedicine platform needed to handle 100GB+ of medical imaging data daily across distributed clinics. Their MVP was dying under real-world load - async FastAPI was being misused for heavy I/O operations, causing the server to choke on file uploads and concurrent requests.
This was a B2B SaaS platform serving telemedicine providers. Some customers had single small clinics with a few devices, while others operated distributed networks with dozens of imaging machines running 24/7. The unpredictable load and need for HIPAA compliance made scaling the existing setup impossible.
Key Constraints
- Must handle 100GB+ daily medical imaging data with zero data loss
- Support both single-clinic and multi-site distributed deployments
- HIPAA compliance required - full audit trails and data security
- Must work on varied infrastructure (some clinics had weak/old hardware)
- Required Python for licensing reasons (couldn't switch languages)
- Need extensibility for future AI processing nodes
Our Approach
Rebuilt from scratch using microservices architecture with FastStream and Redis Streams. Separated concerns into independent nodes: SCP receiver, SCU sender, S3 uploader, routing logic. Each node handles one responsibility and communicates via Redis Streams for auditability.
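A rough sketch of what one such node looks like, assuming FastStream's Redis broker with stream subscriptions; the stream, group, and consumer names here are illustrative, not the production ones:

```python
# Minimal sketch of a routing node built on FastStream + Redis Streams.
from faststream import FastStream
from faststream.redis import RedisBroker, StreamSub

broker = RedisBroker("redis://localhost:6379")
app = FastStream(broker)

# Consume new-file events as part of a consumer group, so each event is handled
# once per node type while the message itself stays in the stream for auditing.
@broker.subscriber(stream=StreamSub("dicom.received", group="router", consumer="router-1"))
async def route_study(message: dict) -> None:
    # Decide where the study should go next (S3 upload vs. SCU forward).
    target_stream = "dicom.upload" if message.get("archive") else "dicom.forward"
    await broker.publish(message, stream=target_stream)
```

Each node is a small app of this shape running in its own container; adding a capability means adding another subscriber on the relevant stream.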
Key Technical Decisions
- FastStream over Celery - more natural with FastAPI's async patterns, easier to scale
- Redis Streams over basic queues - message history crucial for healthcare audit trails
- HAProxy load balancing - proven technology for distributing DICOM connections
- Docker deployment - easier than Kubernetes for varied clinic infrastructure
- Coordination class in Redis - prevents race conditions across distributed workers (sketched in the example after this list)
- MinIO + S3 for storage - flexible cloud/on-premise options for different clinic needs
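The coordination decision is easiest to show as code. A minimal sketch of the idea, assuming the redis-py client; the key naming, TTL, and study_lock helper are hypothetical, not the production class:

```python
# Per-study lock in Redis (SET NX + TTL) so only one worker processes a given
# DICOM study at a time.
import uuid
from contextlib import contextmanager

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

@contextmanager
def study_lock(study_uid: str, ttl_seconds: int = 120):
    key = f"lock:study:{study_uid}"
    token = str(uuid.uuid4())
    # NX means the SET only succeeds if no other worker holds the key.
    if not r.set(key, token, nx=True, ex=ttl_seconds):
        raise RuntimeError(f"Study {study_uid} is already being processed")
    try:
        yield
    finally:
        # Release only if we still own the lock (it may have expired).
        # Not atomic; a production version would use a Lua script or redis-py's Lock.
        if r.get(key) == token:
            r.delete(key)

# Usage inside a worker:
# with study_lock(study_uid):
#     process_study(study_uid)
```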
Timeline: 4 months (400-500 hours) from architecture design to production deployment
Implementation
Architecture Design & Planning (2-3 weeks)
Multiple weekly calls with client to understand requirements. Proposed microservices approach with node-based extensibility. Got carte blanche on technical decisions.
Core Routing Engine (4-6 weeks)
Built foundational nodes: DICOM SCP receiver, file storage, basic routing logic. Started with synthetic test data, then 1000+ real DICOM files from various sources.
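For illustration, an SCP receiver node of this kind can be sketched with pynetdicom and pydicom (assumed here; the AE title, port, and storage path are placeholders):

```python
# Minimal DICOM C-STORE SCP: accept incoming instances and write them to disk.
from pathlib import Path

from pynetdicom import AE, evt, AllStoragePresentationContexts

STORAGE_DIR = Path("/data/incoming")

def handle_store(event):
    """Persist each received DICOM instance and acknowledge it."""
    ds = event.dataset
    ds.file_meta = event.file_meta
    STORAGE_DIR.mkdir(parents=True, exist_ok=True)
    ds.save_as(STORAGE_DIR / f"{ds.SOPInstanceUID}.dcm", write_like_original=False)
    return 0x0000  # Success status

ae = AE(ae_title="ROUTER_SCP")
ae.supported_contexts = AllStoragePresentationContexts
ae.start_server(("0.0.0.0", 11112), block=True,
                evt_handlers=[(evt.EVT_C_STORE, handle_store)])
```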
Microservices Expansion (6-8 weeks)
Added SCU sender, S3 uploader, metadata extraction, coordination logic. Implemented race condition handling and worker synchronization via Redis.
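The S3 uploader node follows the same pattern. A simplified sketch, assuming boto3 against a MinIO-compatible endpoint; the endpoint, credentials, bucket layout, and upload_study_file helper are illustrative:

```python
# Upload received DICOM files to object storage under a per-study prefix.
from pathlib import Path

import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://minio:9000",        # point at MinIO on-premise, or omit for AWS S3
    aws_access_key_id="MINIO_ACCESS_KEY",
    aws_secret_access_key="MINIO_SECRET_KEY",
)

def upload_study_file(path: Path, study_uid: str, bucket: str = "dicom-archive") -> str:
    """Upload one DICOM file and return its object key."""
    key = f"{study_uid}/{path.name}"
    s3.upload_file(str(path), bucket, key)
    return key
```

Swapping the endpoint_url between MinIO and AWS is what keeps the cloud/on-premise choice open per clinic.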
UI Development (4-5 weeks)
Started as 'basic UI', evolved into full dashboard with Grafana-style graphs, logs, configuration screens, animations, progress bars, file management. Built with Ant Design.
Load Testing & Optimization (2-3 weeks)
Wrote scripts to blast DICOM files in 20 parallel threads. System stable at 5 concurrent requests per SCP node, tested up to 20 parallel streams. Proved HAProxy scaling works.
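The load-test scripts were roughly of this shape (a sketch, assuming pynetdicom for the C-STORE requests; host, port, thread count, and the test_data directory are placeholders):

```python
# Blast a directory of DICOM files at the SCP in parallel threads.
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

from pydicom import dcmread
from pynetdicom import AE, StoragePresentationContexts

HOST, PORT, THREADS = "127.0.0.1", 11112, 20
FILES = list(Path("test_data").glob("*.dcm"))

def send_one(path: Path) -> int:
    """Open an association, send one file via C-STORE, return the status code."""
    ae = AE(ae_title="LOADTEST_SCU")
    ae.requested_contexts = StoragePresentationContexts
    assoc = ae.associate(HOST, PORT)
    if not assoc.is_established:
        return -1
    status = assoc.send_c_store(dcmread(path))
    assoc.release()
    return status.Status if status else -1

with ThreadPoolExecutor(max_workers=THREADS) as pool:
    results = list(pool.map(send_one, FILES))

print(f"Sent {len(results)} files, {results.count(0)} succeeded")
```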
Deployment & Production (3-4 weeks)
Fought Windows deployment issues, convinced client to use Ubuntu. Deployed to first clinic successfully. Now expanding to multi-tenant SaaS model.
System Architecture

Built 7 independent microservices communicating via Redis Streams. Each node (SCP receiver, SCU sender, S3 uploader, router, metadata extractor) handles a specific responsibility. A coordination class in Redis prevents race conditions when multiple workers process the same DICOM study. HAProxy distributes incoming connections across SCP nodes for horizontal scaling. Docker deployment with careful resource management for clinics with weak hardware. Full monitoring stack with detailed logging for debugging the distributed system. Tested with real hospital data, including edge cases and legacy DICOM formats.
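As an example of the metadata extractor's job, a sketch using pydicom (assumed here; the exact tag set is illustrative):

```python
# Pull routing-relevant DICOM tags without loading pixel data.
from pathlib import Path

from pydicom import dcmread

def extract_metadata(path: Path) -> dict:
    ds = dcmread(path, stop_before_pixels=True)
    return {
        "study_uid": str(ds.StudyInstanceUID),
        "series_uid": str(ds.SeriesInstanceUID),
        "sop_instance_uid": str(ds.SOPInstanceUID),
        "modality": str(ds.get("Modality", "")),
        "patient_id": str(ds.get("PatientID", "")),
        "station_name": str(ds.get("StationName", "")),
    }
```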
Technology Stack
Python, FastAPI, FastStream, Redis Streams, HAProxy, Docker, MinIO, S3, Ant Design
Results & Impact
- 100GB+ medical imaging data processed daily across distributed nodes
- 7 independent nodes for receiving, routing, storing, and uploading
- 5 concurrent requests per SCP node, tested up to 20 parallel streams successfully
- Modular codebase enabling rapid feature additions
- Handles 100GB+ daily DICOM data with zero downtime
- System deployed in production US telemedicine clinic
- Stable performance at 5+ concurrent DICOM connections per node
- Extensible architecture - can add AI processing or custom storage without touching core
- Client expanding to multi-tenant SaaS offering for more clinics
- Full audit trail via Redis Streams for HIPAA compliance
What We Learned
- Microservices aren't free - flexibility comes at cost of deployment complexity and resource overhead
- Healthcare IT is conservative - they want modern software on old hardware without upgrades
- Redis Streams are underrated - perfect middle ground between basic queues and Kafka for medical systems
- FastStream > Celery for FastAPI - less configuration, more natural async patterns, easier scaling
- 'Simple UI' always grows - plan for feature creep from day one
- DICOM is a beast - 3000+ pages of spec, legacy formats, vendor quirks. You learn by doing.
- Extensible node-based architecture pays off - client keeps adding features, answer is always 'yes, add a node'




