Serverless AWS pipeline that automates secure intake, parsing, storage, and notification of diagnostic laboratory CSV reports.

MediSys is a production-focused, fully serverless pipeline that automates the secure intake, parsing, storage, and notification of diagnostic laboratory reports. Built on AWS for healthcare operations, it converts raw CSV uploads into structured, queryable records while preserving auditability and enforcing strict role-based access control.
The upload surface is a simple, authenticated web endpoint that validates file shape and metadata before persisting objects to an SSE‑KMS encrypted S3 bucket. Each successful store emits a lightweight message to an SQS queue, which decouples ingestion from processing, absorbs bursty traffic, and holds work durably until it is processed.
A dedicated processing Lambda consumes SQS messages and runs deterministic, idempotent parsing logic: it streams CSV rows from S3, normalizes fields (patient identifiers, test codes, timestamps, numeric/enum coercion), and validates each row against a schema, reporting clear error diagnostics. Validated rows are written to DynamoDB using a single-table pattern optimized for the platform's access patterns, covering UploadedFiles (file-level metadata) and ParsedRecords (row-level items keyed by fileId + rowIndex). The worker atomically updates file metadata to reflect parse counts and final status (SUCCESS or FAILED) and routes failure diagnostics to a dead-letter queue for human review.
EventBridge is central to the notification and audit model: parsing outcomes and administrative actions (CREATE/UPDATE/DELETE clinic) are emitted as structured events and routed to curated targets. Clinic staff and administrators subscribe to SNS topics that deliver immediate, role-appropriate notifications via email and SMS; DELETED (soft-delete) events are intentionally routed to admins only to reduce alert fatigue while preserving accountability.
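As a rough illustration of the worker's parse step described above, the sketch below normalizes rows, collects per-row diagnostics, and writes each valid row under a (fileId, rowIndex) key only if that key is not already present. The column names, the `process_file` and `put_if_absent` helpers, and the in-memory dict standing in for DynamoDB are all assumptions made for this example; a real worker would more likely stream from S3 and use a conditional write (for example a DynamoDB `attribute_not_exists` condition) rather than a dict.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical column names; the real report schema is not given in the text.
REQUIRED = ("patient_id", "test_code", "collected_at", "result_value")


def normalize_row(raw):
    """Return (item, errors) for one CSV row; errors is empty when the row is valid."""
    row = {k.strip().lower(): (v or "").strip()
           for k, v in raw.items() if isinstance(k, str)}
    errors = [f"missing {name}" for name in REQUIRED if not row.get(name)]
    if errors:
        return None, errors
    item = {"patient_id": row["patient_id"].upper(),
            "test_code": row["test_code"].upper()}
    try:
        # Accept a trailing "Z"; treating naive timestamps as UTC is an assumption.
        ts = datetime.fromisoformat(row["collected_at"].replace("Z", "+00:00"))
        if ts.tzinfo is None:
            ts = ts.replace(tzinfo=timezone.utc)
        item["collected_at"] = ts.astimezone(timezone.utc).isoformat()
    except ValueError:
        errors.append(f"bad timestamp: {row['collected_at']!r}")
    try:
        item["result_value"] = float(row["result_value"])
    except ValueError:
        errors.append(f"bad number: {row['result_value']!r}")
    return (None if errors else item), errors


def put_if_absent(table, item):
    """Idempotent write: a retry with the same key leaves the stored item untouched."""
    key = (item["fileId"], item["rowIndex"])
    if key in table:
        return False
    table[key] = item
    return True


def process_file(file_id, csv_text, table):
    """Parse one file and return a summary shaped like file-level metadata."""
    parsed, diagnostics = 0, []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_index, raw in enumerate(reader, start=1):
        item, errors = normalize_row(raw)
        if errors:
            diagnostics.append({"rowIndex": row_index, "errors": errors})
            continue
        item.update(fileId=file_id, rowIndex=row_index)
        put_if_absent(table, item)
        parsed += 1
    return {"fileId": file_id,
            "status": "SUCCESS" if not diagnostics else "FAILED",
            "parsedCount": parsed,
            "failedCount": len(diagnostics),
            "diagnostics": diagnostics}
```

Because the key is derived only from the file ID and the row position, reprocessing the same SQS message produces the same keys, so a retry cannot create duplicate rows. Marking the whole file FAILED when any row fails is a choice made for this sketch; the text does not say how partial failures are classified.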
All events and Lambda telemetry stream to CloudWatch for centralized logging, with alarms configured for DLQ growth, parse error spikes, and abnormal latency to enable rapid operational response.
Security and governance are enforced at multiple layers. Authentication uses Amazon Cognito with JWTs containing custom:role and custom:clinicId claims; each API validates tokens and enforces RBAC so that staff can access only their own clinic's uploads while super-admins retain global visibility. IAM roles for each Lambda adhere to least privilege, scoped by resource and conditioned on clinic ID where applicable. Data is encrypted at rest (S3 SSE‑KMS, DynamoDB encryption) and in transit (TLS). Soft-deletion semantics preserve historical records for compliance and forensic needs while still allowing retention policies or administrative purging when required.
Operational resilience is a first-class concern: SQS buffering smooths spikes, Lambda's automatic concurrency scaling handles throughput, and idempotent writes prevent duplicate records when messages are retried. DLQs capture poison messages and malformed files for offline inspection.
Cost efficiency comes from serverless primitives (on-demand DynamoDB capacity, pay-per-use Lambda, and lifecycle rules for long-term S3 retention), while metrics-driven autoscaling and targeted alarms keep billing predictable.
From a developer and deployment perspective, MediSys is CI/CD friendly: infrastructure is codified (CloudFormation / CDK / Terraform), Lambdas are covered by unit and integration tests (file shape and parsing scenarios), and end-to-end validation runs against a staging environment with representative datasets. The system is extensible: future enhancements include HL7/FHIR import adapters, richer PII redaction workflows, analytics ingestion for operational dashboards, and optional enterprise audit/export connectors for regulatory reporting.
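The clinic-scoped access rule can be pictured as a small check over the token's claims. The sketch below assumes the JWT signature and expiry have already been verified upstream and that the claims arrive as a plain dict; the role values (`super-admin`, `clinic-staff`, `clinic-admin`) and the helper names are hypothetical, since the text names only the `custom:role` and `custom:clinicId` claims.

```python
# Hypothetical role values; only the claim names come from the description.
SUPER_ADMIN_ROLE = "super-admin"
CLINIC_ROLES = {"clinic-staff", "clinic-admin"}


def can_access_upload(claims, upload_clinic_id):
    """Decide whether a caller with these verified claims may read an upload."""
    role = claims.get("custom:role")
    if role == SUPER_ADMIN_ROLE:
        return True  # global visibility
    clinic_id = claims.get("custom:clinicId")
    # Deny by default: unknown roles and missing clinic IDs never match.
    return role in CLINIC_ROLES and bool(clinic_id) and clinic_id == upload_clinic_id


def visible_uploads(claims, uploads):
    """Filter a list of upload records down to those the caller may see."""
    return [u for u in uploads if can_access_upload(claims, u["clinicId"])]
```

For example, a staff token for clinic `c-1` would see only uploads whose `clinicId` is `c-1`, while a token with no recognized role sees nothing.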
In summary, MediSys demonstrates how an event-driven, serverless architecture on AWS can transform manual diagnostic report workflows into a secure, scalable, and maintainable platform. It balances developer ergonomics, operational visibility, and healthcare-grade security controls to deliver a production-ready pipeline for clinics and diagnostic centers that need reliable, auditable, and timely processing of clinical test data.
Authenticated API surface with Cognito JWTs and SSE‑KMS encrypted S3 storage for all submitted files.
Upload-to-S3 triggers an SQS message to decouple ingestion from parsing and absorb burst traffic.
Lambda worker performs streaming row parsing, normalization and schema validation with idempotent writes to DynamoDB.
UploadedFiles and ParsedRecords modeled around the platform's access patterns for low-latency reads.
EventBridge → SNS routing provides role-scoped email/SMS notifications and centralized audit events.
Soft deletion preserves history for compliance while lifecycle rules enable controlled long-term retention.
CloudWatch logs/metrics, DLQ alarms and dashboards surface parsing failures and operational anomalies.
Infrastructure-as-code, CI/CD pipelines and planned adapters for HL7/FHIR and PII redaction.
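To make the role-scoped notification routing listed above more concrete, here is a minimal sketch of rule matching in the style of EventBridge event patterns, where each pattern field lists the allowed values and nested objects are matched recursively. It implements only that small subset locally and does not call AWS; the event names (`FileParsed`, `ClinicChanged`), action values, and topic names are hypothetical.

```python
def matches(pattern, event):
    """Return True if every field in the pattern is satisfied by the event."""
    for key, expected in pattern.items():
        actual = event.get(key)
        if isinstance(expected, dict):
            # Nested pattern: the event must have a matching nested object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            # Leaf pattern: a list of allowed values.
            return False
    return True


# (target, pattern) pairs; all names below are assumptions for illustration.
RULES = [
    ("clinic-staff-topic", {"detail-type": ["FileParsed"]}),
    ("clinic-staff-topic", {"detail-type": ["ClinicChanged"],
                            "detail": {"action": ["CREATED", "UPDATED"]}}),
    # Admins receive everything, including soft-delete events.
    ("admin-topic", {"detail-type": ["FileParsed", "ClinicChanged"]}),
]


def targets_for(event):
    """List the distinct targets whose rules match, in rule order."""
    targets = []
    for target, pattern in RULES:
        if matches(pattern, event) and target not in targets:
            targets.append(target)
    return targets
```

With these rules, a `ClinicChanged` event whose action is `DELETED` reaches only the admin topic, while parse outcomes reach both staff and admins.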