Serves ML predictions at sub-150 ms latency. Containerised with Docker and orchestrated with Kubernetes across distributed cloud infrastructure.
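A minimal Kubernetes Deployment sketch for a service like this (the image name, labels, ports, and resource limits are illustrative assumptions, not details from the source):

```yaml
# Illustrative sketch only: names, image, and limits are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ml-predictor
spec:
  replicas: 3                    # multiple replicas help keep tail latency under budget
  selector:
    matchLabels:
      app: ml-predictor
  template:
    metadata:
      labels:
        app: ml-predictor
    spec:
      containers:
        - name: predictor
          image: registry.example.com/ml-predictor:latest   # hypothetical image
          ports:
            - containerPort: 8080
          resources:
            requests:            # reserve capacity so inference isn't CPU-starved
              cpu: "500m"
              memory: "512Mi"
            limits:
              cpu: "1"
              memory: "1Gi"
          readinessProbe:        # route traffic only after the model has loaded
            httpGet:
              path: /healthz
              port: 8080
```

Pinning resource requests and a readiness probe are common choices for latency-sensitive inference workloads, since cold containers or CPU throttling would otherwise blow the latency budget.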