// Architecture Blueprint v1.0

Movie Dialog GPT

end-to-end · ml-powered · production-grade · zero cost

System Architecture
── CLIENT LAYER ──
React UI · Chat Interface
Nginx · Reverse Proxy
── API LAYER ──
FastAPI · REST Endpoints
WebSocket · Streaming
Auth Middleware · JWT
── MESSAGE QUEUE ──
Kafka · Chat Events
Redis · Session Cache
── ML LAYER ──
Model Service · PyTorch → Transformer
MLflow · Experiment Tracking
Apache Spark · Data Processing
── STORAGE LAYER ──
PostgreSQL · Users, Conversations
MongoDB · Raw Dialog Corpus
MinIO / S3 · Model Artifacts
── INFRA LAYER ──
Docker · Containerization
Kubernetes · Orchestration (k3s)
GitHub Actions · CI/CD
Grafana · Observability
Component Breakdown
Frontend & API
React (free)
Chat UI with streaming responses. Think iMessage meets movie aesthetic. Vite for dev, served via Nginx.
FastAPI
REST + WebSocket. Phases: mock responses → real model. Pydantic for validation, async everywhere.
Nginx
Reverse proxy, SSL termination, static file serving. Free via Docker container.
Data & ML Pipeline
Apache Spark (free)
Process Movie Dialog Corpus. Clean & tokenize at scale. Use PySpark locally or on free Colab/Databricks tier.
MongoDB
Store raw corpus, conversation pairs. Atlas free tier: 512MB — enough for corpus.
Kafka
Chat message queue. User sends message → Kafka topic → model consumer → response. Confluent free tier.
Redis
Cache conversation context. Session state, rate limiting. Upstash free tier: 10k commands/day.
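The context-window logic this Redis layer holds can be sketched without a broker. Below, a `deque` with `maxlen` stands in for the Redis per-session list (the `LPUSH` + `LTRIM` pattern); `ContextCache` and `MAX_TURNS` are illustrative names, not part of the actual repo.

```python
from collections import deque

MAX_TURNS = 6  # keep only the last N turns per conversation

class ContextCache:
    """Stand-in for the Redis context cache: one bounded list per session."""

    def __init__(self, max_turns: int = MAX_TURNS):
        self._sessions: dict[str, deque] = {}
        self.max_turns = max_turns

    def append(self, session_id: str, role: str, text: str) -> None:
        # deque(maxlen=N) drops the oldest turn automatically,
        # mirroring LPUSH followed by LTRIM 0 N-1 in Redis.
        turns = self._sessions.setdefault(session_id, deque(maxlen=self.max_turns))
        turns.append((role, text))

    def context(self, session_id: str) -> list[tuple[str, str]]:
        # Oldest-to-newest window, ready to prepend to the model prompt.
        return list(self._sessions.get(session_id, []))

cache = ContextCache(max_turns=3)
for i in range(5):
    cache.append("s1", "user", f"line {i}")
print(cache.context("s1"))  # only the last 3 turns survive
```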
ML & Observability
MLflow (free)
Track experiments, model versions, hyperparams. Self-host locally. Log LSTM → Transformer progression.
PyTorch Model
Stage 1: mock responses. Stage 2: LSTM (you've done this!). Stage 3: Transformer / GPT-2 fine-tune on the corpus.
Grafana + Prometheus
Monitor API latency, model inference time, Kafka lag, Redis hits. Grafana Cloud free tier: 10k metrics.
PostgreSQL
Users, conversation history, feedback ratings. Supabase free tier: 500MB. Alembic for migrations.
Request Flow — User Sends a Message
💬 User (React UI) → WS → FastAPI (validate + auth) → pub → 📨 Kafka (chat.messages topic) → sub → 🧠 Model Worker (PyTorch) → cache → Redis (context cache) → stream → 💬 Response (token stream)
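The final "→ stream →" leg can be sketched as an async generator: the worker yields tokens one at a time and the WebSocket handler relays each as it arrives, so the UI renders the reply incrementally. `generate_tokens` below is a stand-in for real model inference, not the actual service code.

```python
import asyncio

async def generate_tokens(prompt: str):
    """Stand-in for model inference that emits one token at a time."""
    for token in ["I'll", "be", "back."]:  # canned demo output
        await asyncio.sleep(0)             # yield control, like real I/O would
        yield token

async def stream_reply(prompt: str) -> list[str]:
    sent = []
    async for token in generate_tokens(prompt):
        # In the real handler this line is: await websocket.send_text(token)
        sent.append(token)
    return sent

print(asyncio.run(stream_reply("hello")))  # ["I'll", 'be', 'back.']
```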
Training Data Pipeline
📁 Kaggle Corpus (raw CSV) → 🔥 PySpark (clean + tokenize) → 🍃 MongoDB (processed pairs) → 🏋️ Train Loop (PyTorch) → 📊 MLflow (log metrics) → 📦 MinIO (model artifact)
CI/CD Pipeline
📝 git push (feature branch) → 🔄 GH Actions (test + lint) → 🐳 Docker Build (GHCR registry) → ☸️ k3s Deploy (rolling update) → 📡 Grafana (health check)
Recommended Repository Structure
movie-dialog-gpt/
├── .github/
│   └── workflows/
│       ├── ci.yml               # test + lint on PR
│       ├── deploy.yml           # build + push image → k8s deploy
│       └── ml-pipeline.yml      # trigger training run
├── backend/                     # FastAPI service
│   ├── app/
│   │   ├── main.py              # app init, lifespan, routers
│   │   ├── config.py            # pydantic-settings env config
│   │   ├── routers/             # chat.py, health.py, auth.py
│   │   ├── services/            # model_service.py, kafka_service.py
│   │   ├── models/              # SQLAlchemy ORM models
│   │   ├── schemas/             # Pydantic request/response schemas
│   │   └── db/                  # session.py, alembic migrations
│   ├── Dockerfile
│   └── requirements.txt
├── frontend/                    # React + Vite
│   ├── src/
│   │   ├── components/          # ChatWindow, MessageBubble, etc.
│   │   ├── hooks/               # useWebSocket, useChat
│   │   └── store/               # Zustand state management
│   ├── Dockerfile
│   └── nginx.conf
├── ml/                          # all ML code
│   ├── data/
│   │   ├── ingest.py            # download from Kaggle API
│   │   ├── preprocess.py        # PySpark cleaning pipeline
│   │   └── dataset.py           # PyTorch Dataset class
│   ├── models/
│   │   ├── lstm_model.py        # your existing work ✅
│   │   ├── transformer.py       # custom GPT-like model
│   │   └── gpt2_finetune.py     # HuggingFace fine-tune
│   ├── train.py                 # training loop + MLflow logging
│   ├── evaluate.py              # BLEU, perplexity metrics
│   └── serve.py                 # model as inference service
├── infra/                       # all infra-as-code
│   ├── k8s/
│   │   ├── backend-deployment.yaml
│   │   ├── frontend-deployment.yaml
│   │   └── kafka-statefulset.yaml
│   ├── docker-compose.yml       # full local dev stack
│   └── docker-compose.dev.yml   # hot-reload override
├── monitoring/
│   ├── prometheus.yml
│   └── grafana/dashboards/      # JSON dashboard configs
├── docs/
│   ├── architecture.md
│   ├── local-setup.md
│   ├── ml-training-guide.md
│   └── api-reference.md         # auto-gen from FastAPI /docs
├── docker-compose.yml           # bring up everything locally
├── Makefile                     # make dev, make test, make train
├── README.md
└── .env.example
Development Phases
01 · Foundation: Scaffold + Mock Backend
Set up the monorepo, a FastAPI backend with mock responses, and the React chat UI. Docker Compose to run everything. Basic PostgreSQL schema for users + conversations. GitHub Actions for lint/test on push.
Stack: FastAPI · React + Vite · PostgreSQL · Docker Compose · GitHub Actions
02 · Data Pipeline: Corpus Ingestion & Processing
Download the Movie Dialog Corpus via the Kaggle API. Run the PySpark cleaning pipeline: remove HTML, normalize text, extract Q&A pairs. Store the processed pairs in MongoDB. This gives you a proper ML dataset.
Stack: PySpark · MongoDB Atlas · Kaggle API · Python scripts
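The per-record transformation this phase describes can be sketched in plain Python: strip HTML, normalize whitespace, then pair consecutive utterances into (question, answer) training examples. In the Spark job the same functions would run inside a `map`/UDF over the corpus; function names here are illustrative.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")

def clean_line(raw: str) -> str:
    """One utterance: strip HTML tags, decode entities, normalize spaces."""
    text = html.unescape(TAG_RE.sub("", raw))
    text = re.sub(r"\s+", " ", text).strip()
    return text.lower()

def to_pairs(lines: list[str]) -> list[tuple[str, str]]:
    """Pair consecutive lines of a scene: each utterance answers the last."""
    cleaned = [clean_line(l) for l in lines if clean_line(l)]
    return list(zip(cleaned, cleaned[1:]))

scene = ["<b>Where are we going?</b>", "  Back to the FUTURE! "]
print(to_pairs(scene))  # [('where are we going?', 'back to the future!')]
```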
03 · Messaging Layer: Kafka + Redis Integration
Replace the direct FastAPI → model call with an async Kafka queue. Messages are published to the chat.input topic; the model worker consumes them and publishes replies to chat.output. Redis caches the conversation context window (last N turns).
Stack: Kafka (Confluent) · Redis (Upstash) · aiokafka · aioredis
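The chat.input → worker → chat.output hop can be sketched with `asyncio.Queue` standing in for each Kafka topic, so the decoupling is visible without a broker; with aiokafka the same shape becomes a producer send on one side and a consumer loop on the other. Topic and function names are illustrative.

```python
import asyncio

async def model_worker(chat_input: asyncio.Queue, chat_output: asyncio.Queue):
    """Consume from chat.input, produce replies to chat.output."""
    while True:
        msg = await chat_input.get()
        if msg is None:                       # shutdown sentinel
            break
        reply = {"session": msg["session"], "text": f"echo: {msg['text']}"}
        await chat_output.put(reply)

async def main():
    chat_input, chat_output = asyncio.Queue(), asyncio.Queue()
    worker = asyncio.create_task(model_worker(chat_input, chat_output))
    # FastAPI side: publish the user message, then await the reply.
    await chat_input.put({"session": "s1", "text": "hello"})
    reply = await chat_output.get()
    await chat_input.put(None)                # stop the worker
    await worker
    return reply

print(asyncio.run(main()))  # {'session': 's1', 'text': 'echo: hello'}
```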
04 · ML Training: Train & Track Your Model
Start with your LSTM next-word predictor on the corpus. Log everything to MLflow. Progress to seq2seq, then attempt a Transformer from scratch or fine-tune GPT-2 on the corpus (HuggingFace). Save the best model to MinIO or local disk.
Stack: PyTorch · MLflow · HuggingFace · Google Colab · MinIO
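One metric worth logging from the first LSTM run onward is perplexity, computable from per-token cross-entropy losses regardless of which model produced them. A minimal sketch (the loss values are made up):

```python
import math

def perplexity(token_losses: list[float]) -> float:
    """Perplexity = exp(mean negative log-likelihood per token)."""
    return math.exp(sum(token_losses) / len(token_losses))

losses = [2.0, 2.5, 3.0]  # fake per-token NLL values from a validation pass
print(round(perplexity(losses), 2))
# In train.py this would be logged each epoch, e.g.
# mlflow.log_metric("perplexity", ppl, step=epoch)
```

Because the same number is comparable across the LSTM, seq2seq, and Transformer stages, it makes the MLflow run history tell the progression story directly.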
05 · Observability: Monitoring + Dashboards
Add Prometheus metrics to FastAPI (latency, errors, model inference time). Set up Grafana dashboards for API health, Kafka lag, Redis hit rate, and model performance over time. Alert on key thresholds.
Stack: Prometheus · Grafana Cloud · prometheus-fastapi-instrumentator
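The instrumentator library records latency for you, but it helps to see what a Prometheus histogram actually stores: cumulative counts per `le=` (less-or-equal) bucket, which is what Grafana's `histogram_quantile` queries read. A stdlib sketch of that bookkeeping, with illustrative bucket bounds:

```python
import bisect

BUCKETS = [0.05, 0.1, 0.25, 0.5, 1.0, float("inf")]  # seconds, like le= labels

class LatencyHistogram:
    """Minimal model of a Prometheus histogram's cumulative buckets."""

    def __init__(self):
        self.counts = [0] * len(BUCKETS)  # cumulative count per le= bound
        self.total = 0.0                  # the _sum series

    def observe(self, seconds: float) -> None:
        # Every bucket whose bound is >= the value is incremented,
        # which is why _bucket series are cumulative in Prometheus.
        for i in range(bisect.bisect_left(BUCKETS, seconds), len(BUCKETS)):
            self.counts[i] += 1
        self.total += seconds

h = LatencyHistogram()
for latency in [0.03, 0.07, 0.3]:
    h.observe(latency)
print(h.counts)  # [1, 2, 2, 3, 3, 3]
```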
06 · Deployment: Kubernetes + Cloud Deploy
Use k3s (lightweight Kubernetes) locally first, then deploy to Oracle Cloud's Always Free tier (2 VMs, 4 ARM cores, 24 GB RAM at no cost). Full CI/CD pipeline: push to main → build image → push to GHCR → kubectl rollout.
Stack: k3s / k8s · Oracle Cloud Free · GHCR · Helm Charts · Let's Encrypt SSL
Free Tier Stack — Zero Cost
Infrastructure
Cloud Hosting: Oracle Cloud Free ✓
Container Registry: GitHub GHCR ✓
CI/CD: GitHub Actions ✓
Kubernetes: k3s self-hosted ✓
SSL Cert: Let's Encrypt ✓
Data & Messaging
MongoDB: Atlas Free 512MB ✓
PostgreSQL: Supabase Free ✓
Redis: Upstash Free ✓
Kafka: Confluent Free ✓
Object Storage: MinIO self-hosted ✓
ML & Monitoring
MLflow: self-hosted ✓
Grafana: Cloud Free Tier ✓
GPU Training: Google Colab Free ✓
Model Hosting: self-hosted ✓
Watch Out For (Costs $)
AWS / GCP / Azure: avoid for now ⚠
OpenAI API: not needed ⚠
Datadog: use Grafana instead ⚠
Confluent (beyond free tier): or self-host Kafka ⚠
Project Checklist
[ ] Initialize monorepo with backend/, frontend/, ml/, infra/ structure (PHASE 1 · Foundation)
[ ] FastAPI app with /chat endpoint returning mock movie responses (PHASE 1 · Foundation)
[ ] React chat UI with WebSocket connection to backend (PHASE 1 · Foundation)
[ ] docker-compose.yml that starts everything with one command (PHASE 1 · Foundation)
[ ] GitHub Actions CI: lint + pytest on every PR (PHASE 1 · Foundation)
[ ] Download Movie Dialog Corpus via Kaggle API (PHASE 2 · Data Pipeline)
[ ] PySpark preprocessing: clean, normalize, extract Q&A pairs (PHASE 2 · Data Pipeline)
[ ] Load processed data into MongoDB Atlas free tier (PHASE 2 · Data Pipeline)
[ ] Integrate Kafka: FastAPI → topic → model consumer (PHASE 3 · Messaging)
[ ] Redis context cache: store last N conversation turns (PHASE 3 · Messaging)
[ ] Train LSTM model on corpus, log to MLflow (PHASE 4 · ML Training)
[ ] Build custom Transformer / fine-tune GPT-2 on dialog corpus (PHASE 4 · ML Training)
[ ] Swap mock responses for real model inference in backend (PHASE 4 · ML Training)
[ ] Prometheus metrics in FastAPI + Grafana dashboard (PHASE 5 · Observability)
[ ] Deploy to Oracle Cloud Free Tier with k3s (PHASE 6 · Deployment)
[ ] CD pipeline: git push → Docker build → k8s rolling deploy (PHASE 6 · Deployment)
[ ] Write docs/: architecture, local setup, API reference, ML guide (Documentation)