// Architecture Blueprint v1.0

Movie Dialog GPT

end-to-end · ml-powered · production-grade · zero cost

System Architecture
── CLIENT LAYER ──
React UI · Chat Interface
Nginx · Reverse Proxy
── API LAYER ──
FastAPI · REST Endpoints
WebSocket · Streaming
Auth Middleware · JWT
── MESSAGE QUEUE ──
Kafka · Chat Events
Redis · Session Cache
── ML LAYER ──
Model Service · PyTorch → Transformer
MLflow · Experiment Tracking
Apache Spark · Data Processing
── STORAGE LAYER ──
PostgreSQL · Users, Conversations
MongoDB · Raw Dialog Corpus
MinIO / S3 · Model Artifacts
── INFRA LAYER ──
Docker · Containerization
Kubernetes · Orchestration (k3s)
GitHub Actions · CI/CD
Grafana · Observability
Component Breakdown
Frontend & API
React (free)
Chat UI with streaming responses. Think iMessage meets movie aesthetic. Vite for dev, served via Nginx.
FastAPI
REST + WebSocket. Phases: mock responses → real model. Pydantic for validation, async everywhere.
Nginx
Reverse proxy, SSL termination, static file serving. Free via Docker container.
Data & ML Pipeline
Apache Spark (free)
Process Movie Dialog Corpus. Clean & tokenize at scale. Use PySpark locally or on free Colab/Databricks tier.
MongoDB
Store raw corpus, conversation pairs. Atlas free tier: 512MB — enough for corpus.
Kafka
Chat message queue. User sends message → Kafka topic → model consumer → response. Confluent free tier.
Redis
Cache conversation context. Session state, rate limiting. Upstash free tier: 10k commands/day.
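The context-window logic this Redis layer holds can be sketched without a broker. Below, a `deque` with `maxlen` stands in for the Redis per-session list (the `LPUSH` + `LTRIM` pattern); `ContextCache` and `MAX_TURNS` are illustrative names, not part of the actual repo.

```python
from collections import deque

MAX_TURNS = 6  # keep only the last N turns per conversation

class ContextCache:
    """Stand-in for the Redis context cache: one bounded list per session."""

    def __init__(self, max_turns: int = MAX_TURNS):
        self._sessions: dict[str, deque] = {}
        self.max_turns = max_turns

    def append(self, session_id: str, role: str, text: str) -> None:
        # deque(maxlen=N) drops the oldest turn automatically,
        # mirroring LPUSH followed by LTRIM 0 N-1 in Redis.
        turns = self._sessions.setdefault(session_id, deque(maxlen=self.max_turns))
        turns.append((role, text))

    def context(self, session_id: str) -> list[tuple[str, str]]:
        # Oldest-to-newest window, ready to prepend to the model prompt.
        return list(self._sessions.get(session_id, []))

cache = ContextCache(max_turns=3)
for i in range(5):
    cache.append("s1", "user", f"line {i}")
print(cache.context("s1"))  # only the last 3 turns survive
```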
ML & Observability
MLflow (free)
Track experiments, model versions, hyperparams. Self-host locally. Log LSTM → Transformer progression.
PyTorch Model
Stage 1: mock responses. Stage 2: LSTM (you've done this!). Stage 3: Transformer / GPT-2 fine-tune on the corpus.
Grafana + Prometheus
Monitor API latency, model inference time, Kafka lag, Redis hits. Grafana Cloud free tier: 10k metrics.
PostgreSQL
Users, conversation history, feedback ratings. Supabase free tier: 500MB. Alembic for migrations.
Request Flow — User Sends a Message
💬 User (React UI) → WS → FastAPI (validate + auth) → pub → 📨 Kafka (chat.messages topic) → sub → 🧠 Model Worker (PyTorch) → cache → Redis (context cache) → stream → 💬 Response (token stream)
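The final "→ stream →" leg can be sketched as an async generator: the worker yields tokens one at a time and the WebSocket handler relays each as it arrives, so the UI renders the reply incrementally. `generate_tokens` below is a stand-in for real model inference, not the actual service code.

```python
import asyncio

async def generate_tokens(prompt: str):
    """Stand-in for model inference that emits one token at a time."""
    for token in ["I'll", "be", "back."]:  # canned demo output
        await asyncio.sleep(0)             # yield control, like real I/O would
        yield token

async def stream_reply(prompt: str) -> list[str]:
    sent = []
    async for token in generate_tokens(prompt):
        # In the real handler this line is: await websocket.send_text(token)
        sent.append(token)
    return sent

print(asyncio.run(stream_reply("hello")))  # ["I'll", 'be', 'back.']
```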
Training Data Pipeline
📁 Kaggle Corpus (raw CSV) → 🔥 PySpark (clean + tokenize) → 🍃 MongoDB (processed pairs) → 🏋️ Train Loop (PyTorch) → 📊 MLflow (log metrics) → 📦 MinIO (model artifact)
CI/CD Pipeline
📝 git push (feature branch) → 🔄 GH Actions (test + lint) → 🐳 Docker Build (GHCR registry) → ☸️ k3s Deploy (rolling update) → 📡 Grafana (health check)
Recommended Repository Structure
movie-dialog-gpt/
├── .github/
│   └── workflows/
│       ├── ci.yml               # test + lint on PR
│       ├── deploy.yml           # build + push image → k8s deploy
│       └── ml-pipeline.yml      # trigger training run
├── backend/                     # FastAPI service
│   ├── app/
│   │   ├── main.py              # app init, lifespan, routers
│   │   ├── config.py            # pydantic-settings env config
│   │   ├── routers/             # chat.py, health.py, auth.py
│   │   ├── services/            # model_service.py, kafka_service.py
│   │   ├── models/              # SQLAlchemy ORM models
│   │   ├── schemas/             # Pydantic request/response schemas
│   │   └── db/                  # session.py, alembic migrations
│   ├── Dockerfile
│   └── requirements.txt
├── frontend/                    # React + Vite
│   ├── src/
│   │   ├── components/          # ChatWindow, MessageBubble, etc.
│   │   ├── hooks/               # useWebSocket, useChat
│   │   └── store/               # Zustand state management
│   ├── Dockerfile
│   └── nginx.conf
├── ml/                          # all ML code
│   ├── data/
│   │   ├── ingest.py            # download from Kaggle API
│   │   ├── preprocess.py        # PySpark cleaning pipeline
│   │   └── dataset.py           # PyTorch Dataset class
│   ├── models/
│   │   ├── lstm_model.py        # your existing work ✅
│   │   ├── transformer.py       # custom GPT-like model
│   │   └── gpt2_finetune.py     # HuggingFace fine-tune
│   ├── train.py                 # training loop + MLflow logging
│   ├── evaluate.py              # BLEU, perplexity metrics
│   └── serve.py                 # model as inference service
├── infra/                       # all infra-as-code
│   ├── k8s/
│   │   ├── backend-deployment.yaml
│   │   ├── frontend-deployment.yaml
│   │   └── kafka-statefulset.yaml
│   ├── docker-compose.yml       # full local dev stack
│   └── docker-compose.dev.yml   # hot-reload override
├── monitoring/
│   ├── prometheus.yml
│   └── grafana/dashboards/      # JSON dashboard configs
├── docs/
│   ├── architecture.md
│   ├── local-setup.md
│   ├── ml-training-guide.md
│   └── api-reference.md         # auto-gen from FastAPI /docs
├── docker-compose.yml           # bring up everything locally
├── Makefile                     # make dev, make test, make train
├── README.md
└── .env.example
Development Phases
01 · Foundation: Scaffold + Mock Backend
Set up the monorepo, a FastAPI backend with mock responses, and the React chat UI. Docker Compose to run everything. Basic PostgreSQL schema for users + conversations. GitHub Actions for lint/test on push.
Stack: FastAPI · React + Vite · PostgreSQL · Docker Compose · GitHub Actions
02 · Data Pipeline: Corpus Ingestion & Processing
Download the Movie Dialog Corpus via the Kaggle API. Run the PySpark cleaning pipeline: remove HTML, normalize text, extract Q&A pairs. Store the processed pairs in MongoDB. This gives you a proper ML dataset.
Stack: PySpark · MongoDB Atlas · Kaggle API · Python scripts
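The per-record transformation this phase describes can be sketched in plain Python: strip HTML, normalize whitespace, then pair consecutive utterances into (question, answer) training examples. In the Spark job the same functions would run inside a `map`/UDF over the corpus; function names here are illustrative.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")

def clean_line(raw: str) -> str:
    """One utterance: strip HTML tags, decode entities, normalize spaces."""
    text = html.unescape(TAG_RE.sub("", raw))
    text = re.sub(r"\s+", " ", text).strip()
    return text.lower()

def to_pairs(lines: list[str]) -> list[tuple[str, str]]:
    """Pair consecutive lines of a scene: each utterance answers the last."""
    cleaned = [clean_line(l) for l in lines if clean_line(l)]
    return list(zip(cleaned, cleaned[1:]))

scene = ["<b>Where are we going?</b>", "  Back to the FUTURE! "]
print(to_pairs(scene))  # [('where are we going?', 'back to the future!')]
```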
03 · Messaging Layer: Kafka + Redis Integration
Replace the direct FastAPI → model call with an async Kafka queue. Messages are published to the chat.input topic; the model worker consumes them and publishes replies to chat.output. Redis caches the conversation context window (last N turns).
Stack: Kafka (Confluent) · Redis (Upstash) · aiokafka · aioredis
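The chat.input → worker → chat.output hop can be sketched with `asyncio.Queue` standing in for each Kafka topic, so the decoupling is visible without a broker; with aiokafka the same shape becomes a producer send on one side and a consumer loop on the other. Topic and function names are illustrative.

```python
import asyncio

async def model_worker(chat_input: asyncio.Queue, chat_output: asyncio.Queue):
    """Consume from chat.input, produce replies to chat.output."""
    while True:
        msg = await chat_input.get()
        if msg is None:                       # shutdown sentinel
            break
        reply = {"session": msg["session"], "text": f"echo: {msg['text']}"}
        await chat_output.put(reply)

async def main():
    chat_input, chat_output = asyncio.Queue(), asyncio.Queue()
    worker = asyncio.create_task(model_worker(chat_input, chat_output))
    # FastAPI side: publish the user message, then await the reply.
    await chat_input.put({"session": "s1", "text": "hello"})
    reply = await chat_output.get()
    await chat_input.put(None)                # stop the worker
    await worker
    return reply

print(asyncio.run(main()))  # {'session': 's1', 'text': 'echo: hello'}
```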
04 · ML Training: Train & Track Your Model
Start with your LSTM next-word predictor on the corpus. Log everything to MLflow. Progress to seq2seq, then attempt a Transformer from scratch or fine-tune GPT-2 on the corpus (HuggingFace). Save the best model to MinIO or local disk.
Stack: PyTorch · MLflow · HuggingFace · Google Colab · MinIO
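One metric worth logging from the first LSTM run onward is perplexity, computable from per-token cross-entropy losses regardless of which model produced them. A minimal sketch (the loss values are made up):

```python
import math

def perplexity(token_losses: list[float]) -> float:
    """Perplexity = exp(mean negative log-likelihood per token)."""
    return math.exp(sum(token_losses) / len(token_losses))

losses = [2.0, 2.5, 3.0]  # fake per-token NLL values from a validation pass
print(round(perplexity(losses), 2))
# In train.py this would be logged each epoch, e.g.
# mlflow.log_metric("perplexity", ppl, step=epoch)
```

Because the same number is comparable across the LSTM, seq2seq, and Transformer stages, it makes the MLflow run history tell the progression story directly.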
05 · Observability: Monitoring + Dashboards
Add Prometheus metrics to FastAPI (latency, errors, model inference time). Set up Grafana dashboards for API health, Kafka lag, Redis hit rate, and model performance over time. Alert on key thresholds.
Stack: Prometheus · Grafana Cloud · prometheus-fastapi-instrumentator
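The instrumentator library records latency for you, but it helps to see what a Prometheus histogram actually stores: cumulative counts per `le=` (less-or-equal) bucket, which is what Grafana's `histogram_quantile` queries read. A stdlib sketch of that bookkeeping, with illustrative bucket bounds:

```python
import bisect

BUCKETS = [0.05, 0.1, 0.25, 0.5, 1.0, float("inf")]  # seconds, like le= labels

class LatencyHistogram:
    """Minimal model of a Prometheus histogram's cumulative buckets."""

    def __init__(self):
        self.counts = [0] * len(BUCKETS)  # cumulative count per le= bound
        self.total = 0.0                  # the _sum series

    def observe(self, seconds: float) -> None:
        # Every bucket whose bound is >= the value is incremented,
        # which is why _bucket series are cumulative in Prometheus.
        for i in range(bisect.bisect_left(BUCKETS, seconds), len(BUCKETS)):
            self.counts[i] += 1
        self.total += seconds

h = LatencyHistogram()
for latency in [0.03, 0.07, 0.3]:
    h.observe(latency)
print(h.counts)  # [1, 2, 2, 3, 3, 3]
```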
06 · Deployment: Kubernetes + Cloud Deploy
Use k3s (lightweight Kubernetes) locally first, then deploy to Oracle Cloud's Always Free tier (2 VMs, 4 ARM cores, 24 GB RAM at no cost). Full CI/CD pipeline: push to main → build image → push to GHCR → kubectl rollout.
Stack: k3s / k8s · Oracle Cloud Free · GHCR · Helm Charts · Let's Encrypt SSL
Free Tier Stack — Zero Cost
Infrastructure
Cloud Hosting: Oracle Cloud Free ✓
Container Registry: GitHub GHCR ✓
CI/CD: GitHub Actions ✓
Kubernetes: k3s self-hosted ✓
SSL Cert: Let's Encrypt ✓
Data & Messaging
MongoDB: Atlas Free 512MB ✓
PostgreSQL: Supabase Free ✓
Redis: Upstash Free ✓
Kafka: Confluent Free ✓
Object Storage: MinIO self-hosted ✓
ML & Monitoring
MLflow: self-hosted ✓
Grafana: Cloud Free Tier ✓
GPU Training: Google Colab Free ✓
Model Hosting: self-hosted ✓
Watch Out For (Costs $)
AWS / GCP / Azure: avoid for now ⚠
OpenAI API: not needed ⚠
Datadog: use Grafana instead ⚠
Confluent (beyond free tier): or self-host Kafka ⚠
Project Checklist
[ ] Initialize monorepo with backend/, frontend/, ml/, infra/ structure (PHASE 1 · Foundation)
[ ] FastAPI app with /chat endpoint returning mock movie responses (PHASE 1 · Foundation)
[ ] React chat UI with WebSocket connection to backend (PHASE 1 · Foundation)
[ ] docker-compose.yml that starts everything with one command (PHASE 1 · Foundation)
[ ] GitHub Actions CI: lint + pytest on every PR (PHASE 1 · Foundation)
[ ] Download Movie Dialog Corpus via Kaggle API (PHASE 2 · Data Pipeline)
[ ] PySpark preprocessing: clean, normalize, extract Q&A pairs (PHASE 2 · Data Pipeline)
[ ] Load processed data into MongoDB Atlas free tier (PHASE 2 · Data Pipeline)
[ ] Integrate Kafka: FastAPI → topic → model consumer (PHASE 3 · Messaging)
[ ] Redis context cache: store last N conversation turns (PHASE 3 · Messaging)
[ ] Train LSTM model on corpus, log to MLflow (PHASE 4 · ML Training)
[ ] Build custom Transformer / fine-tune GPT-2 on dialog corpus (PHASE 4 · ML Training)
[ ] Swap mock responses for real model inference in backend (PHASE 4 · ML Training)
[ ] Prometheus metrics in FastAPI + Grafana dashboard (PHASE 5 · Observability)
[ ] Deploy to Oracle Cloud Free Tier with k3s (PHASE 6 · Deployment)
[ ] CD pipeline: git push → Docker build → k8s rolling deploy (PHASE 6 · Deployment)
[ ] Write docs/: architecture, local setup, API reference, ML guide (Documentation)