Module 4: Machine Learning API Deployment Project
Deploy an insurance premium RandomForest model. Export Pipeline with joblib, serve requests via FastAPI endpoints, and connect Streamlit UI layers.
Day 7
Scikit-Learn Model Pipeline and Export
Why this matters
ML Pipeline: A scikit-learn Pipeline bundles preprocessing and the model into one object, so you serialize it once and load it in the API process.
Train a sklearn Pipeline (preprocessing + estimator), evaluate offline, then serialize with joblib.dump.
- Keep training notebooks separate from serving code
- Version model artifacts (pipeline_v1.joblib)
- Record feature order and dtypes expected at inference
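The train, evaluate, and export steps above might look like the following minimal sketch. The column names (age, bmi, smoker, region, premium), the synthetic data, and the schema file name pipeline_v1.schema.json are assumptions made for illustration, not the course dataset.

```python
import json

import joblib
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in data; the column names are hypothetical.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.integers(18, 65, n),
    "bmi": rng.normal(28.0, 5.0, n),
    "smoker": rng.choice(["yes", "no"], n),
    "region": rng.choice(["north", "south"], n),
})
df["premium"] = 200 * df["age"] + 300 * df["bmi"] + 15000 * (df["smoker"] == "yes")

FEATURES = ["age", "bmi", "smoker", "region"]
X_train, X_test, y_train, y_test = train_test_split(
    df[FEATURES], df["premium"], test_size=0.25, random_state=0
)

# Preprocessing and estimator live in one object, so the API never
# has to re-implement the encoding logic.
preprocess = ColumnTransformer(
    [("cat", OneHotEncoder(handle_unknown="ignore"), ["smoker", "region"])],
    remainder="passthrough",
)
pipeline = Pipeline([
    ("preprocess", preprocess),
    ("model", RandomForestRegressor(n_estimators=50, random_state=0)),
])
pipeline.fit(X_train, y_train)

# Offline evaluation before exporting.
mae = mean_absolute_error(y_test, pipeline.predict(X_test))
print(f"MAE: {mae:.2f}")

# Versioned artifact plus a sidecar file recording the expected schema.
joblib.dump(pipeline, "pipeline_v1.joblib")
schema = {col: str(dtype) for col, dtype in X_train.dtypes.items()}
with open("pipeline_v1.schema.json", "w") as f:
    json.dump({"feature_order": FEATURES, "dtypes": schema}, f, indent=2)
```

Keeping the schema next to the artifact gives the serving code something to check incoming requests against, instead of relying on memory of the notebook.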
Common mistakes
- Blocking the event loop with heavy sync code in async routes.
- Returning wrong HTTP status codes (200 on validation failure).
- Shipping without request/response models for ML endpoints.
Interview checkpoints
- Q: Explain an ML pipeline in one minute. A: Definition + ML deployment angle.
- Q: One FastAPI pitfall? A: Validation, async blocking, or wrong status code.
Practice
- Basic: Define ML Pipeline and give an example.
- Intermediate: Implement a minimal snippet for ML Pipeline.
- Advanced: Break it and read the OpenAPI / error response.
Recap
- You can explain an ML pipeline clearly.
- You know one mistake to avoid.
- You see how this connects to the next lesson.
Next: ML Serving
Day 8
FastAPI Serve Endpoint & Streamlit Frontend
Why this matters
ML Serving: Serving predictions via FastAPI, with an optional Streamlit UI, is a common path from ML demo to production.
Load the pipeline at startup; expose POST /predict with a Pydantic request body. Optional Streamlit front-end calls the API for demos.
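As a rough sketch of the optional Streamlit side, the page below collects the four inputs and posts them to the API. The URL http://localhost:8000/predict, the default input values, and the helper names build_request, call_predict, and render_page are assumptions for illustration. The field names mirror the Features model used in this lesson.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/predict"  # assumed local address of the FastAPI app


def build_request(features: dict, url: str = API_URL) -> urllib.request.Request:
    """Build the POST request whose JSON body matches the Features model."""
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_predict(features: dict, url: str = API_URL, timeout: float = 5.0) -> dict:
    """Send the request and decode the JSON response from the API."""
    with urllib.request.urlopen(build_request(features, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def render_page() -> None:
    """Streamlit page: collect inputs, call the API, show the result."""
    import streamlit as st  # lazy import so the helpers work without Streamlit

    st.title("Prediction demo")
    features = {
        "sepal_length": st.number_input("sepal_length", value=5.1),
        "sepal_width": st.number_input("sepal_width", value=3.5),
        "petal_length": st.number_input("petal_length", value=1.4),
        "petal_width": st.number_input("petal_width", value=0.2),
    }
    if st.button("Predict"):
        try:
            st.json(call_predict(features))
        except OSError as exc:  # urllib's URLError and HTTPError subclass OSError
            st.error(f"API call failed: {exc}")
```

Saved as a script with a final render_page() call, this would be launched with streamlit run while the FastAPI app is running separately. The UI holds no model code; it only talks to the API over HTTP.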
import joblib
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
# Load the serialized pipeline once at import time, not on every request.
model = joblib.load("pipeline.joblib")

class Features(BaseModel):
    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float

@app.post("/predict")
def predict(x: Features):
    # Feature order must match the order used at training time.
    pred = model.predict([[x.sepal_length, x.sepal_width, x.petal_length, x.petal_width]])
    return {"class_id": int(pred[0])}

Production: For heavy models, run multiple worker processes or run predict in a thread pool so that inference does not block async workers.
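To illustrate the thread-pool advice, here is a minimal sketch using only the standard library (asyncio.to_thread, Python 3.9+). SlowModel and predict_handler are hypothetical stand-ins: the sleep simulates blocking inference, and the handler plays the role of an async route.

```python
import asyncio
import time


class SlowModel:
    """Hypothetical stand-in for a heavy model; predict blocks for a while."""

    def predict(self, rows):
        time.sleep(0.2)  # simulates blocking inference
        return [sum(row) for row in rows]


model = SlowModel()


async def predict_handler(row):
    # Offload the blocking call so the event loop stays free for other requests.
    result = await asyncio.to_thread(model.predict, [row])
    return {"prediction": result[0]}


async def main():
    start = time.perf_counter()
    # Three concurrent "requests" overlap instead of running back to back.
    results = await asyncio.gather(*(predict_handler([i, i]) for i in range(3)))
    elapsed = time.perf_counter() - start
    return results, elapsed


results, elapsed = asyncio.run(main())
print(results, f"{elapsed:.2f}s")
```

In FastAPI, routes declared with plain def already run in a thread pool, so this pattern matters mainly for async def routes. Threads help most when inference releases the GIL; for pure-Python CPU-bound work, separate worker processes are usually the safer choice.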
Common mistakes
- Blocking the event loop with heavy sync code in async routes.
- Returning wrong HTTP status codes (200 on validation failure).
- Shipping without request/response models for ML endpoints.
Interview checkpoints
- Q: Explain ML serving in one minute. A: Definition + ML deployment angle.
- Q: One FastAPI pitfall? A: Validation, async blocking, or wrong status code.
Practice
- Basic: Define ML Serving and give an example.
- Intermediate: Implement a minimal snippet for ML Serving.
- Advanced: Break it and read the OpenAPI / error response.
Recap
- You can explain ML serving clearly.
- You know one mistake to avoid.
- You see how this connects to the next lesson.
Next: Docker Intro
