Looking to implement model drift detection for your production ML models? This tutorial shows you how to catch data drift, concept drift, and prediction drift before they silently break your models, using Evidently AI, FastAPI, and Python.
This model drift detection tutorial is designed for ML engineers who have models in production and want to monitor them proactively. No prior monitoring experience required.
Table of Contents
- 01 What Is Model Drift Detection?
- 02 Three Types of Drift You Must Monitor
- 03 Why Model Drift Detection Matters
- 04 Statistical Methods for Drift Detection
- 05 Tools for Model Drift Detection
- 06 Step-by-Step Tutorial
- 07 When Drift Is Detected: What To Do
- 08 Frequently Asked Questions

Figure 1: Model drift detection dashboard: monitor data drift, concept drift, and prediction drift in real time
01 What Is Model Drift Detection?
Model drift detection is the practice of monitoring machine learning models in production to identify when they start degrading due to changes in real-world data. Without proper detection, the model that worked perfectly at deployment starts making worse predictions, often silently, without any errors or alerts.
Undetected drift is one of the most common reasons ML projects fail in production. By the time you notice the problem, you may have already lost revenue, damaged user trust, or made critical bad decisions. Model drift detection is how you catch these issues early.
Many teams don't monitor drift. They only notice when a stakeholder complains. By then, the model may have been wrong for weeks, sometimes months. Drift monitoring could have flagged the shift much earlier.
02 Three Types of Drift You Must Monitor
Data drift: Input feature distributions change over time. Your model sees data it was never trained on. Example: A fraud detection model trained on 2024 transaction patterns encounters different spending behavior in 2026.
Concept drift: The relationship between inputs and outputs changes. What the model learned is no longer valid. Example: A house price model trained pre-COVID fails badly after remote work changes housing demand.
Prediction drift: The distribution of model outputs changes over time, a leading indicator that something upstream has shifted. Example: A recommendation model starts surfacing entirely different categories.
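Concept drift does not show up in the inputs alone, so it is usually watched through model quality once ground-truth labels arrive. The sketch below is one simple way to do that, assuming delayed labels are available: it computes accuracy over consecutive windows and flags windows that fall below a baseline. The function names, the window size, and the 0.05 tolerance are illustrative choices, not part of any library.

```python
from typing import List, Sequence


def windowed_accuracy(y_true: Sequence[int], y_pred: Sequence[int], window: int) -> List[float]:
    """Accuracy over consecutive, non-overlapping windows of `window` predictions."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    scores = []
    # A trailing partial window is ignored so every score covers `window` rows.
    for start in range(0, len(y_true) - window + 1, window):
        pairs = zip(y_true[start:start + window], y_pred[start:start + window])
        hits = sum(1 for truth, pred in pairs if truth == pred)
        scores.append(hits / window)
    return scores


def degraded_windows(scores: Sequence[float], baseline: float, tolerance: float = 0.05) -> List[int]:
    """Indices of windows whose accuracy fell more than `tolerance` below `baseline`."""
    return [i for i, score in enumerate(scores) if score < baseline - tolerance]


# Toy data: the model is right on the first 4 labels, then wrong on 3 of the next 4.
y_true = [1, 0, 1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 1]
scores = windowed_accuracy(y_true, y_pred, window=4)
print(scores)                                  # [1.0, 0.25]
print(degraded_windows(scores, baseline=0.9))  # [1]
```

In practice the baseline would be the accuracy measured at deployment, and the window would match how quickly labels become available.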
If you're not implementing drift detection, you're flying blind. Your model may be degrading right now, and you just don't know it.
03 Why Model Drift Detection Matters
A single drift detection system can save substantial engineering time and prevent costly revenue loss, usually for far less than the cost of one undetected drift incident.
04 Statistical Methods for Drift Detection
Population Stability Index (PSI): Measures distribution shift between two samples. PSI < 0.1: stable | PSI > 0.25: retrain
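For reference, PSI is commonly computed by binning both samples and summing (current fraction - reference fraction) * ln(current fraction / reference fraction) over the bins. The sketch below is a minimal standard-library version under a few assumptions that are ours, not the article's: equal-width bins built from the reference range, 10 bins, and a small `eps` floor so empty bins do not cause log(0).

```python
import math
from typing import List, Sequence


def bin_fractions(values: Sequence[float], edges: Sequence[float]) -> List[float]:
    """Fraction of `values` in each bin; values outside the edges go to the end bins."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        idx = 0
        # Move right while v is at or beyond the next interior edge.
        while idx < len(counts) - 1 and v >= edges[idx + 1]:
            idx += 1
        counts[idx] += 1
    return [c / len(values) for c in counts]


def psi(reference: Sequence[float], current: Sequence[float],
        n_bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index using equal-width bins over the reference range."""
    lo, hi = min(reference), max(reference)
    if lo == hi:
        raise ValueError("reference has no spread; cannot build bins")
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins + 1)]
    ref_frac = bin_fractions(reference, edges)
    cur_frac = bin_fractions(current, edges)
    total = 0.0
    for r, c in zip(ref_frac, cur_frac):
        r, c = max(r, eps), max(c, eps)  # floor empty bins to avoid log(0)
        total += (c - r) * math.log(c / r)
    return total


reference = [float(v) for v in range(100)]
shifted = [v + 50.0 for v in reference]
print(psi(reference, reference))  # identical samples give 0.0
print(psi(reference, shifted))    # a large shift lands well above 0.25
```

Quantile-based bins are a common alternative to equal-width bins; whichever you choose, keep the bin edges fixed from the reference data so scores stay comparable across runs.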
Two-sample statistical test (for example, the Kolmogorov-Smirnov test): Compares two distributions. p-value < 0.05: drift detected
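One widely used test that compares two samples and yields such a p-value is the two-sample Kolmogorov-Smirnov (KS) test; the article does not say which test it has in mind, so treat KS here as an assumption. The sketch below computes the KS statistic (the largest gap between the two empirical CDFs) and an asymptotic p-value using only the standard library. In a real project, `scipy.stats.ks_2samp` is the usual choice.

```python
import math
from typing import Sequence, Tuple


def ks_two_sample(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Two-sample KS statistic and an asymptotic two-sided p-value."""
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    # Walk both sorted samples and track the largest gap between their empirical CDFs.
    while i < n and j < m:
        x = min(a[i], b[j])
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    # Asymptotic Kolmogorov distribution with a small-sample correction.
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.2:
        # The series below converges slowly near zero, where the p-value is effectively 1.
        return d, 1.0
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam) for k in range(1, 101))
    return d, min(max(2.0 * series, 0.0), 1.0)


reference = [float(v) for v in range(50)]
current = [v + 100.0 for v in reference]  # no overlap with the reference sample
statistic, p_value = ks_two_sample(reference, current)
print(statistic, p_value < 0.05)  # 1.0 True
```

With large samples, even tiny shifts produce p-values below 0.05, so it helps to look at the statistic itself alongside the p-value.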
Visual inspection of feature distributions over time. Shift > 15-20%: investigate
05 Tools for Model Drift Detection
Evidently AI: Python library that generates drift reports, data quality dashboards, and model performance metrics. Free, self-hosted.
WhyLabs: Managed platform with a free tier. Monitors drift, data quality, and performance out of the box.
Prometheus + Grafana: Self-hosted monitoring stack. Track drift metrics as time-series. Alert when drift scores cross thresholds.
MLflow: Track drift scores as metrics alongside experiments. Trigger external alerts when scores exceed thresholds.
06 Step-by-Step Tutorial
This model drift detection tutorial uses Evidently AI, FastAPI, and Python to catch drift before it breaks your models.
6.1 Install Evidently AI
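    pip install evidently

The steps below import from evidently.report, evidently.metric_preset, and evidently.metrics. To the best of our knowledge these paths belong to the older Evidently API, and newer releases reorganized the package. If the imports fail on the latest release, pin an older version, for example: pip install "evidently<0.7"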
6.2 Log Predictions from Your FastAPI Endpoint
    import json
    from datetime import datetime, timezone


    def log_prediction(features: dict, prediction: int, probability: float) -> None:
        """Append one prediction record to a JSON Lines log file."""
        log_entry = {
            # Timezone-aware UTC timestamp (datetime.utcnow() is deprecated since Python 3.12)
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "features": features,
            "prediction": prediction,
            "probability": probability,
        }
        with open("predictions.jsonl", "a") as f:
            f.write(json.dumps(log_entry) + "\n")
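Drift checks normally compare the reference data against a recent window of traffic rather than the whole log. As a sketch of that selection step, the helper below reads the JSON Lines file written by log_prediction and keeps only entries from the last `days` days, flattening the nested features into top-level keys. The function name and the choice to treat timestamps without an offset as UTC are our assumptions.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def load_recent_predictions(path: str, days: int, now: datetime) -> List[Dict]:
    """Return flattened log rows whose timestamp falls within the last `days` days.

    `now` must be timezone-aware, e.g. datetime.now(timezone.utc).
    """
    cutoff = now - timedelta(days=days)
    rows = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            entry = json.loads(line)
            ts = datetime.fromisoformat(entry["timestamp"])
            if ts.tzinfo is None:
                ts = ts.replace(tzinfo=timezone.utc)  # assumption: naive timestamps are UTC
            if ts >= cutoff:
                rows.append({
                    **entry["features"],
                    "prediction": entry["prediction"],
                    "probability": entry["probability"],
                })
    return rows
```

A call such as load_recent_predictions("predictions.jsonl", days=7, now=datetime.now(timezone.utc)) returns a list of flat dicts that can be passed straight to pandas.DataFrame.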
6.3 Load Your Reference (Training) Distribution
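    import pandas as pd

    # Reference data: the feature table the model was trained on.
    reference_data = pd.read_csv("data/training_features.csv")

    # training_predictions and training_probabilities are not defined in this snippet.
    # They are assumed to hold the model's outputs for these same rows, computed beforehand.
    reference_data["prediction"] = training_predictions
    reference_data["probability"] = training_probabilities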
6.4 Run the Drift Detection Report
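    import pandas as pd
    from evidently.report import Report
    from evidently.metric_preset import DataDriftPreset

    # Each log line nests the inputs under "features". Flatten them into columns
    # so current_data has the same layout as reference_data (from step 6.3).
    raw = pd.read_json("predictions.jsonl", lines=True)
    current_data = pd.concat(
        [pd.json_normalize(raw["features"].tolist()), raw[["prediction", "probability"]]],
        axis=1,
    )

    data_drift_report = Report(metrics=[DataDriftPreset()])
    data_drift_report.run(
        reference_data=reference_data,
        current_data=current_data,
    )
    data_drift_report.save_html("drift_report.html")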
6.5 Configure Alerts (Slack / PagerDuty)
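    import requests
    from evidently.report import Report
    from evidently.metrics import DatasetDriftMetric

    # A metric is evaluated by running it inside a Report, then reading the result.
    # reference_data and current_data come from steps 6.3 and 6.4.
    drift_check = Report(metrics=[DatasetDriftMetric()])
    drift_check.run(reference_data=reference_data, current_data=current_data)
    result = drift_check.as_dict()["metrics"][0]["result"]

    if result["dataset_drift"]:
        print("⚠ DRIFT DETECTED: investigate immediately")
        requests.post(
            "https://hooks.slack.com/services/YOUR_WEBHOOK",
            json={"text": "🚨 Model drift detected in production!"},
            timeout=10,  # avoid hanging the job if Slack is unreachable
        )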
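The section title mentions PagerDuty, but the snippet above only posts to Slack. As a rough sketch of the PagerDuty side, the code below builds a trigger event in the shape used by the PagerDuty Events API v2 and sends it with the standard library. The routing key, summary text, and source name are placeholders, and the helper names are ours; check the field names against the PagerDuty documentation for your account before relying on it.

```python
import json
import urllib.request
from typing import Dict

PAGERDUTY_EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"
ALLOWED_SEVERITIES = {"critical", "error", "warning", "info"}


def build_pagerduty_event(routing_key: str, summary: str, source: str,
                          severity: str = "warning") -> Dict:
    """Build a 'trigger' event body for the PagerDuty Events API v2."""
    if severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"severity must be one of {sorted(ALLOWED_SEVERITIES)}")
    return {
        "routing_key": routing_key,  # integration key of the PagerDuty service
        "event_action": "trigger",
        "payload": {"summary": summary, "source": source, "severity": severity},
    }


def send_pagerduty_event(event: Dict, timeout: float = 10.0) -> int:
    """POST the event and return the HTTP status code."""
    request = urllib.request.Request(
        PAGERDUTY_EVENTS_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Building the event is side-effect free; only send_pagerduty_event touches the network.
event = build_pagerduty_event(
    routing_key="YOUR_ROUTING_KEY",  # placeholder
    summary="Model drift detected in production",
    source="drift_detection.py",
)
print(event["payload"]["severity"])  # warning
```

Keeping payload construction separate from sending makes the alert easy to unit test without network access.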
6.6 Automate with Cron or Airflow
    # Run drift detection every day at 9:00 AM
    # Add with: crontab -e
    0 9 * * * python3 /opt/ml/drift_detection.py
Run drift detection daily for revenue-critical models, weekly for others. The cost of a single missed drift event vastly outweighs the cost of running checks regularly.
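Whichever scheduler runs the job, it helps if the script reports its outcome through the process exit code, since both cron wrappers and Airflow tasks can react to a non-zero exit. The sketch below shows one possible shape for the entry point of drift_detection.py. The function name, the exit-code convention, and the injected callables are illustrative assumptions, not something the tools prescribe.

```python
import logging
from typing import Callable

logger = logging.getLogger("drift_detection")


def run_drift_job(check_drift: Callable[[], bool], send_alert: Callable[[str], None]) -> int:
    """Run one drift check and return an exit code.

    0 = no drift, 1 = drift detected (alert sent), 2 = the check itself failed.
    """
    try:
        drifted = check_drift()
    except Exception:
        # A crashed check is reported separately from detected drift.
        logger.exception("Drift check failed")
        return 2
    if drifted:
        send_alert("Model drift detected in production")
        return 1
    logger.info("No drift detected")
    return 0


# Example with stand-in callables; a real script would pass the report check and alert sender.
alerts = []
print(run_drift_job(check_drift=lambda: True, send_alert=alerts.append))  # 1
print(alerts)  # ['Model drift detected in production']
```

In the real script, the last line would typically be sys.exit(run_drift_job(...)) so the scheduler sees the code.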
07 When Drift Is Detected: What To Do
Identify the drifting features
Open the Evidently report and look at which specific features are flagged. Sort by drift score descending.
Diagnose the root cause
Is it seasonal? A data pipeline bug? A real-world behavioral shift? Drift detection tells you what, not why.
Trigger retraining if drift is confirmed
If drift is real and significant, retrain on newer labeled data. Don't retrain blindly: confirm you have sufficient new data first.
Recalibrate your thresholds
Update your alert thresholds based on what you learned. Some drift may be acceptable for your use case.
Document the incident
Add it to your model's changelog. Include what drifted, why, and how you fixed it.
Don't retrain reflexively. Only retrain when drift is confirmed AND you have sufficient new labeled data. Retraining on insufficient data can make things worse.
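The retraining rule above can be written down as a small gate. In this sketch, "confirmed" is interpreted as drift being flagged in several consecutive checks, which is our assumption about how to rule out one-off spikes. The default of 3 consecutive checks and 5,000 labeled rows are placeholder values to tune for your model.

```python
from typing import Sequence


def should_retrain(recent_drift_flags: Sequence[bool], n_new_labeled: int,
                   min_consecutive: int = 3, min_labeled: int = 5000) -> bool:
    """Retrain only if drift persisted for the last `min_consecutive` checks
    and at least `min_labeled` new labeled rows are available."""
    if len(recent_drift_flags) < min_consecutive:
        return False  # not enough history to call the drift confirmed
    persistent = all(recent_drift_flags[-min_consecutive:])
    return persistent and n_new_labeled >= min_labeled


print(should_retrain([False, True, True, True], n_new_labeled=8000))  # True
print(should_retrain([True, True, True], n_new_labeled=1200))         # False: too few labels
print(should_retrain([True, False, True], n_new_labeled=8000))        # False: drift not persistent
```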
08 Frequently Asked Questions
What's the difference between data drift and concept drift?
Data drift means the input features have changed distribution. Concept drift means the relationship between inputs and outputs has changed. Data drift answers "is the world different?" Concept drift answers "has the rule changed?" You need to monitor both.
How often should I run model drift detection?
For revenue-critical models: daily. For less critical models: weekly. For batch models: after each inference batch. The cost of drift detection is minimal compared to the cost of undetected drift.
What drift detection tool should I start with?
Start with Evidently AI. It's open source, free, and integrates with any Python stack. You can generate a drift report in 5 minutes. Once you need managed infrastructure, consider WhyLabs or custom Prometheus/Grafana.
Can I use MLflow for drift detection?
MLflow doesn't have built-in drift detection, but you can log drift scores as metrics. Run a separate job that calculates drift scores and logs them to MLflow. Then set up alerts based on those metric values.
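As a sketch of that separate job, the code below turns per-feature drift scores into metric names and logs them in one MLflow run using mlflow.start_run and mlflow.log_metrics. The "drift_psi" prefix, the run name, and the helper names are our choices, and the example scores are made up for illustration. MLflow is imported inside the logging function so the name-building helper can be used and tested without it installed.

```python
import re
from typing import Dict


def to_metric_names(feature_scores: Dict[str, float], prefix: str = "drift_psi") -> Dict[str, float]:
    """Map feature names to metric keys such as 'drift_psi.amount'.

    Characters outside letters, digits, '_', '-', '.', space, and '/' are replaced
    with '_' to stay within the characters MLflow accepts in metric names.
    """
    metrics = {}
    for feature, score in feature_scores.items():
        safe = re.sub(r"[^A-Za-z0-9_\-. /]", "_", feature)
        metrics[f"{prefix}.{safe}"] = float(score)
    return metrics


def log_drift_to_mlflow(feature_scores: Dict[str, float], run_name: str = "drift-check") -> None:
    """Log all drift scores as metrics of a single MLflow run."""
    import mlflow  # lazy import: only needed when actually logging

    with mlflow.start_run(run_name=run_name):
        mlflow.log_metrics(to_metric_names(feature_scores))


# Made-up scores, shown only to illustrate the naming.
print(to_metric_names({"amount": 0.31, "user age (yrs)": 0.04}))
# {'drift_psi.amount': 0.31, 'drift_psi.user age _yrs_': 0.04}
```

Alerting can then be driven by whatever system reads those metric values on a schedule.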
What PSI threshold should I use?
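PSI < 0.1: no significant drift. PSI 0.1-0.25: moderate drift, investigate. PSI > 0.25: severe drift, plan to retrain once the drift is confirmed. These are widely used rule-of-thumb thresholds that come from credit-risk modeling in banking and insurance, so treat them as starting points rather than hard limits.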
External resources: Evidently AI Documentation • MLflow • Prometheus • Grafana