Covariance estimation quantifies how variables vary together and is a core component of many algorithms, such as LDA and the Mahalanobis distance.
EmpiricalCovariance: The standard maximum-likelihood estimator; the plain sample covariance.
Shrunk covariance (LedoitWolf, OAS): Regularized estimators that shrink the sample covariance toward a structured target.
LedoitWolf: Computes the optimal shrinkage coefficient with a closed-form formula.
OAS: Similar to Ledoit-Wolf but designed for small sample sizes (assuming Gaussian data).
GraphicalLasso: Estimates a sparse precision (inverse covariance) matrix via an L1 penalty.
MinCovDet (Minimum Covariance Determinant): A robust estimator that fits on the most concentrated subset of samples, so outliers have little influence.
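The shrinkage estimators above can be compared directly: each exposes the coefficient it chose via the `shrinkage_` attribute. A minimal sketch (the data shape here is illustrative, chosen so samples are scarce relative to features):

```python
import numpy as np
from sklearn.covariance import LedoitWolf, OAS

rng = np.random.RandomState(0)
X = rng.randn(30, 5)  # few samples relative to features, so shrinkage matters

lw = LedoitWolf().fit(X)
oas = OAS().fit(X)
# Both coefficients lie in [0, 1]: 0 = raw sample covariance, 1 = full shrinkage
print("Ledoit-Wolf shrinkage:", lw.shrinkage_)
print("OAS shrinkage:", oas.shrinkage_)
```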
import numpy as np
from sklearn.covariance import EmpiricalCovariance, MinCovDet
# Generate data with one obvious outlier (seeded for reproducibility)
rng = np.random.RandomState(0)
X = rng.randn(100, 2)
X[0] = [10, 10]  # outlier
# 1. Standard Estimation (skewed by outlier)
emp_cov = EmpiricalCovariance().fit(X)
print("Empirical Location:", emp_cov.location_)
# 2. Robust Estimation (resistant to the outlier)
robust_cov = MinCovDet().fit(X)
print("Robust Location:", robust_cov.location_)
# 3. Mahalanobis distances from the robust fit flag the outlier
distances = robust_cov.mahalanobis(X)
print("Outlier distance:", distances[0], "vs. median distance:", np.median(distances))
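GraphicalLasso, listed above, serves a different goal: recovering which variables are conditionally dependent by zeroing out entries of the precision matrix. A minimal sketch (the data, the induced correlation, and the `alpha` value are illustrative, not from the original):

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.RandomState(42)
X = rng.randn(200, 4)
X[:, 1] += 0.8 * X[:, 0]  # induce a dependency between features 0 and 1

# alpha controls the L1 penalty: larger alpha -> sparser precision matrix
gl = GraphicalLasso(alpha=0.1).fit(X)
print("Sparse precision matrix:\n", gl.precision_.round(2))
```

Off-diagonal entries near zero indicate pairs that are conditionally independent given the other features.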
Credits: This cheatsheet is based on the scikit-learn documentation and examples, which are licensed under the BSD 3-Clause License. Copyright (c) 2007 - 2026 The scikit-learn developers. All rights reserved.