The impute module provides strategies to handle missing values (NaN) in datasets.
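SimpleImputer: univariate imputation; fills each column using strategy='mean', 'median', 'most_frequent', or 'constant'.
IterativeImputer: multivariate imputation; models each feature with missing values as a function of the other features, in round-robin fashion.
KNNImputer: fills each missing value using the n_neighbors nearest samples that have a value for that feature.
MissingIndicator: produces a binary mask marking which values were missing.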
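MissingIndicator and the 'constant' strategy do not appear in the usage code of this cheatsheet, so here is a minimal sketch of both. The sample array and the fill value of 0 are illustrative assumptions, not values from the scikit-learn documentation.

```python
import numpy as np
from sklearn.impute import MissingIndicator, SimpleImputer

# Illustrative data (assumption): two columns, each with one missing value.
X = np.array([[1.0, 2.0], [np.nan, 3.0], [7.0, 6.0], [4.0, np.nan]])

# By default, MissingIndicator keeps only the columns that contained
# missing values during fit (features="missing-only").
indicator = MissingIndicator()
mask = indicator.fit_transform(X)  # boolean array, True where a value was NaN

# strategy="constant" replaces every missing value with fill_value.
imp_const = SimpleImputer(strategy="constant", fill_value=0)
X_const = imp_const.fit_transform(X)

print(mask)
print(X_const)
```

The mask can be stacked next to the imputed features when the fact that a value was missing may itself be informative for a downstream model.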
Fit on training data and transform on test data, so that the imputation statistics come from the training set only.
To use IterativeImputer, you must enable it first as it is experimental: from sklearn.experimental import enable_iterative_imputer.

import numpy as np
from sklearn.experimental import enable_iterative_imputer
from sklearn.impute import SimpleImputer, IterativeImputer, KNNImputer
X = [[1, 2], [np.nan, 3], [7, 6], [4, np.nan]]
# 1. Simple Mean Imputation
imp_mean = SimpleImputer(strategy='mean')
X_simple = imp_mean.fit_transform(X)
# 2. KNN Imputation (Weights by distance)
imp_knn = KNNImputer(n_neighbors=2, weights="distance")
X_knn = imp_knn.fit_transform(X)
# 3. Iterative Imputation
imp_iter = IterativeImputer(max_iter=10, random_state=0)
X_iter = imp_iter.fit_transform(X)
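The examples above call fit_transform on a single array. The advice to fit on training data and transform on test data can be sketched as follows; the X_train and X_test arrays are made-up illustrative data.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Illustrative train/test split (assumption, not from the original example).
X_train = np.array([[1.0, 2.0], [np.nan, 3.0], [7.0, 6.0], [4.0, np.nan]])
X_test = np.array([[np.nan, 5.0], [2.0, np.nan]])

imp = SimpleImputer(strategy="mean")

# Learn the per-column means from the training data only.
imp.fit(X_train)

# Reuse those training means to fill gaps in the test data.
X_test_filled = imp.transform(X_test)

print(imp.statistics_)  # per-column means learned from X_train
print(X_test_filled)
```

Calling fit_transform on the test set instead would recompute the means from test data, which leaks test information into preprocessing.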
Credits: This cheatsheet is based on the scikit-learn documentation and examples, which are licensed under the BSD 3-Clause License. Copyright (c) 2007 - 2026 The scikit-learn developers. All rights reserved.