The compose module provides tools to combine multiple estimators into a single one, facilitating complex workflows and preventing data leakage.
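Pipeline: chains a sequence of transformers with a final estimator so they are fit and applied as one object (imported from sklearn.pipeline).
ColumnTransformer: applies different transformers to different subsets of columns and concatenates the results.
FeatureUnion: applies several transformers to the same input and concatenates their outputs (imported from sklearn.pipeline).
TransformedTargetRegressor: transforms the target y before fitting a regressor and maps predictions back to the original scale.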
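As a rough illustration of the last two tools in the list, the sketch below combines FeatureUnion and TransformedTargetRegressor. The synthetic data, the step names ("pca", "kbest", "features", "reg") and the choice of a log transform are assumptions made for this example only.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import FeatureUnion, Pipeline

# Hypothetical synthetic data: 100 samples, 5 features, strictly positive target
rng = np.random.RandomState(0)
X = rng.normal(size=(100, 5))
y = np.exp(X[:, 0] + 0.1 * rng.normal(size=100))

# FeatureUnion: run both transformers on the same input, concatenate the columns
union = FeatureUnion([
    ("pca", PCA(n_components=2)),
    ("kbest", SelectKBest(f_regression, k=1)),
])

# TransformedTargetRegressor: fit the regressor on log(y),
# then map predictions back to the original scale with exp
model = TransformedTargetRegressor(
    regressor=Pipeline([("features", union), ("reg", LinearRegression())]),
    func=np.log,
    inverse_func=np.exp,
)
model.fit(X, y)
predictions = model.predict(X[:3])

# Fitting the union on its own shows the concatenated output: 2 PCA columns + 1 selected
X_union = union.fit_transform(X, y)
```

Because the regressor is cloned inside TransformedTargetRegressor, the standalone `union` is fit separately here just to inspect its output shape.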
Parameters of nested steps are addressed with the stepname__parametername syntax in GridSearchCV.

from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.linear_model import LogisticRegression
# 1. Define transformers for different column types
numeric_features = ["age", "fare"]
numeric_transformer = Pipeline(steps=[("scaler", StandardScaler())])
categorical_features = ["embarked", "sex"]
categorical_transformer = OneHotEncoder(handle_unknown="ignore")
# 2. Bundle them in a ColumnTransformer
preprocessor = ColumnTransformer(
    transformers=[
        ("num", numeric_transformer, numeric_features),
        ("cat", categorical_transformer, categorical_features),
    ]
)
# 3. Create the final Pipeline
clf = Pipeline(
    steps=[("preprocessor", preprocessor), ("classifier", LogisticRegression())]
)
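# 4. Use it as a single estimator
# (X_train / X_test are assumed to be pandas DataFrames containing the columns above)
clf.fit(X_train, y_train)
score = clf.score(X_test, y_test)  # mean accuracy on the test set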
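The stepname__parametername syntax can be sketched against the same pipeline as follows. This is a minimal, self-contained example under stated assumptions: the toy DataFrame, the made-up label and the grid values are hypothetical and only reuse the column names from the example above.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data using the same column names as the example above
rng = np.random.RandomState(0)
n = 60
X = pd.DataFrame({
    "age": rng.uniform(1, 80, size=n),
    "fare": rng.uniform(5, 100, size=n),
    "embarked": rng.choice(["C", "Q", "S"], size=n),
    "sex": rng.choice(["female", "male"], size=n),
})
y = (X["fare"] > 50).astype(int)  # made-up label, for illustration only

preprocessor = ColumnTransformer(
    transformers=[
        ("num", Pipeline(steps=[("scaler", StandardScaler())]), ["age", "fare"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["embarked", "sex"]),
    ]
)
clf = Pipeline(
    steps=[("preprocessor", preprocessor), ("classifier", LogisticRegression())]
)

# Double underscores walk down the nesting: step name, sub-step name, parameter
param_grid = {
    "classifier__C": [0.1, 1.0, 10.0],
    "preprocessor__num__scaler__with_mean": [True, False],
}
search = GridSearchCV(clf, param_grid, cv=3)
search.fit(X, y)
best_params = search.best_params_
```

Calling clf.get_params().keys() lists every name that can appear as a key in param_grid, which is a convenient way to check the spelling of a nested parameter.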
Credits: This cheatsheet is based on the scikit-learn documentation and examples, which are licensed under the BSD 3-Clause License. Copyright (c) 2007 - 2026 The scikit-learn developers. All rights reserved.