Scikit-learn Cheatsheet: Linear Models

Linear models assume the target value is a linear combination of the input features: y_hat = w_0 + w_1*x_1 + ... + w_p*x_p, where the w_i are the learned coefficients.

Key Algorithms

  1. Regression:
    • LinearRegression: Ordinary Least Squares (no penalty).
    • Ridge: L2 penalty (prevents large coefficients, handles multicollinearity).
    • Lasso: L1 penalty (forces coefficients to zero, performs feature selection).
    • ElasticNet: Combination of L1 and L2.
  2. Classification:
    • LogisticRegression: Despite the name, it's a classifier, not a regressor.
    • SGDClassifier: Linear model optimized via Stochastic Gradient Descent (excellent for large data).
  3. Robust Regression:
    • RANSACRegressor: Fits the model on random inlier subsets, discarding outliers.
    • HuberRegressor: Less sensitive to outliers than OLS.
  4. Bayesian:
    • BayesianRidge: Ridge regression with the regularization parameter estimated from the data.
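
To illustrate the Ridge/Lasso distinction above, a minimal sketch on synthetic data (the make_regression setup and alpha values are illustrative, not tuned):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge, Lasso

# Synthetic data: 10 features, only 3 of which are informative
X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=5.0, random_state=0)

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=1.0).fit(X, y)

# Ridge shrinks coefficients but rarely zeroes them exactly;
# Lasso sets uninformative coefficients to exactly zero (feature selection)
print("Ridge zero coefficients:", int(np.sum(ridge.coef_ == 0.0)))
print("Lasso zero coefficients:", int(np.sum(lasso.coef_ == 0.0)))
```

Inspecting `coef_` this way is a quick check that Lasso really produced a sparse model before relying on it for feature selection.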

Code Snippet: Lasso and Ridge

from sklearn.linear_model import Ridge, Lasso, LogisticRegression
from sklearn.preprocessing import StandardScaler

# 1. Regression with Regularization
# alpha: Strength of penalty (higher = more regularization)
ridge = Ridge(alpha=1.0)
lasso = Lasso(alpha=0.1)

# 2. Classification with L1 (Sparse features)
# solver='liblinear' or 'saga' required for L1
log_reg = LogisticRegression(penalty='l1', solver='liblinear', C=1.0)

# 3. Large-scale learning (SGD works well when n_samples is large)
from sklearn.linear_model import SGDRegressor
sgd = SGDRegressor(max_iter=1000, tol=1e-3)
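
SGD, like most regularized linear models, is sensitive to feature scale, which is why StandardScaler is imported above. A hedged end-to-end sketch (the synthetic data and pipeline are illustrative):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import SGDRegressor

# Illustrative synthetic data; any (X, y) regression set works the same way
X, y = make_regression(n_samples=500, n_features=20, noise=10.0, random_state=0)

# Standardize features before SGD: gradient steps behave poorly when
# features have very different scales
model = make_pipeline(StandardScaler(),
                      SGDRegressor(max_iter=1000, tol=1e-3, random_state=0))
model.fit(X, y)
print("Training R^2:", model.score(X, y))
```

Wrapping the scaler and estimator in a single pipeline ensures the same scaling is applied at predict time, avoiding train/serve skew.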

Credits: This cheatsheet is based on the scikit-learn documentation and examples, which are licensed under the BSD 3-Clause License. Copyright (c) 2007 - 2026 The scikit-learn developers. All rights reserved.