
Implementing the AdaBoost Algorithm From Scratch

Last Updated : 04 Feb, 2025

AdaBoost stands for Adaptive Boosting and is a powerful ensemble learning technique that combines multiple weak classifiers to create a strong classifier. It works by sequentially adding classifiers that correct the errors made by previous models, giving more weight to the misclassified data points.
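In the standard binary formulation (labels \(y_i \in \{-1, +1\}\)), each boosting round \(t\) fits a weak learner \(h_t\), computes its weighted error, converts that error into a vote weight \(\alpha_t\), and reweights the samples:

\[
\varepsilon_t = \frac{\sum_i w_i \,\mathbb{1}[h_t(x_i) \neq y_i]}{\sum_i w_i},
\qquad
\alpha_t = \tfrac{1}{2} \ln\frac{1 - \varepsilon_t}{\varepsilon_t},
\qquad
w_i \leftarrow \frac{w_i \, e^{-\alpha_t \, y_i \, h_t(x_i)}}{Z_t}
\]

where \(Z_t\) normalizes the weights so they sum to 1. The code below implements exactly these three steps.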

In this article we will implement the AdaBoost algorithm from scratch. Building it ourselves gives a deeper understanding of how AdaBoost works and the key principles behind it.

[Figure: Boosting Algorithms]

Python Implementation of AdaBoost

Python provides ready-made packages for applying AdaBoost, but here we will implement the algorithm ourselves and apply it to a machine learning problem.

In this example we create a synthetic dataset to test the implementation.

1. Import Libraries

Let's begin by importing the libraries we will need for our classification task:

import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

2. Defining the AdaBoost Class

class AdaBoost:
    def __init__(self, n_estimators=50):
        self.n_estimators = n_estimators
        self.alphas = []
        self.models = []
  • The AdaBoost class is initialized with the number of weak learners (n_estimators).
  • self.alphas: Stores the weight of each model based on its performance.
  • self.models: Stores the weak classifiers (decision stumps) used in AdaBoost.

3. Training the AdaBoost Model (Fit Method)

    def fit(self, X, y):
        # Note: the weight update below assumes labels y are in {-1, +1}
        n_samples, n_features = X.shape
        w = np.ones(n_samples) / n_samples   # start with uniform sample weights
  • n_samples, n_features: Retrieves the number of samples and features from the dataset.
  • w: Initializes sample weights uniformly.
        for _ in range(self.n_estimators):
            model = DecisionTreeClassifier(max_depth=1)   # decision stump
            model.fit(X, y, sample_weight=w)
            predictions = model.predict(X)
            err = np.sum(w * (predictions != y)) / np.sum(w)   # weighted error rate
            alpha = 0.5 * np.log((1 - err) / (err + 1e-10))    # model's vote weight
            self.alphas.append(alpha)
            self.models.append(model)
            w = w * np.exp(-alpha * y * predictions)   # upweight misclassified samples
            w = w / np.sum(w)                          # renormalize weights to sum to 1
  • err: Computes the weighted error rate, i.e. the fraction of total sample weight sitting on misclassified points.
  • alpha: Converts the error into the model's vote weight; models with lower error receive higher alpha. The small constant 1e-10 guards against division by zero when err is 0.
  • self.alphas.append(alpha): Stores the model’s weight in the list.
  • self.models.append(model): Stores the trained weak classifier in the list.
  • w: Updates the sample weights, increasing them for misclassified samples and decreasing them for correctly classified ones, then renormalizes them. A worked example of a single round is shown below.
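
To make the update concrete, here is one boosting round worked through by hand on a tiny made-up example (four samples, one misclassified; the numbers are illustrative and not taken from the article's dataset):

import numpy as np

y     = np.array([ 1, -1,  1, -1])       # true labels in {-1, +1}
preds = np.array([ 1, -1, -1, -1])       # the stump gets sample index 2 wrong
w     = np.ones(4) / 4                   # uniform weights: [0.25, 0.25, 0.25, 0.25]

err   = np.sum(w * (preds != y)) / np.sum(w)       # 0.25
alpha = 0.5 * np.log((1 - err) / (err + 1e-10))    # ~0.549

w = w * np.exp(-alpha * y * preds)   # misclassified sample grows, correct ones shrink
w = w / np.sum(w)                    # ~[0.167, 0.167, 0.5, 0.167]

The misclassified sample now carries half of the total weight, so the next stump is strongly encouraged to get it right.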

4. Making Predictions

    def predict(self, X):
        strong_preds = np.zeros(X.shape[0])
        for model, alpha in zip(self.models, self.alphas):
            strong_preds += alpha * model.predict(X)   # each stump votes, weighted by its alpha
        return np.sign(strong_preds).astype(int)       # final label in {-1, +1}
  • strong_preds: Accumulates the alpha-weighted votes of all weak classifiers; the sign of the total determines the final label. A toy illustration is shown below.
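
As a toy illustration of this weighted vote (the alpha values 0.8 and 0.3 are made up, not taken from a trained model):

import numpy as np

preds_1 = np.array([ 1, -1,  1])   # stump 1's predictions
preds_2 = np.array([-1, -1,  1])   # stump 2's predictions
combined = 0.8 * preds_1 + 0.3 * preds_2   # [0.5, -1.1, 1.1]
print(np.sign(combined).astype(int))       # [ 1 -1  1]

Where the stumps disagree, the higher-alpha stump wins the vote.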

5. Example Usage

if __name__ == "__main__":

    X, y = make_classification(n_samples=1000, n_features=20, n_classes=2, random_state=42)
    y = np.where(y == 0, -1, 1)   # AdaBoost expects labels in {-1, +1}
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

    adaboost = AdaBoost(n_estimators=50)
    adaboost.fit(X_train, y_train)

    predictions = adaboost.predict(X_test)

    accuracy = accuracy_score(y_test, predictions)
    print(f"Accuracy: {accuracy * 100:.2f}%")

Output:

Accuracy: 93.33%

In this case, the AdaBoost model achieves an accuracy of around 93% on the synthetic dataset.
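
As an optional sanity check (not part of the walkthrough above), you can compare the from-scratch model against scikit-learn's built-in AdaBoostClassifier on the same train/test split; the two should land in the same accuracy ballpark:

from sklearn.ensemble import AdaBoostClassifier

sk_model = AdaBoostClassifier(n_estimators=50, random_state=42)
sk_model.fit(X_train, y_train)
sk_acc = accuracy_score(y_test, sk_model.predict(X_test))
print(f"scikit-learn accuracy: {sk_acc * 100:.2f}%")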


