
Implementing the AdaBoost Algorithm From Scratch

Last Updated : 04 Feb, 2025

AdaBoost stands for Adaptive Boosting and is a powerful ensemble learning technique that combines multiple weak classifiers to create a strong classifier. It works by sequentially adding classifiers that correct the errors made by previous models, giving more weight to the misclassified data points.
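In the standard binary formulation (labels \(y_i \in \{-1, +1\}\)), each boosting round \(t\) fits a weak learner \(h_t\), computes its weighted error, converts that error into a vote weight \(\alpha_t\), and reweights the samples:

\[
\varepsilon_t = \frac{\sum_i w_i \,\mathbb{1}[h_t(x_i) \neq y_i]}{\sum_i w_i},
\qquad
\alpha_t = \tfrac{1}{2} \ln\frac{1 - \varepsilon_t}{\varepsilon_t},
\qquad
w_i \leftarrow \frac{w_i \, e^{-\alpha_t \, y_i \, h_t(x_i)}}{Z_t}
\]

where \(Z_t\) normalizes the weights so they sum to 1. The code below implements exactly these three steps.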

In this article we will implement the AdaBoost algorithm from scratch. Building it ourselves gives a deeper understanding of how AdaBoost works and the key principles behind it.

[Figure: Boosting Algorithms]

Python Implementation of AdaBoost

Python provides ready-made packages for applying AdaBoost, but here we will implement the algorithm ourselves and apply it to a machine learning problem.

In this example we create a synthetic dataset to test the implementation.

1. Import Libraries

Let's begin by importing the libraries we will need for our classification task:

import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

2. Defining the AdaBoost Class

class AdaBoost:
    def __init__(self, n_estimators=50):
        self.n_estimators = n_estimators
        self.alphas = []
        self.models = []
  • The AdaBoost class is initialized with the number of weak learners (n_estimators).
  • self.alphas: Stores the weight of each model based on its performance.
  • self.models: Stores the weak classifiers (decision stumps) used in AdaBoost.

3. Training the AdaBoost Model (Fit Method)

    def fit(self, X, y):
        # Note: the weight update below assumes labels y are in {-1, +1}
        n_samples, n_features = X.shape
        w = np.ones(n_samples) / n_samples   # start with uniform sample weights
  • n_samples, n_features: Retrieves the number of samples and features from the dataset.
  • w: Initializes sample weights uniformly.
        for _ in range(self.n_estimators):
            model = DecisionTreeClassifier(max_depth=1)   # decision stump
            model.fit(X, y, sample_weight=w)
            predictions = model.predict(X)
            err = np.sum(w * (predictions != y)) / np.sum(w)   # weighted error rate
            alpha = 0.5 * np.log((1 - err) / (err + 1e-10))    # model's vote weight
            self.alphas.append(alpha)
            self.models.append(model)
            w = w * np.exp(-alpha * y * predictions)   # upweight misclassified samples
            w = w / np.sum(w)                          # renormalize weights to sum to 1
  • err: Computes the weighted error rate, i.e. the fraction of total sample weight sitting on misclassified points.
  • alpha: Converts the error into the model's vote weight; models with lower error receive higher alpha. The small constant 1e-10 guards against division by zero when err is 0.
  • self.alphas.append(alpha): Stores the model’s weight in the list.
  • self.models.append(model): Stores the trained weak classifier in the list.
  • w: Updates the sample weights, increasing them for misclassified samples and decreasing them for correctly classified ones, then renormalizes them. A worked example of a single round is shown below.
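
To make the update concrete, here is one boosting round worked through by hand on a tiny made-up example (four samples, one misclassified; the numbers are illustrative and not taken from the article's dataset):

import numpy as np

y     = np.array([ 1, -1,  1, -1])       # true labels in {-1, +1}
preds = np.array([ 1, -1, -1, -1])       # the stump gets sample index 2 wrong
w     = np.ones(4) / 4                   # uniform weights: [0.25, 0.25, 0.25, 0.25]

err   = np.sum(w * (preds != y)) / np.sum(w)       # 0.25
alpha = 0.5 * np.log((1 - err) / (err + 1e-10))    # ~0.549

w = w * np.exp(-alpha * y * preds)   # misclassified sample grows, correct ones shrink
w = w / np.sum(w)                    # ~[0.167, 0.167, 0.5, 0.167]

The misclassified sample now carries half of the total weight, so the next stump is strongly encouraged to get it right.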

4. Making Predictions

    def predict(self, X):
        strong_preds = np.zeros(X.shape[0])
        for model, alpha in zip(self.models, self.alphas):
            strong_preds += alpha * model.predict(X)   # each stump votes, weighted by its alpha
        return np.sign(strong_preds).astype(int)       # final label in {-1, +1}
  • strong_preds: Accumulates the alpha-weighted votes of all weak classifiers; the sign of the total determines the final label. A toy illustration is shown below.
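
As a toy illustration of this weighted vote (the alpha values 0.8 and 0.3 are made up, not taken from a trained model):

import numpy as np

preds_1 = np.array([ 1, -1,  1])   # stump 1's predictions
preds_2 = np.array([-1, -1,  1])   # stump 2's predictions
combined = 0.8 * preds_1 + 0.3 * preds_2   # [0.5, -1.1, 1.1]
print(np.sign(combined).astype(int))       # [ 1 -1  1]

Where the stumps disagree, the higher-alpha stump wins the vote.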

5. Example Usage

if __name__ == "__main__":

    X, y = make_classification(n_samples=1000, n_features=20, n_classes=2, random_state=42)
    y = np.where(y == 0, -1, 1)   # AdaBoost expects labels in {-1, +1}
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

    adaboost = AdaBoost(n_estimators=50)
    adaboost.fit(X_train, y_train)

    predictions = adaboost.predict(X_test)

    accuracy = accuracy_score(y_test, predictions)
    print(f"Accuracy: {accuracy * 100:.2f}%")

Output:

Accuracy: 93.33%

In this case, the AdaBoost model achieves an accuracy of around 93% on the synthetic dataset.
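
As an optional sanity check (not part of the walkthrough above), you can compare the from-scratch model against scikit-learn's built-in AdaBoostClassifier on the same train/test split; the two should land in the same accuracy ballpark:

from sklearn.ensemble import AdaBoostClassifier

sk_model = AdaBoostClassifier(n_estimators=50, random_state=42)
sk_model.fit(X_train, y_train)
sk_acc = accuracy_score(y_test, sk_model.predict(X_test))
print(f"scikit-learn accuracy: {sk_acc * 100:.2f}%")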


