
Comparing changes

base repository: cerlymarco/linear-tree
base: 0.2.0
head repository: cerlymarco/linear-tree
compare: 0.3.0
  • 5 commits
  • 11 files changed
  • 1 contributor

Commits on Aug 27, 2021

  1. Release (cerlymarco committed Aug 27, 2021, SHA a3e51ed)
  2. Release (cerlymarco committed Aug 27, 2021, SHA 6e5a2e8)

Commits on Aug 28, 2021

  1. Update README.md (cerlymarco authored Aug 28, 2021, SHA b5b8660)

Commits on Aug 31, 2021

  1. Release (cerlymarco committed Aug 31, 2021, SHA a2cbfb4)
  2. Release (cerlymarco committed Aug 31, 2021, SHA 02beefd)
40 changes: 37 additions & 3 deletions README.md
@@ -1,13 +1,17 @@
# linear-tree
A python library to build Model Trees with Linear Models at the leaves.

+linear-tree also provides implementations of _LinearForest_ and _LinearBoost_, inspired by [these works](https://github.com/cerlymarco/linear-tree#references).

## Overview
**Linear Trees** combine the learning ability of Decision Trees with the predictive and explanatory power of Linear Models.
Like in tree-based algorithms, the data are split according to simple decision rules. The goodness of splits is evaluated in terms of gain by fitting Linear Models in the nodes. This implies that the models in the leaves are linear instead of constant approximations, as in classical Decision Trees.

-**Linear Boosting**, available in linear-tree package, is a two stage learning process. Firstly, a linear model is trained on the initial dataset to obtain predictions. Secondly, the residuals of the previous step are modeled with a decision tree using all the available features. The tree identifies the path leading to the highest error (i.e. the worst leaf). The leaf contributing most to the error is used to generate a new binary feature to be used in the first stage. The iterations continue until a certain stopping criterion is met.
+**Linear Forests** generalize the well-known Random Forests by combining Linear Models with them. The key idea is to use the strength of Linear Models to improve the nonparametric learning ability of tree-based algorithms. Firstly, a Linear Model is fitted on the whole dataset; then a Random Forest is trained on the same dataset, but using the residuals of the previous step as target. The final predictions are the sum of the raw linear predictions and the residuals modeled by the Random Forest (a minimal sketch follows below).

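The two-stage idea can be sketched with plain scikit-learn components; this is a minimal illustration only, not the package implementation, which adds sample weighting, classification support, and the full estimator API:

```python
# Minimal sketch of the Linear Forest idea with plain scikit-learn.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

X, y = make_regression(n_samples=200, n_features=4, random_state=0)

linear = LinearRegression().fit(X, y)             # stage 1: linear fit
resid = y - linear.predict(X)                     # what the linear model misses
forest = RandomForestRegressor(random_state=0).fit(X, resid)  # stage 2

y_pred = linear.predict(X) + forest.predict(X)    # sum of the two stages
```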
+**Linear Boosting** is a two stage learning process. Firstly, a linear model is trained on the initial dataset to obtain predictions. Secondly, the residuals of the previous step are modeled with a decision tree using all the available features. The tree identifies the path leading to the highest error (i.e. the worst leaf). The leaf contributing most to the error is used to generate a new binary feature to be used in the first stage. The iterations continue until a certain stopping criterion is met (a single iteration is sketched below).

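A minimal sketch of a single boosting iteration, assuming squared-error residuals; the worst-leaf selection shown here is one plausible reading of the description above, while the package supports several losses and stopping criteria:

```python
# One Linear Boosting iteration, sketched: fit a linear model, grow a
# tree on its residuals, and turn the worst leaf into a binary feature.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=200, n_features=4, random_state=0)

linear = LinearRegression().fit(X, y)
resid = y - linear.predict(X)

tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, resid)
leaves = tree.apply(X)                           # leaf id of each sample
sse = {leaf: np.sum(resid[leaves == leaf] ** 2) for leaf in np.unique(leaves)}
worst = max(sse, key=sse.get)                    # leaf with the highest error

X_aug = np.column_stack([X, (leaves == worst).astype(float)])
linear = LinearRegression().fit(X_aug, y)        # back to the first stage
```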
-**linear-tree is developed to be fully integrable with scikit-learn**. ```LinearTreeRegressor``` and ```LinearTreeClassifier``` are provided as scikit-learn _BaseEstimator_. They are wrappers that build a decision tree on the data, fitting a linear estimator from ```sklearn.linear_model```. ```LinearBoostRegressor``` and ```LinearBoostClassifier``` are also available as _TransformerMixin_, so they can be integrated into any pipeline for automated feature engineering. All the models available in [sklearn.linear_model](https://scikit-learn.org/stable/modules/classes.html#module-sklearn.linear_model) can be used as base learners.
+**linear-tree is developed to be fully integrable with scikit-learn**. ```LinearTreeRegressor``` and ```LinearTreeClassifier``` are provided as scikit-learn _BaseEstimator_ to build a decision tree using linear estimators. ```LinearForestRegressor``` and ```LinearForestClassifier``` use the _RandomForest_ from sklearn to model residuals. ```LinearBoostRegressor``` and ```LinearBoostClassifier``` are also available as _TransformerMixin_, so they can be integrated into any pipeline for automated feature engineering. All the models available in [sklearn.linear_model](https://scikit-learn.org/stable/modules/classes.html#module-sklearn.linear_model) can be used as base learners.

## Installation
```shell
@@ -19,6 +23,7 @@ The module depends on NumPy, SciPy and Scikit-Learn (>=0.23.0). Python 3.6 or ab
- [Linear Tree: the perfect mix of Linear Model and Decision Tree](https://towardsdatascience.com/linear-tree-the-perfect-mix-of-linear-model-and-decision-tree-2eaed21936b7)
- [Model Tree: handle Data Shifts mixing Linear Model and Decision Tree](https://towardsdatascience.com/model-tree-handle-data-shifts-mixing-linear-model-and-decision-tree-facfd642e42b)
- [Explainable AI with Linear Trees](https://towardsdatascience.com/explainable-ai-with-linear-trees-7e30a6f067d7)
- [Improve Linear Regression for Time Series Forecasting](https://towardsdatascience.com/improve-linear-regression-for-time-series-forecasting-e36f3c3e3534#a80b-b6010ccb1c21)

## Usage
##### Linear Tree Regression
@@ -43,6 +48,28 @@ X, y = make_classification(n_samples=100, n_features=4,
clf = LinearTreeClassifier(base_estimator=RidgeClassifier())
clf.fit(X, y)
```
##### Linear Forest Regression
```python
from sklearn.linear_model import LinearRegression
from lineartree import LinearForestRegressor
from sklearn.datasets import make_regression
X, y = make_regression(n_samples=100, n_features=4,
                       n_informative=2, n_targets=1,
                       random_state=0, shuffle=False)
regr = LinearForestRegressor(base_estimator=LinearRegression())
regr.fit(X, y)
```
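After fitting, both stages can be inspected through standard scikit-learn attributes; `coef_`, `intercept_` and `feature_importances_` are set in `_LinearForest._fit` (see the `_classes.py` diff below):

```python
# Inspect the two fitted stages; attribute names as set in
# _LinearForest._fit (coef_/intercept_ from the linear stage,
# feature_importances_ from the residual forest).
print(regr.coef_, regr.intercept_)
print(regr.feature_importances_)
```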
##### Linear Forest Classification
```python
from sklearn.linear_model import LinearRegression
from lineartree import LinearForestClassifier
from sklearn.datasets import make_classification
X, y = make_classification(n_samples=100, n_features=4,
                           n_informative=2, n_redundant=0,
                           random_state=0, shuffle=False)
clf = LinearForestClassifier(base_estimator=LinearRegression())
clf.fit(X, y)
```
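Note that ```LinearForestClassifier``` also takes a linear *regressor* as ```base_estimator```, as in the example above: internally the class labels are mapped through a logit transform and the regressor is fitted on the transformed targets (see `_LinearForest._fit` in the `_classes.py` diff below).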
##### Linear Boosting Regression
```python
from sklearn.linear_model import LinearRegression
@@ -87,7 +114,14 @@ Extract and examine coefficients at the leaves:

![leaf coefficients](https://raw.githubusercontent.com/cerlymarco/linear-tree/master/imgs/leaf_coefficients.png)

-Impact of the features automatically generated with linear boosting:
+Impact of the features automatically generated with Linear Boosting:

![linear_boost_importances](https://raw.githubusercontent.com/cerlymarco/linear-tree/master/imgs/linear_boost_importances.png)

Comparing predictions of Linear Forest and Random Forest:

![linear_forest_predictions](https://raw.githubusercontent.com/cerlymarco/linear-tree/master/imgs/linear_forest_predictions.png)

## References
- Regression-Enhanced Random Forests. Haozhe Zhang, Dan Nettleton, Zhengyuan Zhu.
- Explainable boosted linear regression for time series forecasting. Igor Ilic, Berk Gorgulu, Mucahit Cevik, Mustafa Gokce Baydogan.
Binary file modified imgs/linear_boost_importances.png
Binary file added imgs/linear_forest_predictions.png
201 changes: 182 additions & 19 deletions lineartree/_classes.py
@@ -7,6 +7,9 @@

from sklearn.dummy import DummyClassifier
from sklearn.tree import DecisionTreeRegressor, DecisionTreeClassifier
+from sklearn.ensemble import RandomForestRegressor
+
+from sklearn.base import is_regressor
from sklearn.base import BaseEstimator, TransformerMixin

from sklearn.utils import check_array
@@ -212,18 +215,10 @@ class _LinearTree(BaseEstimator):
    Warning: This class should not be used directly. Use derived classes
    instead.
    """
-
-    def __init__(self,
-                 base_estimator,
-                 criterion,
-                 max_depth,
-                 min_samples_split,
-                 min_samples_leaf,
-                 max_bins,
-                 categorical_features,
-                 split_features,
-                 linear_features,
-                 n_jobs):
+    def __init__(self, base_estimator, *, criterion, max_depth,
+                 min_samples_split, min_samples_leaf, max_bins,
+                 categorical_features, split_features,
+                 linear_features, n_jobs):

        self.base_estimator = base_estimator
        self.criterion = criterion
@@ -854,8 +849,7 @@ class _LinearBoosting(TransformerMixin, BaseEstimator):
    Warning: This class should not be used directly. Use derived classes
    instead.
    """
-
-    def __init__(self, base_estimator, loss, n_estimators,
+    def __init__(self, base_estimator, *, loss, n_estimators,
                  max_depth, min_samples_split, min_samples_leaf,
                  min_weight_fraction_leaf, max_features,
                  random_state, max_leaf_nodes,
@@ -876,7 +870,7 @@ def __init__(self, base_estimator, loss, n_estimators,
        self.min_impurity_split = min_impurity_split
        self.ccp_alpha = ccp_alpha

-    def _fit(self, X, y, sample_weight):
+    def _fit(self, X, y, sample_weight=None):
        """Build a Linear Boosting from the training set (X, y).
        Parameters
@@ -890,9 +884,7 @@ def _fit(self, X, y, sample_weight):
            regression).
        sample_weight : array-like of shape (n_samples, ), default=None
-            Sample weights. If None, then samples are equally weighted.
-            Note that if the base estimator does not support sample weighting,
-            the sample weights are still used to evaluate the splits.
+            Sample weights.
        Returns
        -------
@@ -1006,4 +998,175 @@ def transform(self, X):
        pred_leaves = pred_leaves.reshape(-1, 1)
        X = np.concatenate([X, pred_leaves], axis=1)

        return X


class _LinearForest(BaseEstimator):
    """Base class for Linear Forest meta-estimator.
    Warning: This class should not be used directly. Use derived classes
    instead.
    """
    def __init__(self, base_estimator, *, n_estimators, max_depth,
                 min_samples_split, min_samples_leaf, min_weight_fraction_leaf,
                 max_features, max_leaf_nodes, min_impurity_decrease,
                 min_impurity_split, bootstrap, oob_score, n_jobs,
                 random_state, ccp_alpha, max_samples):

        self.base_estimator = base_estimator
        self.n_estimators = n_estimators
        self.max_depth = max_depth
        self.min_samples_split = min_samples_split
        self.min_samples_leaf = min_samples_leaf
        self.min_weight_fraction_leaf = min_weight_fraction_leaf
        self.max_features = max_features
        self.max_leaf_nodes = max_leaf_nodes
        self.min_impurity_decrease = min_impurity_decrease
        self.min_impurity_split = min_impurity_split
        self.bootstrap = bootstrap
        self.oob_score = oob_score
        self.n_jobs = n_jobs
        self.random_state = random_state
        self.ccp_alpha = ccp_alpha
        self.max_samples = max_samples

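    # Note: _sigmoid below is equivalent to scipy.special.expit; the direct
    # np.exp form can overflow for large positive inputs, where expit is the
    # numerically safer choice.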
    def _sigmoid(self, y):
        """Expit function (a.k.a. logistic sigmoid).
        Parameters
        ----------
        y : array-like of shape (n_samples, )
            The array to apply expit to element-wise.
        Returns
        -------
        y : array-like of shape (n_samples, )
            Expits.
        """
        return np.exp(y) / (1 + np.exp(y))

    def _inv_sigmoid(self, y):
        """Logit function.
        Parameters
        ----------
        y : array-like of shape (n_samples, )
            The array to apply logit to element-wise.
        Returns
        -------
        y : array-like of shape (n_samples, )
            Logits.
        """
        y = y.clip(1e-3, 1 - 1e-3)

        return np.log(y / (1 - y))

    def _fit(self, X, y, sample_weight=None):
        """Build a Linear Forest from the training set (X, y).
        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            The training input samples.
        y : array-like of shape (n_samples, ) or (n_samples, n_targets)
            The target values (class labels in classification, real numbers in
            regression). Multi-target regression is also supported.
        sample_weight : array-like of shape (n_samples, ), default=None
            Sample weights.
        Returns
        -------
        self : object
        """
        if not hasattr(self.base_estimator, 'fit_intercept'):
            raise ValueError("Only linear models are accepted as base_estimator. "
                             "Select one from linear_model class of scikit-learn.")

        if not is_regressor(self.base_estimator):
            raise ValueError("Select a regressor linear model as base_estimator.")

        n_sample, self.n_features_in_ = X.shape

        if hasattr(self, 'classes_'):
            class_to_int = dict(map(reversed, enumerate(self.classes_)))
            y = np.array([class_to_int[i] for i in y])
            y = self._inv_sigmoid(y)
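            # With binary labels in {0, 1}, the clip inside _inv_sigmoid maps
            # them to [1e-3, 1 - 1e-3], so the regression targets are roughly
            # +/-6.9 (i.e. +/-log(999)) before the linear fit below.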

        self.base_estimator_ = deepcopy(self.base_estimator)
        self.base_estimator_.fit(X, y, sample_weight)
        resid = y - self.base_estimator_.predict(X)

        self.forest_estimator_ = RandomForestRegressor(
            n_estimators=self.n_estimators,
            criterion='mse',
            max_depth=self.max_depth,
            min_samples_split=self.min_samples_split,
            min_samples_leaf=self.min_samples_leaf,
            min_weight_fraction_leaf=self.min_weight_fraction_leaf,
            max_features=self.max_features,
            max_leaf_nodes=self.max_leaf_nodes,
            min_impurity_decrease=self.min_impurity_decrease,
            min_impurity_split=self.min_impurity_split,
            bootstrap=self.bootstrap,
            oob_score=self.oob_score,
            n_jobs=self.n_jobs,
            random_state=self.random_state,
            ccp_alpha=self.ccp_alpha,
            max_samples=self.max_samples
        )
        self.forest_estimator_.fit(X, resid, sample_weight)

        if hasattr(self.base_estimator_, 'coef_'):
            self.coef_ = self.base_estimator_.coef_

        if hasattr(self.base_estimator_, 'intercept_'):
            self.intercept_ = self.base_estimator_.intercept_

        self.feature_importances_ = self.forest_estimator_.feature_importances_

        return self

    def apply(self, X):
        """Apply trees in the forest to X, return leaf indices.
        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            The input samples.
        Returns
        -------
        X_leaves : ndarray of shape (n_samples, n_estimators)
            For each datapoint x in X and for each tree in the forest,
            return the index of the leaf x ends up in.
        """
        check_is_fitted(self, attributes='base_estimator_')

        return self.forest_estimator_.apply(X)

    def decision_path(self, X):
        """Return the decision path in the forest.
        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            The input samples.
        Returns
        -------
        indicator : sparse matrix of shape (n_samples, n_nodes)
            Return a node indicator matrix where non-zero elements indicate
            that the sample goes through the nodes. The matrix is in CSR
            format.
        n_nodes_ptr : ndarray of shape (n_estimators + 1, )
            The columns from indicator[n_nodes_ptr[i]:n_nodes_ptr[i+1]]
            give the indicator value for the i-th estimator.
        """
        check_is_fitted(self, attributes='base_estimator_')

        return self.forest_estimator_.decision_path(X)
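The public estimators built on this base class live in `lineartree.py`, whose diff is not rendered below. Going by the README description above, the final prediction sums the two stages; a hedged sketch for the regression case (the actual `predict` is implemented in the derived classes):

```python
# Hedged sketch of regression prediction, following the README:
# final prediction = raw linear predictions + forest-modeled residuals.
def predict_sketch(model, X):
    raw = model.base_estimator_.predict(X)            # linear stage
    correction = model.forest_estimator_.predict(X)   # residual forest stage
    return raw + correction
```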
623 changes: 583 additions & 40 deletions lineartree/lineartree.py

Large diffs are not rendered by default.

843 changes: 832 additions & 11 deletions notebooks/README.md

Large diffs are not rendered by default.

96 changes: 91 additions & 5 deletions notebooks/plots.ipynb

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion notebooks/usage-LinearBoost.ipynb
@@ -302,7 +302,7 @@
{
"data": {
"text/plain": [
"((8000, 65), (8000, 2), 0.9999998993609571)"
"((8000, 65), (8000, 2), 0.9999999647597971)"
]
},
"execution_count": 13,
351 changes: 351 additions & 0 deletions notebooks/usage-LinearForest.ipynb
@@ -0,0 +1,351 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"from sklearn.linear_model import *\n",
"from lineartree import LinearForestClassifier, LinearForestRegressor\n",
"\n",
"from sklearn.datasets import make_classification, make_regression\n",
"\n",
"import warnings\n",
"warnings.simplefilter('ignore')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# REGRESSION"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"((8000, 15), (8000,))"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"n_sample, n_features = 8000, 15\n",
"X, y = make_regression(n_samples=n_sample, n_features=n_features, n_targets=1, \n",
" n_informative=5, shuffle=True, random_state=33)\n",
"\n",
"X.shape, y.shape"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"LinearForestRegressor(base_estimator=Ridge())"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"regr = LinearForestRegressor(Ridge())\n",
"regr.fit(X, y)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"((8000,), (8000, 100), (101,), 0.9999999999366492)"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"regr.predict(X).shape, regr.apply(X).shape, regr.decision_path(X)[-1].shape, regr.score(X,y)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### multi-target regression with weights "
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"((8000, 15), (8000, 2))"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"n_sample, n_features = 8000, 15\n",
"X, y = make_regression(n_samples=n_sample, n_features=n_features, n_targets=2, \n",
" n_informative=5, shuffle=True, random_state=33)\n",
"W = np.random.uniform(1,3, (n_sample,))\n",
"\n",
"X.shape, y.shape"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"LinearForestRegressor(base_estimator=Ridge())"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"regr = LinearForestRegressor(Ridge())\n",
"regr.fit(X, y, W)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"((8000, 2), (8000, 100), (101,), 0.9999999999795455)"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"regr.predict(X).shape, regr.apply(X).shape, regr.decision_path(X)[-1].shape, regr.score(X,y)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# BINARY CLASSIFICATION"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"((8000, 15), (8000,))"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"n_sample, n_features = 8000, 15\n",
"X, y = make_classification(n_samples=n_sample, n_features=n_features, n_classes=2, \n",
" n_redundant=4, n_informative=5,\n",
" n_clusters_per_class=1,\n",
" shuffle=True, random_state=33)\n",
"\n",
"X.shape, y.shape"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### default configuration"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"LinearForestClassifier(base_estimator=Ridge())"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"clf = LinearForestClassifier(Ridge())\n",
"clf.fit(X, y)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"((8000,), (8000, 2), (8000, 100), (101,), 1.0)"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"clf.predict(X).shape, clf.predict_proba(X).shape, clf.apply(X).shape, clf.decision_path(X)[-1].shape, clf.score(X,y)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# MULTI-CLASS CLASSIFICATION"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"((8000, 15), (8000,))"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"n_sample, n_features = 8000, 15\n",
"X, y = make_classification(n_samples=n_sample, n_features=n_features, n_classes=3, \n",
" n_redundant=4, n_informative=5,\n",
" n_clusters_per_class=1,\n",
" shuffle=True, random_state=33)\n",
"\n",
"X.shape, y.shape"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### default configuration"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"OneVsRestClassifier(estimator=LinearForestClassifier(base_estimator=Ridge()))"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from sklearn.multiclass import OneVsRestClassifier\n",
"\n",
"clf = OneVsRestClassifier(LinearForestClassifier(Ridge()))\n",
"clf.fit(X, y)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"((8000,), (8000, 3), 1.0)"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"clf.predict(X).shape, clf.predict_proba(X).shape, clf.score(X,y)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
42 changes: 34 additions & 8 deletions notebooks/usage-LinearTree.ipynb
@@ -30,7 +30,18 @@
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"outputs": [
{
"data": {
"text/plain": [
"((5000, 10), (5000,))"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"X, y = make_regression(n_samples=5000, n_features=10,\n",
" n_informative=3, n_targets=1,\n",
@@ -42,7 +53,9 @@
"\n",
"t = X[:,5] > np.quantile(X[:,5], 0.7)\n",
"y[t] += X[t][:,6]*X[t][:,7]\n",
"y[~t] += X[~t][:,8]*X[~t][:,9]"
"y[~t] += X[~t][:,8]*X[~t][:,9]\n",
"\n",
"X.shape, y.shape"
]
},
{
@@ -495,7 +508,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### mse criterion"
"### mae criterion"
]
},
{
@@ -640,7 +653,18 @@
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"outputs": [
{
"data": {
"text/plain": [
"((5000, 10), (5000, 2))"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"X, y = make_regression(n_samples=5000, n_features=10,\n",
" n_informative=3, n_targets=2,\n",
@@ -654,7 +678,9 @@
"y[np.ix_(t,[1])] += (X[t][:,6]*X[t][:,7]).reshape(-1,1)\n",
"y[np.ix_(~t,[1])] += (X[~t][:,8]*X[~t][:,9]).reshape(-1,1)\n",
"\n",
"W = np.random.uniform(1,3, (X.shape[0],))"
"W = np.random.uniform(1,3, (X.shape[0],))\n",
"\n",
"X.shape, y.shape"
]
},
{
@@ -686,8 +712,8 @@
{
"data": {
"text/plain": [
"array([0. , 0.12975824, 0. , 0.27123609, 0.13427062,\n",
" 0. , 0. , 0. , 0.25395981, 0.21077524])"
"array([0. , 0.1470223 , 0. , 0.30881021, 0. ,\n",
" 0. , 0. , 0. , 0.30423059, 0.2399369 ])"
]
},
"execution_count": 28,
@@ -707,7 +733,7 @@
{
"data": {
"text/plain": [
"((5000, 2), (5000,), (5000, 15), 0.9999138806604908)"
"((5000, 2), (5000,), (5000, 13), 0.9999120763442333)"
]
},
"execution_count": 29,
2 changes: 1 addition & 1 deletion setup.py
@@ -3,7 +3,7 @@

HERE = pathlib.Path(__file__).parent

-VERSION = '0.2.0'
+VERSION = '0.3.0'
PACKAGE_NAME = 'linear-tree'
AUTHOR = 'Marco Cerliani'
AUTHOR_EMAIL = 'cerlymarco@gmail.com'