Designing Machine Learning Workflows in Python, Chapter 1
credit_scoring.head(4)
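from sklearn.preprocessing import LabelEncoder
# Map each distinct category string to an integer code.
le = LabelEncoder()
le.fit_transform(credit_scoring['checking_status'])[:4]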
array([1, 0, 3, 1])
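To see where integer codes like the ones above come from, here is a minimal sketch on a toy column. The category strings are hypothetical stand-ins, not the actual values in `credit_scoring['checking_status']`. `LabelEncoder` sorts the distinct labels and uses each label's position in that sorted list as its code.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for credit_scoring['checking_status'].
checking_status = pd.Series(['<0', '0<=X<200', 'no checking', '<0'])

le = LabelEncoder()
codes = le.fit_transform(checking_status)

# classes_ holds the sorted distinct labels; a code is an index into it.
mapping = {label: code for code, label in enumerate(le.classes_)}
print(codes)
print(mapping)

# inverse_transform recovers the original strings from the codes.
recovered = le.inverse_transform(codes)
```

The codes carry no meaningful order beyond the alphabetical sort, which is worth keeping in mind before feeding them to a model that treats inputs as numeric magnitudes.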
.predict(features)
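from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import AdaBoostClassifier
# Every scikit-learn classifier exposes the same fit/predict interface.
model_nb = GaussianNB()
model_nb.fit(features, labels)
model_nb.predict(features.head(5))
model_ab = AdaBoostClassifier()
model_ab.fit(features, labels)
model_ab.predict(features.head(5))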
numpy.array(labels[0:5])
0.706
0.802
GaussianNB().fit(X_train, y_train).predict(X_test)
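The one-liner above assumes `X_train`, `y_train` and `X_test` already exist. A minimal sketch of how such a split might be produced follows. It uses synthetic data from `make_classification` as a stand-in for the credit scoring features, and the 25% test size is an arbitrary choice for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in data (assumption): 200 rows, 5 numeric features.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Hold out a quarter of the rows so accuracy is measured on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Fit on the training rows only, then predict the held-out rows.
preds = GaussianNB().fit(X_train, y_train).predict(X_test)
test_accuracy = accuracy_score(y_test, preds)
print(test_accuracy)
```

Scoring on held-out rows rather than on the rows used for fitting gives a less optimistic estimate of how the model behaves on new data.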
help(RandomForestClassifier)
m2.estimators_[0]
m4.estimators_[0]
cross_val_score(RandomForestClassifier(), X, y)
numpy.mean(cross_val_score(RandomForestClassifier(), X, y))
0.7589
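The cross-validation calls above can be sketched end to end as follows. The data is synthetic (an assumption, since `X` and `y` are not defined on the slide), and `cv=5`, `n_estimators=50` and `random_state=0` are set explicitly here only to make the run small and repeatable.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (assumption).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# One accuracy score per fold: the model is refit on 4 folds and
# scored on the remaining fold, 5 times over.
scores = cross_val_score(
    RandomForestClassifier(n_estimators=50, random_state=0), X, y, cv=5)

# Averaging the fold scores gives a single summary estimate.
mean_score = np.mean(scores)
print(scores, mean_score)
```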
{'max_depth': 10}
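A dictionary of this shape is what `GridSearchCV.best_params_` returns, so the following sketch shows one way such a result could be obtained. The candidate depths, the synthetic data and the forest settings are all assumptions made for illustration; the slide does not show the grid that was actually searched.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data (assumption).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Hypothetical grid: each listed depth is evaluated by cross-validation.
param_grid = {'max_depth': [5, 10, 20]}

grid = GridSearchCV(
    RandomForestClassifier(n_estimators=25, random_state=0),
    param_grid, cv=3)
grid.fit(X, y)

# best_params_ maps each tuned parameter name to its winning value.
best = grid.best_params_
print(best)
```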
arrhythmias.head()
numpy.unique(LabelEncoder().fit_transform(credit_scoring['purpose']))
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
purpose_business                   0
purpose_buy_domestic_appliance     0
purpose_buy_furniture_equipment    0
purpose_buy_new_car                0
purpose_buy_radio_tv               1
purpose_buy_used_car               0
purpose_education                  0
purpose_other                      0
purpose_repairs                    0
purpose_retraining                 0
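The listing above has the form of a single row of a one-hot (dummy) encoding, with one column per category and a 1 only in the column matching that row's value. A minimal sketch with `pd.get_dummies` is shown below, using a small hypothetical sample of `purpose` values rather than the real column.

```python
import pandas as pd

# Hypothetical sample of credit_scoring['purpose'] values.
purpose = pd.Series(
    ['buy_radio_tv', 'education', 'buy_new_car', 'buy_radio_tv'],
    name='purpose')

# One 0/1 column per distinct category, named '<prefix>_<category>'.
dummies = pd.get_dummies(purpose, prefix='purpose', dtype=int)

# A single row is all zeros except for its own category.
first_row = dummies.iloc[0]
print(first_row)
```

Unlike label encoding, this representation does not impose an artificial ordering on the categories, at the cost of one column per category.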
# Replace underscores with spaces so each word becomes a separate token.
credit_scoring['purpose'] = credit_scoring['purpose'].apply(
    lambda s: ' '.join(s.split('_')))
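# vec is not defined on the slide; it is assumed to be a text vectorizer
# such as sklearn's CountVectorizer, which returns a sparse matrix.
dummy_matrix = vec.fit_transform(credit_scoring['purpose']).toarray()
pd.DataFrame(dummy_matrix, columns=vec.get_feature_names_out()).head()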