Home Work
https://www.kaggle.com/azzion/iris-data-set-classification-using-neural-network/data
Classify the Iris data set using an Artificial Neural Network (ANN)
# Testing the prediction: labels are one-hot with one column per example,
# so argmax is taken over axis 0
correct_prediction = tf.equal(tf.argmax(y_softmax, axis=0), tf.argmax(Y_train_flatten, axis=0))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
1/13/20 1
print("the Accuracy is :" + str(sess.run(accuracy, feed_dict={X: X_train_flatten, Y: Y_train_flatten})))
Accuracy: 0.94
Accuracy: 0.78
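As a sanity check, the argmax-based accuracy computed above can be reproduced in plain NumPy. The softmax outputs and one-hot labels below are made-up values, laid out with one column per example as the axis-0 argmax assumes:

```python
import numpy as np

# Hypothetical softmax outputs for 4 examples and 3 Iris classes
# (columns are examples, rows are classes).
y_softmax = np.array([[0.8, 0.1, 0.2, 0.6],
                      [0.1, 0.7, 0.3, 0.3],
                      [0.1, 0.2, 0.5, 0.1]])
# Matching one-hot labels (made up): classes 0, 1, 2, 1.
Y_train_flatten = np.array([[1, 0, 0, 0],
                            [0, 1, 0, 1],
                            [0, 0, 1, 0]])

# Same computation as tf.equal + tf.reduce_mean above.
correct = np.argmax(y_softmax, axis=0) == np.argmax(Y_train_flatten, axis=0)
accuracy = correct.mean()
print(accuracy)  # 3 of the 4 column-wise predictions match -> 0.75
```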
crew. Although there was some element of luck involved in surviving the sinking, some groups of people were more likely to survive than others, such as women, children, and the upper class. Classify which people were likely to survive using the following classifiers:
a. Artificial Neural Networks
Creating the neural network model (layer sizes follow the summary below; the activations and dropout rate are assumptions):
def create_model():
    model = Sequential()
    model.add(Dense(8, input_dim=16, activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(1, activation='sigmoid'))
    return model
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 8) 136
_________________________________________________________________
dropout_1 (Dropout) (None, 8) 0
_________________________________________________________________
dense_2 (Dense) (None, 1) 9
=================================================================
Total params: 145
Trainable params: 145
Non-trainable params: 0
_________________________________________________________________
None
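The Param # column of the summary can be verified by hand: a Dense layer has inputs × units weights plus one bias per unit, and Dropout adds no parameters (the 16-feature input size is inferred from dense_1's 136 parameters):

```python
# Sanity-check the Param # column of the Keras summary:
# a Dense layer has (inputs * units + units) parameters.
dense_1 = 16 * 8 + 8   # 136 parameters (16 input features assumed)
dropout_1 = 0          # Dropout has no trainable parameters
dense_2 = 8 * 1 + 1    # 9 parameters
total = dense_1 + dropout_1 + dense_2
print(dense_1, dense_2, total)  # 136 9 145
```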
b. Compare performance to Logistic Regression or Naïve Bayes
import pandas as pd
import matplotlib.pyplot as plt
plt.rc("font", size=14)
import seaborn as sns
sns.set(style="white") #white background style for seaborn plots
sns.set(style="whitegrid", color_codes=True)
# Read CSV train data file into DataFrame
train_df = pd.read_csv("../input/train.csv")
cols = ["Age","Fare","TravelAlone","Pclass_1","Pclass_2","Embarked_C","Embarked_S","Sex_male","IsMinor"]
# final_train is the preprocessed train_df (dummy variables added, missing values imputed)
X = final_train[cols]
y = final_train['Survived']
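For reference, dummy columns such as Pclass_1, Embarked_C, and Sex_male are typically produced with pd.get_dummies. The small frame below is a made-up stand-in for train_df (TravelAlone and IsMinor would come from other preprocessing steps not shown here):

```python
import pandas as pd

# Made-up rows standing in for the Titanic training data.
df = pd.DataFrame({"Pclass": [1, 3, 2],
                   "Embarked": ["C", "S", "S"],
                   "Sex": ["female", "male", "male"]})

# One-hot encode the categorical columns; each category becomes
# its own 0/1 column (Pclass_1, Embarked_C, Sex_male, ...).
dummies = pd.get_dummies(df, columns=["Pclass", "Embarked", "Sex"])
print(list(dummies.columns))
```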
# Build a logistic regression model and compute the feature importances
from sklearn.linear_model import LogisticRegression
from sklearn.feature_selection import RFE
model = LogisticRegression()
# create the RFE model and select 8 attributes
rfe = RFE(model, n_features_to_select=8)
rfe = rfe.fit(X, y)
# summarize the selection of the attributes
print('Selected features: %s' % list(X.columns[rfe.support_]))
from sklearn.feature_selection import RFECV
# Create the RFE object and compute a cross-validated score.
# The "accuracy" scoring is proportional to the number of correct classifications
rfecv = RFECV(estimator=LogisticRegression(), step=1, cv=10, scoring='accuracy')
rfecv.fit(X, y)
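The same RFE/RFECV workflow can be sketched self-contained on synthetic data; make_classification below is a made-up stand-in for the 9 Titanic feature columns, which are not reproduced here:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE, RFECV
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the 9-column Titanic feature matrix.
X, y = make_classification(n_samples=200, n_features=9,
                           n_informative=4, random_state=0)

# Recursive feature elimination: keep the 8 best features.
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=8)
rfe = rfe.fit(X, y)
print(rfe.support_.sum())  # 8 features kept; ranking_ == 1 marks them

# Cross-validated variant: let CV pick the number of features.
rfecv = RFECV(LogisticRegression(max_iter=1000), step=1, cv=5,
              scoring='accuracy')
rfecv.fit(X, y)
print(rfecv.n_features_)   # number of features chosen by cross-validation
```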
plt.subplots(figsize=(8, 5))
sns.heatmap(X.corr(), annot=True, cmap="RdYlGn")
plt.show()
# fpr, tpr, thr are assumed to come from sklearn.metrics.roc_curve on the test-set probabilities
idx = np.min(np.where(tpr > 0.95))  # index of the first threshold for which the sensitivity > 0.95
plt.figure()
plt.plot(fpr, tpr, color='coral', label='ROC curve (area = %0.3f)' % auc(fpr, tpr))
plt.plot([0, 1], [0, 1], 'k--')
plt.plot([0, fpr[idx]], [tpr[idx], tpr[idx]], 'b--')
plt.plot([fpr[idx], fpr[idx]], [0, tpr[idx]], 'b--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate (1 - specificity)', fontsize=14)
plt.ylabel('True Positive Rate (recall)', fontsize=14)
plt.title('Receiver operating characteristic (ROC) curve')
plt.legend(loc="lower right")
plt.show()
print("Using a threshold of %.3f " % thr[idx] + "guarantees a sensitivity of %.3f " % tpr[idx] +
      "and a specificity of %.3f" % (1 - fpr[idx]) +
      ", i.e. a false positive rate of %.2f%%." % (np.array(fpr[idx]) * 100))
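The threshold-selection step above can be run end-to-end on synthetic scores; y_true and scores below are made-up stand-ins for the test labels and the classifier's predicted probabilities:

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

# Made-up labels and scores correlated with them.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 500)
scores = y_true + rng.normal(0, 0.8, 500)

fpr, tpr, thr = roc_curve(y_true, scores)
# First threshold whose sensitivity (recall) exceeds 0.95.
idx = np.min(np.where(tpr > 0.95))
print("AUC = %.3f" % auc(fpr, tpr))
print("threshold %.3f -> sensitivity %.3f, specificity %.3f"
      % (thr[idx], tpr[idx], 1 - fpr[idx]))
```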
https://www.kaggle.com/jamesleslie/titanic-neural-network-for-beginners
Q3 Review the ANN Algorithm
Apply ann.py to the churn-modelling dataset.
# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# categorical_features was removed from OneHotEncoder in recent scikit-learn;
# a ColumnTransformer encodes the categorical column (index 1) instead
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
ct = ColumnTransformer([('onehot', OneHotEncoder(), [1])], remainder='passthrough')
X = ct.fit_transform(X)
X = X[:, 1:]  # drop one dummy column to avoid the dummy-variable trap
# Predicting the Test set results
y_pred = classifier.predict(X_test)
y_pred = (y_pred > 0.5)
Accuracy = 0.8625
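The 0.5 cut-off above simply binarizes the classifier's sigmoid outputs; a minimal sketch with made-up probabilities:

```python
import numpy as np

# Hypothetical sigmoid outputs from the churn classifier.
y_prob = np.array([0.91, 0.12, 0.55, 0.49])

# Same 0.5 cut-off as classifier.predict output above.
y_pred = y_prob > 0.5
print(y_pred)  # [ True False  True False]
```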