Machine Learning Lab: Raheel Aslam (74-FET/BSEE/F16)
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()
X = pd.DataFrame(iris.data, columns=iris.feature_names)
y = pd.Categorical.from_codes(iris.target, iris.target_names)
X.shape
X.head()
iris.target_names
df = X.join(pd.Series(y, name='class'))
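class_feature_means = pd.DataFrame(columns=iris.target_names)
for c, rows in df.groupby('class'):
    # Average only the numeric feature columns; the categorical 'class' column has no mean
    class_feature_means[c] = rows.mean(numeric_only=True)
class_feature_means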
within_class_scatter_matrix = np.zeros((4,4))
for c, rows in df.groupby('class'):
    rows = rows.drop(['class'], axis=1)
    s = np.zeros((4,4))
    for index, row in rows.iterrows():
        x, mc = row.values.reshape(4,1), class_feature_means[c].values.reshape(4,1)
        s += (x - mc).dot((x - mc).T)
    within_class_scatter_matrix += s
feature_means = df.mean(numeric_only=True)
between_class_scatter_matrix = np.zeros((4,4))
for c in class_feature_means:
    n = len(df.loc[df['class'] == c].index)
    mc, m = class_feature_means[c].values.reshape(4,1), feature_means.values.reshape(4,1)
    between_class_scatter_matrix += n * (mc - m).dot((mc - m).T)
eigen_values, eigen_vectors = np.linalg.eig(np.linalg.inv(within_class_scatter_matrix).dot(between_class_scatter_matrix))
pairs = [(np.abs(eigen_values[i]), eigen_vectors[:,i]) for i in range(len(eigen_values))]
pairs = sorted(pairs, key=lambda x: x[0], reverse=True)
for pair in pairs:
    print(pair[0])
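eigen_value_sums = sum(eigen_values)
print('Explained Variance')
# Share of the total eigenvalue sum carried by each discriminant direction
for i, pair in enumerate(pairs):
    print('Eigenvector {}: {}'.format(i, (pair[0] / eigen_value_sums).real))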
Output:
3.3340137930233347
0.027034739042874168
3.379438918053471e-16
3.379438918053471e-16
Explained Variance
Eigenvector 0: 0.9919564568065745
Eigenvector 1: 0.008043543193425574
Eigenvector 2: 1.0054716216715731e-16
Eigenvector 3: 1.0054716216715731e-16
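
The output above shows that the first two eigenvectors carry essentially all of the class-separating information, so a natural next step is to project the data onto those two directions. The following is a minimal sketch of that step, not part of the original lab code. The function name lda_projection and the synthetic three-class data are assumptions made for illustration; the scatter matrices are computed in vectorized form instead of row by row, which should be equivalent to the loops used above.

```python
import numpy as np

def lda_projection(X, y, n_components=2):
    """Project X onto its leading discriminant directions (hypothetical helper)."""
    n_features = X.shape[1]
    overall_mean = X.mean(axis=0)
    S_W = np.zeros((n_features, n_features))  # within-class scatter
    S_B = np.zeros((n_features, n_features))  # between-class scatter
    for label in np.unique(y):
        X_c = X[y == label]
        mean_c = X_c.mean(axis=0)
        centered = X_c - mean_c
        S_W += centered.T @ centered
        diff = (mean_c - overall_mean).reshape(-1, 1)
        S_B += X_c.shape[0] * (diff @ diff.T)
    # Same eigenproblem as in the lab code: inv(S_W) . S_B
    eig_vals, eig_vecs = np.linalg.eig(np.linalg.inv(S_W) @ S_B)
    # Sort directions by eigenvalue magnitude, largest first
    order = np.argsort(np.abs(eig_vals))[::-1]
    W = eig_vecs[:, order[:n_components]].real
    ratios = np.abs(eig_vals)[order] / np.abs(eig_vals).sum()
    return X @ W, W, ratios

# Synthetic stand-in data (assumption): 3 classes, 4 features, 50 samples each
rng = np.random.default_rng(0)
means = np.array([[0.0, 0.0, 0.0, 0.0],
                  [3.0, 0.0, 1.0, 0.0],
                  [0.0, 3.0, 0.0, 1.0]])
X = np.vstack([rng.normal(loc=m, scale=1.0, size=(50, 4)) for m in means])
y = np.repeat(np.arange(3), 50)

X_lda, W, ratios = lda_projection(X, y)
print(X_lda.shape)  # (150, 2)
```

With the lab's data, the equivalent call would be lda_projection(iris.data, iris.target). With three classes, the between-class scatter matrix has rank at most two, which is why only two eigenvalues are meaningfully non-zero.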