ML-Lab Manual - NEP - DSS
1. Install and set up Python and essential libraries like NumPy and
pandas.
Installing Python:
1. Download Python: Go to the official Python website and download the latest
version suitable for your operating system (Windows, macOS, or Linux).
2. Install Python:
• For Windows: Run the downloaded installer and make sure to check the
box that says "Add Python x.x to PATH" during installation.
• For macOS: Run the downloaded installer package and follow the prompts.
• For Linux: Python might already be installed. If not, use your package
manager to install it (e.g., sudo apt-get install python3 for Ubuntu).
3. Verify Installation:
• Open a command prompt (Windows) or terminal (macOS/Linux).
• Type python --version or python3 --version and press Enter. You
should see the installed Python version.
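Once Python itself runs, the same kind of check can be applied to the libraries named in this exercise. The sketch below assumes NumPy and pandas were installed with pip (for example, pip install numpy pandas); the helper name library_versions is made up for illustration and is not part of any library.

```python
import importlib
import sys

def library_versions(names):
    """Return {library name: version string}, or None for a library that is not installed."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
            # Most libraries expose __version__; fall back to "unknown" if one does not.
            versions[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            versions[name] = None
    return versions

print("Python:", sys.version.split()[0])
for name, version in library_versions(["numpy", "pandas"]).items():
    if version is None:
        print(f"{name}: not installed (try: pip install {name})")
    else:
        print(f"{name}: {version}")
```

If either library is reported as not installed, install it and run the script again.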
SHIVASWAMY D S
ASSISTANT PROFESSOR
DEPARTMENT OF COMPUTER SCIENCE
SHESHADRIPURAM COLLEGE B-20
BCA VI SEM ML-LAB MANUAL (NEP)
Example:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn import metrics
# Load Iris dataset (a popular example dataset in machine learning)
iris = load_iris()
X = iris.data # Features
y = iris.target # Target variable
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
# Initialize the model (K-Nearest Neighbors Classifier in this case)
model = KNeighborsClassifier(n_neighbors=3)
# Train the model
model.fit(X_train, y_train)
# Make predictions
predictions = model.predict(X_test)
# Evaluate model accuracy
accuracy = metrics.accuracy_score(y_test, predictions)
print(f"Accuracy: {accuracy}")
OUTPUT:
Accuracy: 0.9333333333333333
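The accuracy printed above depends on which rows happen to land in the test set, because train_test_split shuffles the data randomly on each run. A minimal sketch of making the split repeatable by passing random_state; the seed value 42 and the helper name knn_accuracy are arbitrary choices for illustration, so the score will not necessarily match the output shown above.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

def knn_accuracy(seed):
    """Train a 3-nearest-neighbours model on one fixed split and return its test accuracy."""
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed  # fixed seed gives the same split every run
    )
    model = KNeighborsClassifier(n_neighbors=3)
    model.fit(X_train, y_train)
    return accuracy_score(y_test, model.predict(X_test))

first = knn_accuracy(seed=42)
second = knn_accuracy(seed=42)
print(f"Run 1: {first}")
print(f"Run 2: {second}")
print("Identical:", first == second)
```

Leaving random_state out, as in the listing above, is fine for a first experiment; fixing it helps when results need to be compared between runs.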
Simple Example
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
# Load an example dataset (iris dataset)
iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target,
test_size=0.3)
# Initialize and train a classifier (K-Nearest Neighbors)
clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X_train, y_train)
# Evaluate the classifier
accuracy = clf.score(X_test, y_test)
print(f"Accuracy: {accuracy}")
OUTPUT:
Accuracy: 0.9777777777777777
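This example calls clf.score, while the earlier one calls metrics.accuracy_score. For a scikit-learn classifier both report the fraction of correctly predicted test samples. The sketch below checks this on one split; the seed 0 is an arbitrary assumption added only so that both measurements use the same data.

```python
from sklearn import datasets
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0  # arbitrary seed
)
clf = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)

# Two ways of measuring accuracy on the same test set
via_score = clf.score(X_test, y_test)
via_metric = accuracy_score(y_test, clf.predict(X_test))

print(f"clf.score:      {via_score}")
print(f"accuracy_score: {via_metric}")
```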
4. Write a program to load and explore datasets from CSV and Excel
files using pandas.
Load and Explore CSV File:
import pandas as pd
# Load CSV file
csv_data = pd.read_csv('train.csv')
# Display the first few rows of the CSV file
print("First few rows of CSV file:")
print(csv_data.head())
# Summary statistics
print("\nSummary statistics of CSV file:")
print(csv_data.describe())
# Information about columns
print("\nInformation about columns in CSV file:")
csv_data.info()  # info() prints its report itself and returns None
Output:
First few rows of CSV file:
PassengerId Survived Pclass \
0 1 0 3
1 2 1 1
2 3 1 3
3 4 1 1
4 5 0 3
Parch Fare
count 891.000000 891.000000
mean 0.381594 32.204208
std 0.806057 49.693429
min 0.000000 0.000000
25% 0.000000 7.910400
50% 0.000000 14.454200
75% 0.000000 31.000000
max 6.000000 512.329200
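The rest of the output in this exercise comes from an Excel workbook, but the code that loads it does not appear in this section. A minimal sketch that mirrors the CSV steps: the file name sales.xlsx and the helper explore_excel are placeholders, not the original names, and reading .xlsx files with pandas assumes the openpyxl package is installed.

```python
import os

import pandas as pd

def explore_excel(path):
    """Load an Excel file and print the same three views used for the CSV file."""
    if not os.path.exists(path):
        print(f"File not found: {path}")
        return None
    excel_data = pd.read_excel(path)  # reads the first sheet by default
    print("First few rows of Excel file:")
    print(excel_data.head())
    print("\nSummary statistics of Excel file:")
    print(excel_data.describe())
    print("\nInformation about columns in Excel file:")
    excel_data.info()  # prints directly and returns None
    return excel_data

excel_data = explore_excel("sales.xlsx")  # placeholder file name
```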
Output:
First few rows of Excel file:
   Row ID        Order ID Order Date  Ship Date     Ship Mode Customer ID  \
0       1  CA-2016-152156 2016-11-08 2016-11-11  Second Class    CG-12520
1       2  CA-2016-152156 2016-11-08 2016-11-11  Second Class    CG-12520
2       3  CA-2016-138688 2016-06-12 2016-06-16  Second Class    DV-13045
Discount Profit
0 0.00 41.9136
1 0.00 219.5820
2 0.00 6.8714
3 0.45 -383.0310
4 0.20 2.5164
[5 rows x 21 columns]
Discount Profit
count 9994.000000 9994.000000
mean 0.156203 28.656896
min 0.000000 -6599.978000
25% 0.000000 1.728750
50% 0.200000 8.666500
75% 0.200000 29.364000
max 0.800000 8399.976000
std 0.206452 234.260108
Output:
df['B_encoded'] = label_encoder.fit_transform(df['B'])
print("\nDataSet after handling Missing Values of B After Label encoding:\n", df['B_encoded'])
one_hot_encoder = OneHotEncoder()
encoded_data = one_hot_encoder.fit_transform(df[['B_encoded']]).toarray()
encoded_df = pd.DataFrame(encoded_data, columns=[f'B_{i}' for i in
range(encoded_data.shape[1])])
df1 = pd.concat([df, encoded_df], axis=1)
print("DataSet after handling Missing Values of B After one_hot_encoder:\n", df1)
Output:
DataSet:
A B C
0 1.0 X 7.0
1 2.0 None 8.0
2 NaN Y 9.0
3 4.0 Z NaN
4 5.0 X 11.0
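The code that fills the missing values is not shown in this section. The later tables show 3.0 in column A, 8.75 in column C and "Unknown" in column B at the positions that are empty here, which is consistent with mean imputation for the numeric columns and a placeholder label for the categorical one. A sketch under that assumption:

```python
import numpy as np
import pandas as pd

# Same values as the DataSet printed above
df = pd.DataFrame({
    "A": [1.0, 2.0, np.nan, 4.0, 5.0],
    "B": ["X", None, "Y", "Z", "X"],
    "C": [7.0, 8.0, 9.0, np.nan, 11.0],
})

# Numeric columns: replace NaN with the column mean (assumed strategy)
df["A"] = df["A"].fillna(df["A"].mean())
df["C"] = df["C"].fillna(df["C"].mean())

# Categorical column: replace the missing entry with a placeholder label
df["B"] = df["B"].fillna("Unknown")

print("DataSet after handling missing values:\n", df)
```

Other strategies, such as the median or dropping incomplete rows with dropna, are equally valid choices; the mean is used here only because it matches the numbers printed later.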
Encoding
....................................................................
DataSet after handling Missing Values of B Before Label encoding:
0 X
1 Unknown
2 Y
3 Z
4 X
Name: B, dtype: object
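Label encoding replaces each category in column B with an integer. A self-contained sketch using scikit-learn's LabelEncoder on the column printed above; the mapping dictionary is added only to make the assignment visible. LabelEncoder numbers the classes in sorted order, so the codes carry no meaning beyond identifying the category.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Column B after the missing value has been replaced with "Unknown"
df = pd.DataFrame({"B": ["X", "Unknown", "Y", "Z", "X"]})

label_encoder = LabelEncoder()
df["B_encoded"] = label_encoder.fit_transform(df["B"])

# classes_ holds the categories in sorted order; the position is the code
mapping = {str(label): code for code, label in enumerate(label_encoder.classes_)}
print("Category to code:", mapping)
print(df)
```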
Feature scaling
....................................................................
Feature Scaling using Standard scaler
A B C B_encoded A_scaled C_scaled
0 1.0 X 7.00 1 -1.414214 -1.322876
1 2.0 Unknown 8.00 0 -0.707107 -0.566947
2 3.0 Y 9.00 2 0.000000 0.188982
3 4.0 Z 8.75 3 0.707107 0.000000
4 5.0 X 11.00 1 1.414214 1.700840
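The scaling code itself is not included in this section. StandardScaler subtracts the column mean from each value and divides by the column's population standard deviation, so every scaled column has mean 0 and standard deviation 1. A sketch that applies it to columns A and C from the table above (other columns are left out for brevity):

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Columns A and C after missing values were filled
df = pd.DataFrame({
    "A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "C": [7.0, 8.0, 9.0, 8.75, 11.0],
})

scaler = StandardScaler()
scaled = scaler.fit_transform(df[["A", "C"]])  # array of shape (5, 2)

df["A_scaled"] = scaled[:, 0]
df["C_scaled"] = scaled[:, 1]
print(df.round(6))
```

For column A the mean is 3 and the population standard deviation is about 1.414214, so the first value becomes (1 - 3) / 1.414214, roughly -1.414214, in line with the table above.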
from sklearn.datasets import load_iris

# Load the Iris dataset (or any other dataset you want to use)
iris = load_iris()
X = iris.data
y = iris.target
Output:
Accuracy: 1.0
Classification Report:
precision recall f1-score support
accuracy 1.00 45
macro avg 1.00 1.00 1.00 45
weighted avg 1.00 1.00 1.00 45
Confusion Matrix:
[[19 0 0]
[ 0 13 0]
[ 0 0 13]]
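The training and evaluation code behind this output is not shown in this section, and neither is the model that was used. The sketch below shows how an accuracy value, a classification report and a confusion matrix of this shape can be produced. It assumes a K-Nearest Neighbors classifier with n_neighbors=3, test_size=0.3 (which gives the 45 test samples seen in the support column) and an arbitrary seed of 42; none of these choices is confirmed by the manual, so the numbers may differ.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42  # assumed split settings
)

model = KNeighborsClassifier(n_neighbors=3)  # assumed model
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred, labels=[0, 1, 2])

print("Accuracy:", accuracy)
print("Classification Report:")
print(classification_report(y_test, y_pred, target_names=iris.target_names))
print("Confusion Matrix:")
print(cm)
```

In the confusion matrix each row is a true class and each column a predicted class, so correct predictions sit on the diagonal.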
Output:
Mean Squared Error (MSE):
431.59967479663896
R-squared:
0.375734632146025
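The regression code for this output is not part of this section, so the sketch below only illustrates how the two metrics are defined. MSE is the mean of the squared differences between actual and predicted values, and R-squared is 1 minus the ratio of the residual sum of squares to the total sum of squares. The four data points and the helper name mse_and_r2 are invented for illustration and are unrelated to the dataset behind the numbers above.

```python
import numpy as np

def mse_and_r2(y_true, y_pred):
    """Compute mean squared error and R-squared from their definitions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    mse = np.mean(residuals ** 2)
    ss_res = np.sum(residuals ** 2)                 # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation around the mean
    r2 = 1.0 - ss_res / ss_tot
    return mse, r2

# Toy values, invented for illustration only
mse, r2 = mse_and_r2([3, 5, 7, 9], [2.5, 5.5, 7.0, 9.5])
print("Mean Squared Error (MSE):", mse)
print("R-squared:", r2)
```

Read this way, the R-squared of about 0.376 reported above means the model accounts for roughly 38% of the variation in the target values on its test data.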