ML Unit 2
QUESTION BANK
1. Data Collection: The first step in any machine learning project is to collect
relevant data. This can be done using a variety of methods, including web
scraping, surveys, and data APIs.
2. Data Preprocessing: Once data has been collected, it must be cleaned and
preprocessed. This involves removing duplicates, filling in missing values, and
transforming the data into a format suitable for machine learning algorithms.
5. Model Training: This involves training the machine learning model on the data to
learn patterns and relationships between the features and the target variable.
6. Model Evaluation: Once the model has been trained, it must be evaluated to
determine its accuracy and performance. This can be done using various metrics
such as accuracy, precision, recall, and F1 score (see the sketch after this list).
7. Model Deployment: After the model has been evaluated, it can be deployed in
production to make predictions on new data.
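To make steps 5 to 7 concrete, here is a minimal, hedged sketch using scikit-learn on a synthetic dataset; the dataset (make_classification) and the choice of LogisticRegression are illustrative assumptions, not part of the original notes.

# Minimal sketch of steps 5-7 (training, evaluation, deployment-style prediction)
# on a synthetic dataset; the data and model choice are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Synthetic data standing in for a collected and preprocessed dataset
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Step 5: Model training
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Step 6: Model evaluation with accuracy, precision, recall and F1 score
y_pred = model.predict(X_test)
print("accuracy :", accuracy_score(y_test, y_pred))
print("precision:", precision_score(y_test, y_pred))
print("recall   :", recall_score(y_test, y_pred))
print("f1 score :", f1_score(y_test, y_pred))

# Step 7: "deployment" here simply means predicting on new, unseen data
print(model.predict(X_test[:3]))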
Almost anything can be turned into DATA. Building a deep understanding of the different
data types is a crucial prerequisite for doing Exploratory Data Analysis (EDA) and
Feature Engineering for Machine Learning models. You also need to convert the data
types of some variables in order to make appropriate choices for visual encodings in
data visualization.
Numerical Data
Numerical data is any data where data points are exact numbers. Statisticians might also
call numerical data quantitative data. This data has meaning as a measurement, such as a
person's height or a count of items. Numerical data can be further divided into
continuous and discrete data: continuous data can assume any value within a range,
whereas discrete data has distinct values.
For example, the number of students taking a Python class would be a discrete data set.
You can only have discrete whole-number values like 10, 25, or 33. A class cannot have
12.75 students enrolled; a student either joins a class or doesn't. On the other hand,
continuous data are numbers that can fall anywhere within a range, like a student's
average score of 88.25, which falls somewhere between 0 and 100.
The takeaway here is that numerical data is not ordered in time. They are just numbers
collected without any particular ordering.
Categorical Data
Categorical data represents characteristics, such as a person's gender, marital status, or
hometown. Categorical data can take numerical values. For example, maybe we would
use 1 for the colour red and 2 for blue. But these numbers don't have a mathematical
meaning; that is, we can't add them together or take the average.
In the context of supervised classification, categorical data would be the class label,
for example whether a piece of broadcast video is a programme or a commercial.
There is also something called ordinal data, which in some sense is a mix of numerical
and categorical data. In ordinal data, the data still falls into categories, but those
categories are ordered or ranked in some particular way. An example would be class
difficulty, such as beginner, intermediate, and advanced. Those three types would be a
way to label the classes, and they have a natural order of increasing difficulty.
Another example would be taking quantitative data and splitting it into groups, so that
each group becomes an ordered category (for instance, binning ages into ranges such as
10-19, 20-29, and so on).
For plotting purposes, ordinal data is treated much in the same way as categorical data.
But groups are usually ordered from lowest to highest so that we can preserve this
ordering.
Time Series Data
Time series data is a sequence of numbers collected at regular intervals over some
period of time. It is very important, especially in particular fields like finance. Time series
data has a temporal value attached to it, so this would be something like a date or a
timestamp.
For example, we might measure the average number of home sales over many years. The
difference between time series data and plain numerical data is that rather than having a
bunch of numerical values with no time ordering, time series data has an implied
ordering: there is a first data point collected and a last data point collected.
Text
Text data is basically just words. A lot of the time, the first thing you do with text is
turn it into numbers using some interesting representation such as the bag-of-words
formulation.
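As a hedged illustration of the bag-of-words idea, the short sketch below uses scikit-learn's CountVectorizer on two made-up sentences; the example text is an assumption.

# Bag-of-words sketch: turn text into word-count vectors with CountVectorizer
from sklearn.feature_extraction.text import CountVectorizer

docs = ["the cat sat on the mat", "the dog sat on the log"]
vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(docs)      # sparse document-term matrix

print(vectorizer.get_feature_names_out())    # vocabulary learned from the text
print(counts.toarray())                      # word counts per document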
However, machine learning skills alone are not sufficient for solving real-world
problems and designing a better product; you also need good exposure to data
structures.
The data structure used for machine learning is quite similar to other software
development fields where it is often used.
Understanding data structures also helps you to build ML models and algorithms in a
much more efficient way than other ML professionals.
In other words, the data structure is the collection of data type 'values' which are stored
and organized in such a way that it allows for efficient access and modification.
The data structure is the ordered sequence of data, and it tells the compiler how a
programmer is using the data such as Integer, String, Boolean, etc.
There are two different types of data structures: Linear and Non-linear data structures.
1. Linear Data structure:
The linear data structure is a special type of data structure that helps to organize and
manage data in a specific order where the elements are attached adjacently.
Array:
An array is one of the most basic and common data structures used in Machine
Learning. It is also used in linear algebra to solve complex mathematical problems. You
will use arrays constantly in machine learning, whether it is storing feature vectors,
converting a DataFrame column into a list during preprocessing, or building the
multi-dimensional matrices used for word embeddings.
An array contains index numbers to represent an element starting from 0. The lowest
index is arr[0] and corresponds to the first element.
Let's take an example of a Python array used in machine learning. Although the Python
array is quite different from arrays in other programming languages, the Python list
is more popular as it offers flexibility in data types and length. If you are using Python
for ML algorithms, it is better to start your journey with arrays.
Method      Description
extend()    Adds the elements of a list to the end of the current list.
index()     Returns the index of the first element with the specified value.
pop()       Removes an element from a specified position using an index number.
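A quick sketch of these list methods in use; the example list and values are made up.

# Illustrating the list methods from the table above
scores = [10, 25, 33]
scores.extend([12, 40])      # extend(): add the elements of another list to the end
print(scores)                # [10, 25, 33, 12, 40]
print(scores.index(25))      # index(): position of the first element equal to 25 -> 1
scores.pop(0)                # pop(): remove the element at index 0
print(scores)                # [25, 33, 12, 40]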
Stacks:
Stacks are based on the concept of LIFO (Last in First out) or FILO (First In Last Out).
Although stacks are easy to learn and implement in ML models, having a good grasp of
them can help in many computer science areas such as parsing grammars.
Stacks enable the undo and redo buttons on your computer, as the most recent action
always sits on top of the stack.
However, we can only check the most recent one that has been added. Addition and
removal occur at the top of the stack.
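A minimal sketch of a stack built on a plain Python list, assuming we model an undo history; the action strings are made up for illustration.

# Stack sketch: append() pushes onto the top, pop() removes from the top (LIFO)
history = []                 # the stack
history.append("typed 'a'")  # push
history.append("typed 'b'")  # push
print(history[-1])           # peek at the most recent action -> "typed 'b'"
print(history.pop())         # undo: remove and return the top element
print(history)               # ["typed 'a'"]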
Linked List:
A linked list is a collection of separately allocated nodes. In other words, it is a
collection of data elements in which each element consists of a value and a pointer
that points to the next node in the list.
In a linked list, insertion and deletion are constant time operations and are very efficient,
but accessing a value is slow and often requires scanning.
So, a linked list is very useful in place of a dynamic array, where inserting in the middle
would otherwise require shifting of elements.
Although insertion of an element can be done at the head, middle or tail position, it is
relatively costly. However, linked lists are easy to splice together and split
apart. Also, the list can be converted to a fixed-length array for fast access.
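A bare-bones sketch of a singly linked list, assuming a minimal Node class defined here for illustration (not a standard library type).

# Singly linked list sketch: each node holds a value and a pointer to the next node
class Node:
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node

head = Node(3, Node(7, Node(9)))      # 3 -> 7 -> 9
head = Node(1, head)                  # O(1) insertion at the head: 1 -> 3 -> 7 -> 9

node = head
while node is not None:               # accessing values requires scanning the list
    print(node.value)
    node = node.next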
Queue:
A queue works on the FIFO (First In First Out) principle: elements are added at the rear
and removed from the front, so items are processed in the order in which they arrive.
Hence, the queue is significant in a program where multiple lists of codes need to be
processed.
The queue data structure can be used to record the split time of a car in F1 racing.
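A small sketch of a queue using Python's collections.deque, assuming made-up lap split times as the data.

# Queue sketch (FIFO): split times are appended at the rear and processed from the front
from collections import deque

splits = deque()
splits.append(78.2)          # enqueue split time of lap 1 (made-up values, seconds)
splits.append(77.9)          # enqueue lap 2
splits.append(78.5)          # enqueue lap 3

while splits:
    print(splits.popleft())  # dequeue from the front: 78.2, 77.9, 78.5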
2. Non-linear Data structure:
As the name suggests, in non-linear data structures, elements are not arranged in any
sequence.
All the elements are arranged and linked with each other in a hierarchical manner, where
one element can be linked with one or more elements.
1) Trees
Binary Tree:
The concept of a binary tree is very similar to a linked list; the only difference lies in
the nodes and their pointers. In a linked list, each node contains a data value with a
pointer that points to the next node in the list, whereas in a binary tree, each node has
two pointers to subsequent nodes instead of just one.
Binary search trees keep their nodes sorted, so insertion and deletion operations can
typically be done with O(log N) time complexity.
Similar to the linked list, a binary tree can also be converted to an array on the basis of
tree sorting.
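A minimal sketch of a binary search tree, assuming a simple TreeNode class and a recursive insert written here for illustration.

# Binary search tree sketch: each node keeps two pointers (left, right), and
# insertion walks down the tree, O(log N) on average when the tree stays balanced
class TreeNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def insert(root, value):
    if root is None:
        return TreeNode(value)
    if value < root.value:
        root.left = insert(root.left, value)
    else:
        root.right = insert(root.right, value)
    return root

root = None
for v in [8, 3, 10, 1, 6]:
    root = insert(root, v)
print(root.value, root.left.value, root.right.value)   # 8 3 10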
2) Graphs
A graph data structure is also very much useful in machine learning for link
prediction.
A graph consists of nodes connected by edges; a graph can be directed (edges are
ordered pairs of nodes) or undirected (edges are unordered pairs).
Hence, you must have good exposure to the graph data structure for machine learning
and deep learning.
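A tiny sketch of a graph stored as an adjacency list (a dict mapping each node to its neighbours), with a naive common-neighbour count as an illustrative link-prediction score; the nodes and edges are made up.

# Graph as an adjacency list; common neighbours as a simple link-prediction score
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}

# Common-neighbour count between B and D
print(len(graph["B"] & graph["D"]))   # 1 (they share neighbour C)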
3) Maps
Maps are a popular data structure in the programming world, mostly useful for reducing
algorithm run time and for searching data quickly.
It stores data in the form of (key, value) pair, where the key must be unique; however,
the value can be duplicated. Each key corresponds to or maps a value; hence it is
named a Map.
In different programming languages, core libraries have built-in maps or, rather,
HashMaps with different names for each implementation.
o In Java: Maps
o In Python: Dictionaries
o C++: hash_map, unordered_map, etc.
Python dictionaries are very useful in machine learning and data science, as various
functions and algorithms return a dictionary as their output. Dictionaries are also widely
used for implementing sparse matrices, which are very common in Machine Learning.
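A hedged sketch of a sparse matrix stored as a Python dictionary keyed by (row, column); the entries and the helper get() are made up for illustration.

# Sparse matrix as a dict: only non-zero entries are stored, keyed by position
sparse = {
    (0, 2): 3.0,
    (1, 0): 1.5,
    (4, 3): 2.2,
}

def get(matrix, row, col):
    """Return the stored value, or 0.0 for positions that were never set."""
    return matrix.get((row, col), 0.0)

print(get(sparse, 0, 2))   # 3.0
print(get(sparse, 2, 2))   # 0.0 (implicitly zero)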
4) Heap
A heap is a hierarchically ordered data structure. The heap data structure is very
similar to a tree, but it imposes a vertical ordering instead of a horizontal one.
Ordering in a heap is applied along the hierarchy but not across it: in a max-heap the
value of the parent node is always greater than or equal to that of its child nodes,
whether on the left or the right side (a min-heap reverses this).
Here, insertion and deletion operations are performed on the basis of promotion: a newly
inserted element is first placed at the bottom of the heap, then compared with its parent
and promoted until it reaches the correct position. Most heap data structures can be
stored in an array along with the relationships between the elements.
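A short sketch using Python's built-in heapq module, which maintains a binary min-heap on a plain list (the mirror image of the max-heap described above); the values are made up.

# heapq keeps a min-heap inside a plain list; the smallest value is always at index 0
import heapq

values = [9, 4, 7, 1, 5]
heapq.heapify(values)        # reorder the list in-place into heap order
heapq.heappush(values, 2)    # new elements are "promoted" towards the root as needed
print(values[0])             # 1: the root/parent always holds the smallest value
print(heapq.heappop(values)) # removing the root re-establishes the heap ordering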
Matrix:
The matrix is one of the most important types of data structure; it is used in linear
algebra to work with 1-D, 2-D, 3-D as well as 4-D arrays for matrix arithmetic.
Working with it requires good exposure to Python libraries such as NumPy for
programming in deep learning.
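A brief NumPy sketch of the matrix arithmetic referred to above; the matrices are made up.

# NumPy arrays for matrix arithmetic
import numpy as np

A = np.array([[1, 2], [3, 4]])      # 2-D array (matrix)
B = np.array([[5, 6], [7, 8]])

print(A + B)                        # element-wise addition
print(A @ B)                        # matrix multiplication
print(A.T)                          # transpose
print(np.zeros((2, 3, 4)).shape)    # higher-dimensional (3-D) array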
Of course, the rule for principles is that there should not be many, but they should be
unbreakable; the principles below are critical to the successful execution of a data
quality project.
Clarity is required at the start that data quality is a business problem and must be
solved by the business.
The IT department cannot and should not be running a data quality project. At the very
start both business and IT need to understand:
The business is responsible for defining the quality of the data needed.
However, the business needs to work in concert with IT to achieve their aims.
A data quality implementation needs to bring together the business and IT professionals
to work together for the benefit of a common goal.
Data sits on IT systems and they are normally the only department with direct access.
Data quality remediation work normally divides into that work done manually, and that
done either via a bulk update or via a data quality application.
If a data quality improvement requires a bulk update of data, then IT are the only ones
well placed to perform the work.
If an application needs installation and configuration then IT are likely the best placed to
do the work.
If the data quality remediation is looking to change process this will also require the buy-
in of IT.
A Data Quality project should be a healthy partnership between the business and IT.
It is often a large endeavour which will draw on resources from all areas of the
organisation.
There is no point cleaning up the data for it to revert to a poor state a few months later.
This is frustratingly common, and why data quality is often seen as an insurmountable
problem.
The reason poor data quality keeps coming back is precisely because organisations,
and data quality projects, fail to think about the problem holistically.
The whole point about quality is that doing things quickly, cheaply and badly ends up
costing you money; any data quality project should keep that as its mantra.
Not understanding the above means you’ll be running another data quality project in a
couple of years, and will have wasted a lot of resources – both time and money - on the
way.
Principle 4: Treat data as an asset
Data should be treated as an asset, but what does this mean for a data quality project?
It means treat every bit of data as if it is a valuable, physical asset.
It has taken time and effort for the customer to tell you their address.
It has taken time and effort for the call centre agent or branch staff member to type it
into the information systems.
This data has then been lovingly preserved for years, religiously backed-up and used
many times for verification.
It has cost money, probably quite a lot of money. Do not discard unless you are certain
it will not be valuable now or in the future.
Take time to update data with care. Look to understand data and why it is in its present
state before deciding on a solution.
Even if you have an obvious error, do not rush to remediate as this data error may be
an example of a process or data failure that will affect many thousands of records, and
the other examples may not be as obvious.
Do not treat your existing data as simply trash that deserves obliteration and
replacement with something shiny and new.
However, poor data quality is a people and process problem with technological
elements, not a technological problem.
What’s more, in order to solve data quality, it is necessary to win hearts and minds of
the organisation. It is necessary to engage with people, not computers.
It is necessary to persuade the executive that data quality is causing them to
lose money on a day to day basis.
It is about training people to recognise poor quality when they see it.
A data quality project needs to understand both its people, and the people in the
organisation, what they are doing with information and how they are doing it. People are
not technology.
They are irrational and cantankerous, and are not always open to change. Their
involvement needs to be nurtured.
After the initial pain of cleaning historical data is complete, the organisation must
embed a good-quality mindset rather than a poor-or-irrelevant-quality one.
At this point the data quality project should disappear. It has now become part of the
organisation.
Whilst some degree of monitoring is necessary to gently steer the process onward, what
is not needed is a large data quality department.
The objective of the approach is to make data quality endemic in the organisation.
Quality management needs to be in place so that data quality issues can be identified
and addressed, but that is all. Data Quality should become just another operational
measurement, and only require a brief look at the dials to make sure they are in the
green.
The objective of a data quality project is neither to boil the ocean, nor to make data
quality perfect.
The objective is to do the minimum possible that allows the organisation to meet its
information needs.
The end state of the data should be described as “good enough”, not “perfect”.
Once this state of affairs is reached, a data quality project is complete and should stop
work and stop spending the organisation's money.
Fundamentally, the approach should be based around minimum necessary work. Work
needs to be undertaken in as effective and efficient a manner as possible, and only
ever done once.
2.1.4 Data Pre-Processing
Dimensionality reduction
Feature subset selection
Data Pre-Processing
Data preprocessing is a process of preparing the raw data and making it suitable for a
machine learning model.
It is the first and crucial step while creating a machine learning model.
When creating a machine learning project, it is not always the case that we come across
clean and formatted data.
And before doing any operation with data, it is mandatory to clean it and put it in a
formatted way; for this, we use the data preprocessing task.
Data preprocessing covers the tasks required for cleaning the data and making it suitable
for a machine learning model, which also increases the accuracy and efficiency of the
machine learning model.
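A minimal pandas sketch of the cleaning steps mentioned above, removing duplicates and filling missing values; the toy DataFrame and the mean-fill strategy are assumptions.

# Basic preprocessing: remove duplicate rows and fill missing values
import pandas as pd
import numpy as np

df = pd.DataFrame({
    "age":    [12, 11, 11, np.nan, 14],
    "height": [1.10, 1.05, 1.05, 1.07, np.nan],
})

df = df.drop_duplicates()                    # remove duplicate rows
df = df.fillna(df.mean(numeric_only=True))   # fill missing values with column means
print(df)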
Dimensionality reduction
The number of input features, variables, or columns present in a given dataset is known
as dimensionality, and the process to reduce these features is called dimensionality
reduction.
A dataset may contain a huge number of input features, which makes the predictive
modeling task more complicated. Because it is very difficult to visualize or make
predictions for a training dataset with a high number of features, dimensionality
reduction techniques are required in such cases.
Dimensionality reduction technique can be defined as, "It is a way of converting the
higher dimensions dataset into lesser dimensions dataset ensuring that it
provides similar information." These techniques are widely used in machine
learning for obtaining a better fit predictive model while solving the classification and
regression problems.
It is commonly used in the fields that deal with high-dimensional data, such as speech
recognition, signal processing, bioinformatics, etc.
It can also be used for data visualization, noise reduction, cluster analysis, etc.
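As a hedged example of one dimensionality reduction technique, the sketch below applies scikit-learn's PCA to a 10-feature synthetic dataset; PCA and the synthetic data are assumptions, since the text above does not name a specific method.

# Dimensionality reduction sketch: project 10 features down to 2 with PCA
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA

X, _ = make_classification(n_samples=200, n_features=10, random_state=0)

pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(X.shape, "->", X_reduced.shape)   # (200, 10) -> (200, 2)
print(pca.explained_variance_ratio_)    # information retained per component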
The Curse of Dimensionality
If the dimensionality of the input dataset increases, any machine learning algorithm and
model becomes more complex.
As the number of features increases, the number of samples needed to cover the feature
space grows rapidly, and the chance of overfitting also increases.
Hence, it is often required to reduce the number of features, which can be done with
dimensionality reduction.
Feature subset selection
Feature Selection is the most critical pre-processing activity in any machine learning
process.
It intends to select a subset of attributes or features that makes the most meaningful
contribution to a machine learning activity.
In order to understand it, let us consider a small example: predicting the weight of
students based on past information about similar students, which is captured in a
'Student Weight' data set.
The data set has four features: Roll Number, Age, Height and Weight. Roll Number
has no effect on the weight of the students, so we eliminate this feature.
This subset of the data set (shown below) is expected to give better results than the
full set.
Age    Height    Weight
12     1.10      23.0
11     1.05      21.6
13     1.20      24.7
11     1.07      21.3
14     1.24      25.2
12     1.12      23.4
A data set may sometimes have hundreds or thousands of dimensions, which is not good
from the machine learning aspect because handling that many features can be a big
challenge for any ML algorithm.
Moreover, a high amount of computational resources and time will be required.
Also, a model built on an extremely high number of features may be very difficult to
understand.
For these reasons, it is necessary to take a subset of the features instead of the
full set.
a. Feature Relevance:
In the case of supervised learning, the input data set (which is the training data set)
has a class label attached.
A model is induced based on the training data set, so that the induced model can
assign class labels to new, unlabeled data.
Some of the input variables may contribute little or no information to the class
prediction; the remaining variables, which make a significant contribution to the
prediction task, are said to be strongly relevant variables.
In the case of unsupervised learning, there is no training data set or labelled data.
Grouping of similar data instances is done, and the similarity of data instances is
evaluated based on the values of different variables.
Certain variables do not contribute any useful information for deciding the similarity or
dissimilarity of data instances.
Hence, those variables make no significant contribution to the grouping process.
These variables are marked as irrelevant variables in the context of the unsupervised
machine learning task.
We can understand the concept by returning to the Student Weight example:
Roll Number doesn't contribute any significant information in predicting what the
weight of a student would be.
Similarly, in the context of grouping students with similar academic merit, the variable
Roll Number is quite irrelevant.
Any feature which is irrelevant in the context of a machine learning task is a candidate
for rejection when we are selecting a subset of features.
b. Feature Redundancy:
A feature may contribute information that is similar to the information contributed by
one or more other features. For example, in the Student data set, the features Age and
Height contribute similar information, because as age increases height generally
increases too. In other words, irrespective of whether the feature Height is present or
not, the learning model will give almost the same results.
In this kind of situation where one feature is similar to another feature, the feature is
said to be potentially redundant in the context of a machine learning problem.
All features having potential redundancy are candidates for rejection in the final
feature subset.
Only a few representative features out of a set of potentially redundant features are
considered for being a part of the final feature subset.
So in short, the main objective of feature selection is to remove all features which are
irrelevant and take a representative subset of the features which are potentially
redundant.
This leads to a meaningful feature subset in the context of a specific learning task.
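A small pandas sketch, under stated assumptions, of both ideas: dropping the irrelevant Roll Number feature and one of a pair of highly correlated (potentially redundant) features from the toy Student Weight data. The roll numbers are made up; the other values come from the table above.

# Feature subset selection sketch: drop an irrelevant feature, then one of a
# pair of highly correlated (potentially redundant) features
import pandas as pd

students = pd.DataFrame({
    "roll_no": [1, 2, 3, 4, 5, 6],                     # made-up roll numbers
    "age":     [12, 11, 13, 11, 14, 12],
    "height":  [1.10, 1.05, 1.20, 1.07, 1.24, 1.12],
    "weight":  [23.0, 21.6, 24.7, 21.3, 25.2, 23.4],   # target variable
})

features = students.drop(columns=["roll_no", "weight"])  # roll_no is irrelevant
corr = features.corr()
print(corr.loc["age", "height"])    # ~0.98: Age and Height are potentially redundant

# Keep one representative of the redundant pair as the final feature subset
final_features = features.drop(columns=["height"])
print(list(final_features.columns))  # ['age']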