DIP Lab 13 DBSCAN Clustering

Department of Electrical Engineering
Faculty Member: Date:
Semester:
Digital Image Processing

Lab 13: DBSCAN
Name    Reg. No    Lab Report (10 Marks)    Quiz/Viva (5 Marks)
Lab#13: DBSCAN
Objectives
This laboratory exercise focuses on DBSCAN clustering, a widely used
unsupervised learning technique. Clustering is applied to unlabeled data to look
for interesting groups and patterns.
Lab Instructions
 This lab activity comprises the following parts: Lab Exercises and a Post-Lab
Viva/Quiz session.
 The lab report shall be uploaded to LMS.
 Only those tasks that are completed during the allocated lab time will be credited
to the students. Students are, however, encouraged to practice on their own in their
spare time to enhance their skills.
Lab Report Instructions
All questions should be answered precisely to get maximum credit. The lab report
must include the following items:
 Lab objectives
 Python codes
 Results (graphs/tables) duly commented and discussed
 Conclusion
DBSCAN (Density-Based Spatial Clustering of Applications with Noise)
DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a
popular clustering algorithm in machine learning and data mining. It is
particularly useful for identifying clusters of data points in a dataset based on the
density of points in the feature space. Unlike k-means or hierarchical clustering,
DBSCAN doesn't require specifying the number of clusters beforehand and can
discover clusters of arbitrary shapes.
Here is the basic algorithm for DBSCAN:
1. Input:
 Dataset: D = {x1, x2, ..., xn}, where xi is a data point in the feature space.
 Parameters:
    Epsilon (ε): The maximum distance between two points for one to
     be considered as in the neighborhood of the other.
    MinPts: The minimum number of points that an ε-neighborhood must
     contain (counting the point itself) to form a dense region.
Exploring two concepts known as Density Reachability and Density Connectivity
helps in understanding these parameters.
Density Reachability defines a point q as directly reachable from another point p
if q is within a specific distance (epsilon) of p and p is dense, that is, p has at
least MinPts points in its ε-neighborhood.
Density Connectivity, on the other hand, employs a transitivity-based chaining
approach to ascertain if points belong to the same cluster. For instance, points p
and q may be connected if p->r->s->t->q, where a->b signifies that b is in the
ε-neighborhood of a.
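The two definitions above can be made concrete with a small NumPy sketch. The helper names `region_query` and `directly_reachable` and the toy 1-D data are assumptions made for illustration; they do not come from the lab handout.

```python
import numpy as np

def region_query(X, i, eps):
    """Return indices of all points within distance eps of point i (including i)."""
    dists = np.linalg.norm(X - X[i], axis=1)
    return np.flatnonzero(dists <= eps)

def directly_reachable(X, p, q, eps, min_pts):
    """True if q is in the eps-neighborhood of p and p is dense enough."""
    neighbors = region_query(X, p, eps)
    return len(neighbors) >= min_pts and q in neighbors

# Toy 1-D data: four points spaced 1.0 apart, then one isolated point.
X = np.array([[0.0], [1.0], [2.0], [3.0], [10.0]])
eps, min_pts = 1.0, 2

neighbors_of_0 = region_query(X, 0, eps)
# Chain 0 -> 1 -> 2 -> 3: each step is a direct reachability relation.
chain_ok = all(directly_reachable(X, a, a + 1, eps, min_pts) for a in range(3))
# The isolated point (index 4) is not in the neighborhood of point 3.
isolated_ok = directly_reachable(X, 3, 4, eps, min_pts)

print(neighbors_of_0, chain_ok, isolated_ok)
```

Points 0 and 3 are farther apart than ε, yet the chain of direct steps links them, which is the chaining idea behind p->r->s->t->q.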
2. Algorithm:
 For each data point p in the dataset D:
    If p has not been visited:
       Mark p as visited.
       Find all points in the ε-neighborhood of p (including p).
       If the number of points in the neighborhood is less than
        MinPts, mark p as noise (it may later join a cluster as a border point).
       Otherwise, create a new cluster and add p to the cluster.
       Expand the cluster by adding all points in the ε-neighborhood of p and,
        for each added point that itself has at least MinPts neighbors, the
        points in its ε-neighborhood as well.
3. Output:
 The algorithm identifies clusters of data points and marks some points
as noise if they don't belong to any cluster.
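The steps above can be sketched in plain NumPy as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `dbscan`, the label constants, and the toy dataset are all hypothetical, and neighborhoods are found by brute force with Euclidean distance.

```python
import numpy as np

NOISE = -1
UNVISITED = -2

def dbscan(X, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    n = len(X)
    labels = np.full(n, UNVISITED)
    cluster_id = -1
    for p in range(n):
        if labels[p] != UNVISITED:
            continue
        neighbors = np.flatnonzero(np.linalg.norm(X - X[p], axis=1) <= eps)
        if len(neighbors) < min_pts:
            labels[p] = NOISE           # may be relabeled as a border point later
            continue
        cluster_id += 1                 # p is a core point: start a new cluster
        labels[p] = cluster_id
        queue = list(neighbors)
        while queue:                    # expand the cluster
            q = queue.pop()
            if labels[q] == NOISE:
                labels[q] = cluster_id  # former noise becomes a border point
            if labels[q] != UNVISITED:
                continue
            labels[q] = cluster_id
            q_neighbors = np.flatnonzero(np.linalg.norm(X - X[q], axis=1) <= eps)
            if len(q_neighbors) >= min_pts:
                queue.extend(q_neighbors)  # q is also a core point: keep growing
    return labels

# Two tight squares of points plus one far-away outlier.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [10, 10], [10, 11], [11, 10], [11, 11],
              [50, 50]], dtype=float)
labels = dbscan(X, eps=1.5, min_pts=3)
print(labels)
```

The brute-force neighbor search makes this quadratic in the number of points; library implementations typically use a spatial index instead.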
In the algorithm, a point q is considered to be in the ε-neighborhood of p if the
distance between p and q is less than or equal to ε. The algorithm classifies points
into three categories:
 Core points: Points with at least MinPts points in their ε-neighborhood.
 Border points: Points that have fewer than MinPts points in their ε-neighborhood
but are reachable from a core point.
 Noise points: Points that are neither core nor border points.
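As a rough illustration of the three categories, the sketch below uses scikit-learn's `DBSCAN`, whose `min_samples` parameter plays the role of MinPts (it counts the point itself) and whose `core_sample_indices_` and `labels_` attributes expose core points and noise (label -1). The six-point dataset and the parameter values are assumptions chosen so that each category appears.

```python
import numpy as np
from sklearn.cluster import DBSCAN

X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
              [2.2, 1.0],   # within eps of only one core point
              [8.0, 8.0]])  # far from everything
model = DBSCAN(eps=1.5, min_samples=4).fit(X)

is_core = np.zeros(len(X), dtype=bool)
is_core[model.core_sample_indices_] = True
is_noise = model.labels_ == -1
is_border = ~is_core & ~is_noise   # in a cluster, but not dense enough itself

print("core:  ", np.flatnonzero(is_core))
print("border:", np.flatnonzero(is_border))
print("noise: ", np.flatnonzero(is_noise))
```

The fifth point has only two points in its ε-neighborhood, so it is not a core point, but it lies within ε of the core point at (1, 1) and therefore joins that cluster as a border point.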
Figure 1: Credit: https://www.theaidream.com/post/dbscan-clustering-algorithm-in-machine-learning
DBSCAN has advantages such as being robust to outliers and capable of
discovering clusters of arbitrary shapes. However, it may struggle with datasets
of varying densities, and choosing appropriate values for ε and MinPts can be
challenging.
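One commonly used heuristic for picking ε, not described in this handout, is the k-distance plot: for a chosen MinPts, compute each point's distance to its MinPts-th nearest neighbor, sort these distances, and look for the "elbow" where they start to rise sharply. The sketch below computes the sorted distances with scikit-learn's `NearestNeighbors`; the synthetic data and the value `min_pts = 5` are assumptions for illustration.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Synthetic stand-in data: two Gaussian blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, size=(100, 2)),
               rng.normal(5.0, 0.3, size=(100, 2))])

min_pts = 5
# Querying with the training data returns each point as its own nearest
# neighbor (distance 0), so the last column is the distance to the
# min_pts-th point of the neighborhood when the point itself is counted.
nn = NearestNeighbors(n_neighbors=min_pts).fit(X)
distances, _ = nn.kneighbors(X)
k_distances = np.sort(distances[:, -1])

print(k_distances[:5], k_distances[-5:])
```

Plotting `k_distances` against its index (for example with `matplotlib.pyplot.plot`) and reading ε off the elbow gives a starting value only; it should still be checked against the resulting clusters.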
Figure 2: Credits: https://github.com/NSHipster/DBSCAN
Lab Task – Your Own Dataset

Download your own CSV dataset from the internet (e.g. Kaggle). Perform
DBSCAN clustering on your dataset and showcase the plots.
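A possible starting point for this task is sketched below. Everything specific in it is an assumption: the in-memory CSV stands in for a downloaded file, the column names are made up, and `eps=0.5` and `min_samples=3` suit only this toy data. Features are standardized first because DBSCAN's distance threshold is sensitive to feature scale.

```python
import io
import numpy as np
import pandas as pd
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

# Stand-in for a downloaded file; in practice use pd.read_csv("your_dataset.csv").
points = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (1.2, 1.1),
          (8.0, 8.0), (8.1, 7.9), (7.9, 8.2), (8.2, 8.1),
          (4.5, 20.0)]
csv_text = "x,y,name\n" + "\n".join(
    f"{x},{y},p{i}" for i, (x, y) in enumerate(points))
df = pd.read_csv(io.StringIO(csv_text))

# Keep numeric columns only, drop rows with missing values, then standardize.
features = df.select_dtypes(include="number").dropna()
X = StandardScaler().fit_transform(features)

labels = DBSCAN(eps=0.5, min_samples=3).fit_predict(X)
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
print("clusters:", n_clusters, "noise points:", int(np.sum(labels == -1)))

def plot_clusters(X, labels):
    """Scatter plot of the first two features, colored by cluster label."""
    import matplotlib.pyplot as plt
    plt.scatter(X[:, 0], X[:, 1], c=labels, cmap="tab10")
    plt.xlabel("feature 1 (standardized)")
    plt.ylabel("feature 2 (standardized)")
    plt.title("DBSCAN clusters (label -1 = noise)")
    plt.show()
```

Calling `plot_clusters(X, labels)` displays the result; for datasets with more than two features, only the first two columns are shown.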
Lab Task 6 – Take Home (optional)
Download your own CSV dataset from the internet, e.g. heatmap. Perform
hierarchical clustering on your dataset and showcase the plots.
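For the take-home task, one possible approach is agglomerative clustering with SciPy, sketched below. The choice of Ward linkage, the cut into two clusters, and the six-point toy array are assumptions for illustration; a real dataset would be loaded from CSV and scaled first.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy stand-in data: two well-separated groups of three points.
X = np.array([[1.0, 1.0], [1.1, 0.9], [0.9, 1.2],
              [8.0, 8.0], [8.1, 7.9], [7.9, 8.2]])

# Build the merge tree; each row of Z records one merge step.
Z = linkage(X, method="ward")

# Cut the tree so that at most two flat clusters remain (labels start at 1).
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)
```

Passing `Z` to `scipy.cluster.hierarchy.dendrogram` draws the merge tree, which is the usual plot to showcase for hierarchical clustering.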
