Exploratory Analysis and Geolocation of Data To Help Student Find Housing Facilities
ISSN No:-2456-2165
Abstract:- Geolocation of data is an important topic in academic research. Because they can link databases and display geographic data, geographic information systems have become fundamental tools in many disciplines. This project aims to help students who relocate from various places for study purposes. The app uses geolocation of data to suggest the best location for a student's stay, taking many factors into account.

I. INTRODUCTION

Geolocation technology collects real-time information about various people and clusters to find the best location. This information is typically used for location monitoring as well as for analysis of location. From an operational point of view, geolocation simply identifies the type of data and uses it for clustering to find the ideal location [13]. Students who are new to a city traditionally face many problems while relocating for college or for a job. We have seen this in person when some of our friends relocated and faced these problems. This geolocation approach will help students familiarize themselves with the area and find the best roommates for a specific period of time. Finding an ideal place to stay during one's education is important, because not everything a student needs may be available nearby. We use K-means clustering to make this search easier. Students often do not have all the necessities around them, so we consider all the relevant factors and try to provide them with the best results.

II. LITERATURE SURVEY

To better understand the project, we studied different papers, videos and websites. Some of the papers studied are listed below, along with their findings.

A. Related Work
One paper implements vector analysis on geological data to determine the exact position and better placement of that data. The vector analysis approach works when only three points are available; when more measurements are available, it is desirable to use all of them at once [1].
To show how useful statistical analysis can be for geolocation of data, another paper presents a technique for evaluating the similarity between various variables [2].
A further paper combines the largest minimum distance algorithm with the traditional K-Means algorithm to propose an improved K-Means clustering algorithm. The improved K-Means not only keeps the high efficiency of standard K-Means but also effectively raises the speed of convergence by improving the way the initial cluster centers are selected [3].
Another study describes the behavior of the K-means algorithm and attempts to overcome its limitations with a proposed algorithm for performing K-means clustering. Its experimental results demonstrated that the proposed scheme improves on the direct K-means algorithm [4].

B. Survey of Existing Applications
NoBroker [14]
Magicbricks [15]
99acres [16]
Flatchat [17]
Nestaway [18]

All these applications simply provide housing using minimal features that are of no interest to students.

C. Components In The System

Hardware
The hardware used is a computer system running Windows 10 or above, with 4 GB of RAM or more and a CPU of 1.8 GHz or faster.

Software
Software requirements are Android Studio, XML, an Android phone or emulator, Google Colab, and Folium.

Libraries Used
Pandas
Seaborn
Folium
Numpy
Matplotlib
Geopy
Sklearn
Scipy
III. METHODOLOGY
Data visualisation
Data visualization is the graphical representation of
information and data. By using visual elements like charts,
graphs, and maps, data visualization tools provide an
accessible way to see and understand trends, outliers, and
patterns in data. Additionally, it provides an excellent way
for employees or business owners to present data to
non-technical audiences without confusion. Data visualisation
is helpful here because it lets us inspect the data
beforehand and observe its trends [20].
Fig 2 Boxplot
Pairplot
A pairplot plots the pairwise relationships in a dataset. The pairplot function creates a grid of Axes such that each variable in the data is shared on the y-axis across a single row and on the x-axis across a single column. This creates plots as shown below [21].

Implement K-means clustering on data
K-Means is an unsupervised learning process, called clustering, that divides the unlabelled dataset into clusters. Here, K specifies how many pre-defined clusters must be produced as part of the process. For example, if K=2, there will be two clusters; if K=3, there will be three clusters; and so on. It provides a straightforward method for categorising the groups in the unlabelled dataset on its own, without the requirement for any training. Each cluster has a centroid assigned to it because the algorithm is centroid-based. The algorithm's primary goal is to minimise the sum of distances between each data point and the centroid of its cluster. K-means clustering is used in this project. Here we set the maximum value of K to 10; K can take any value up to that limit.
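The K-means step described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the coordinates are synthetic, the variable names are hypothetical, and the final choice of K = 3 is an assumption made only for the example. Note that scikit-learn's `inertia_` is the sum of squared distances from each point to its nearest centroid.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical data: synthetic (latitude, longitude) points around three centres.
rng = np.random.default_rng(0)
centres = np.array([[19.07, 72.87], [19.20, 72.97], [18.95, 72.82]])
points = np.vstack([c + rng.normal(scale=0.01, size=(50, 2)) for c in centres])

MAX_K = 10  # maximum value of K, as stated in the text

# Fit K-means for K = 1..MAX_K and record the within-cluster sum of
# squared distances, so the values can be compared across K.
inertias = []
for k in range(1, MAX_K + 1):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(points)
    inertias.append(model.inertia_)

# Fit a final model with an assumed K = 3 (illustrative choice only).
final = KMeans(n_clusters=3, n_init=10, random_state=0).fit(points)
labels = final.labels_              # cluster index assigned to each point
centroids = final.cluster_centers_  # one (latitude, longitude) centroid per cluster
```

Comparing the recorded `inertias` across K is one common way to pick a value within the allowed range; the source does not state which criterion was used.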
Fig 5 Plotted Map
Data cleaning
Data cleaning is the process of extracting the necessary data columns from the dataset. Cleaning the data is very important: clean data ultimately increases overall productivity and allows for the highest-quality information in decision-making. Its benefits include the removal of errors when multiple sources of data are at play, and fewer errors make for happier clients and less-frustrated employees. The actual dataset was extremely vast and contained data that was not required. In the data cleaning process we therefore deleted the unwanted columns and made the dataset ready for data visualisation. Because not all factors in such a big dataset are necessary for us, we keep only the necessary data columns so that more accurate and precise data goes into the later parts of the process.
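The column removal described in the data cleaning step can be sketched with Pandas as below. The dataset and every column name here are hypothetical, chosen only to illustrate dropping columns that are not needed for the later steps.

```python
import pandas as pd

# Hypothetical raw dataset; all column names are illustrative assumptions.
raw = pd.DataFrame({
    "name": ["Hostel A", "Flat B", "PG C"],
    "latitude": [19.07, 19.20, 18.95],
    "longitude": [72.87, 72.97, 72.82],
    "rent": [8000, 12000, 9500],
    "listing_id": [101, 102, 103],
    "agent_notes": ["", "call after 6", ""],
})

# Columns assumed to be irrelevant to the analysis.
unwanted = ["listing_id", "agent_notes"]

# Drop the unwanted columns; the original frame is left unchanged.
clean = raw.drop(columns=unwanted)
```

Here `clean` keeps only the name, coordinates, and rent, which is the kind of reduced table that would then be passed to visualisation and clustering.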
Login Page