Predicting The Admissions of Students in Masters Program Using Machine Learning
BACHELOR OF ENGINEERING
IN
B.E- CSE (HONS.)- IOT
Submitted by:
Mukul Sidhu(17BCS4623)
DECLARATION
Date: 13/04/2021
Place:
CHANDIGARH UNIVERSITY
CERTIFICATE
This is to certify that the work embodied in this dissertation entitled ‘Predicting the
admissions of Students in masters program using Machine Learning’, being
submitted by DIVYAM SHARMA, ABHISHEK GARG, MUKUL SIDHU,
RAHUL NARWAL, ROHIT NARWAL in partial fulfilment of the requirements
for the award of Bachelor of Engineering in CSE (HONS.)-IOT discipline to Apex
Institute of Technology, Chandigarh University, Punjab during the academic year
2021, is a record of a bona fide piece of work undertaken by them under the supervision
of the undersigned.
Signature of Supervisor
Forwarded by
EXTERNAL EXAMINER
ACKNOWLEDGEMENT
I have put considerable effort into this project. However, it would not have been possible without the
kind support and help of many individuals and organizations. I would like to extend my sincere
thanks to all of them.
I am highly indebted to Dr. Krisnendu Rarhi for their guidance and constant supervision, as
well as for providing necessary information regarding the project and for their support in
completing the project.
I would like to express my gratitude towards my parents and the members of Chandigarh University
for their kind co-operation and encouragement, which helped me in completing this project.
I would like to express my special gratitude and thanks to the industry persons who gave me their
attention and time.
My thanks and appreciation also go to my teammates in developing the project and to the people
who have willingly helped me out with their abilities.
ABSTRACT
Nowadays many students pursue their studies away from their home countries. The main destination for
these foreign students is the United States of America, and most foreign students in the United States
come from India and China. Over the past decade the number of Indian students pursuing graduate
studies in the USA has increased rapidly. With the increasing number of foreign students studying in the
USA, each applicant faces tough competition to get into their dream university. Because students often
do not know much about the procedures, requirements and details of USA universities, they seek the
help of educational consulting firms to secure entry into universities that suit their profile, which
requires them to invest a considerable amount in consultation fees. In addition to these educational
consulting firms, there are a few websites and blogs that guide students through admission procedures.
The drawback of these existing resources is that they are very limited and not truly reliable in terms of
accuracy. The purpose of this study is to develop a system that uses machine learning algorithms, which
we will call the Student Admission Predictor (SAP). It will help students estimate the chances of their
university applications being accepted. It will also help them identify the universities most relevant to
their profile and provide them with details of those universities. A simple user interface will be created
so that users can access the SAP system.
Table of Contents
Title Page 1
Acknowledgement 4
Abstract 5
Introduction 7
Research Methodology 8
Implementation 10
Evaluation 13
Discussion 14
Conclusion 15
References 16
INTRODUCTION
In the fast-growing world of computing, technologies like machine learning and AI are bringing new
changes from the field of computer science to every domain of industry. Within this broad scope, we use
machine learning techniques to predict the admission of students to higher studies (master’s program)
from their overall performance, scores and various other factors, and we critically analyze the
performance and outcome of the predicted model using a variety of machine learning algorithms.
A fast and intelligent way to make this prediction is much needed in order to get an analytical view of
admissions to higher studies (more generally, to programs abroad), and machine learning provides
robust algorithms that are capable of handling this problem in an efficient manner.
The agility of machine learning towards the defined problem and its uncertain execution has increased.
Admission to a program is an uncertain hypothesis, and the generated results might tend towards true
negatives. Modern machine learning algorithms may offer a promising direction for the given problem.
The prediction of a student's admission depends on various factors, which are called features in
machine learning terms, and the classes (either 0 or 1) specify whether a particular student gets
admission or not.
RESEARCH METHODOLOGY
Machine learning algorithms tend to perform better on a well-defined set of problems, depending on the
size of the dataset provided. For the proof of concept of this project we use data in CSV form that was
obtained internally. Various algorithms can be applied in this project to predict the required outcome,
with the aim of obtaining results that are more general and can be analyzed properly. We also use the
LR algorithm as the base case for a naïve approach.
The methods used in this project are generally defined for classification problems, but are not limited to
that particular use case:
• This is a binary classification problem. The output has only two possibilities: either Yes
(1) or No (0).
• Support Vector Machine (SVM): We use this algorithm on the given problem. In the SVM
algorithm, we plot each data item as a point in n-dimensional space (where n is the number of
features) with the value of each feature being the value of a particular coordinate.
• Linear Discriminant Analysis (LDA): It is used as a dimensionality reduction technique in the
pre-processing step for pattern-classification and machine learning applications. In this project we used it as a
(COAP). Various features are mandatory apart from the score. The defined hypothesis set is:
1) Gate Score
2) CGPA/SGPA
3) College Rating
5) Achievements
7) SOP
These are the hypotheses defined for processing admissions to the master’s program.
There are various other factors, but they do not affect the underlying project.
All of the hypotheses play an important role and are provided in the given dataset. We are using
our university's student records (primary data) for specifying admissions in higher studies (MS
program). The dataset consists of 250*7 entries excluding 1 column of the feature set. The datasets
are used for verification and processing of the project, along with a test set of 80*6 entries
excluding the prediction column.
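The SVM and LDA classifiers named in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual code: it uses scikit-learn, randomly generated stand-in data instead of the university records, hypothetical column names for the five features listed above, and an invented rule for labelling a row as admitted (1) or not (0).

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical column names for the listed features (illustrative only).
FEATURES = ["gate_score", "cgpa", "college_rating", "achievements", "sop"]


def make_synthetic_data(n_rows=250, seed=0):
    """Generate random stand-in data; NOT the records used in the report."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_rows, len(FEATURES)))
    # Assumed labelling rule for the demo: admit when a weighted score is positive.
    weights = np.array([1.0, 0.8, 0.5, 0.3, 0.3])
    y = (X @ weights + rng.normal(scale=0.1, size=n_rows) > 0).astype(int)
    return X, y


X, y = make_synthetic_data()

# SVM: each row is a point in n-dimensional space (n = number of features),
# and a separating hyperplane is fitted between the two classes.
svm_clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
svm_clf.fit(X, y)

# LDA: usable both as a classifier and as a dimensionality reducer.
lda_clf = LinearDiscriminantAnalysis()
lda_clf.fit(X, y)
X_reduced = lda_clf.transform(X)  # two classes give at most one discriminant axis

svm_pred = svm_clf.predict(X)  # array of 0/1 labels
lda_pred = lda_clf.predict(X)
```

With two classes, LDA can project the data onto a single axis at most, which is why `X_reduced` has one column regardless of the number of input features.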
In this project, we use state-of-the-art machine learning algorithms to classify the feature set
together with the dependent variable (entity). Various validations are applied to further improve the results.
IMPLEMENTATION
The project is implemented using the main data science libraries (NumPy, pandas and
Matplotlib) for processing data-frames and collections, cleaning, and further preprocessing. The complete
module is implemented in Jupyter using Python 3.7. The steps of the code include:
3- Manipulate the data-frame to obtain a unique description of the variables in the data.
4- Visualize the entries of the dataset using different charts and plots.
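As a minimal sketch of this kind of data-frame inspection, the example below uses pandas on a small inline CSV. The column names, the values and the median-fill cleaning choice are assumptions made for illustration; the report's actual file and cleaning procedure are not shown. Plotting is left out to keep the example short.

```python
import io

import pandas as pd

# Tiny inline CSV standing in for the real file; column names are hypothetical.
csv_text = """gate_score,cgpa,college_rating,admitted
720,8.9,1,1
650,,2,0
,7.4,3,0
810,9.1,1,1
"""
df = pd.read_csv(io.StringIO(csv_text))

null_counts = df.isnull().sum()        # null entries per column
total_nulls = int(null_counts.sum())   # overall null count
dtypes = df.dtypes                     # int / float column types
summary = df.describe()                # count, mean, std, min, quartiles, max

# One simple cleaning choice (an assumption): fill gaps with the column median.
df_clean = df.fillna(df.median(numeric_only=True))
```

Calling `df.info()` in a notebook prints the same null counts and data types in a single overview.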
The dataset has a total of 110 null entries, and its columns are of int and float type.
Fig: Description of the complete data-frame for prediction
• Unified training approach: same loss (MSE), with a linear regression model as the base model.
• The data is split into a train set and a test set in an 80:20 ratio, and with cross-fold validation the
model achieves accuracy above 98%.
• No null entries remain in the dataset, apparently owing to the cleaning methods of the pandas library.
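The training setup in the bullets above can be sketched as follows. This is a hedged illustration using scikit-learn on randomly generated stand-in data, so it does not reproduce the reported 98% figure. The labelling rule, the 0.5 cut-off used to turn the regression output into a Yes/No class, and the choice of 5 folds are all assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import accuracy_score, mean_squared_error
from sklearn.model_selection import cross_val_score, train_test_split

rng = np.random.default_rng(42)
X = rng.normal(size=(250, 5))                    # synthetic stand-in features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)    # assumed 0/1 admission label

# 80:20 train/test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Linear regression base model, fitted by least squares (minimizing MSE).
model = LinearRegression().fit(X_train, y_train)
scores = model.predict(X_test)
mse = mean_squared_error(y_test, scores)

# Assumption: a 0.5 cut-off turns the continuous output into a Yes/No class.
y_pred = (scores >= 0.5).astype(int)
accuracy = accuracy_score(y_test, y_pred)

# 5-fold cross-validation on the training split. For a regressor the default
# score is R^2, not classification accuracy.
cv_r2 = cross_val_score(LinearRegression(), X_train, y_train, cv=5)
```

Thresholding a regression output is only one way to obtain a class label; a classifier such as SVM or LDA predicts the class directly.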
Fig: d- plot of gate score vs college tier
The plot shows that GATE score and CGPA play a major role in deciding admissions.
Apart from that, GATE score is directly related to college tier: Tier-1 colleges have better
scores than Tier-2, and so on. A few scattered points deviate from this trend at random.
The letter of recommendation (LOR) also plays a key role in MS admission where
research-oriented work is involved. For research work, the LOR is a key factor along with the GATE score.
The linear regression model generates a prediction score of almost 100% after fine-tuning
some hyper-parameters and removing the few outliers. The hypothesis generated for the
dependent variable from the feature set tends to an actual conversion score of
(t-test > 0.5) on the data.
EVALUATION
The project is evaluated following state-of-the-art practice for machine learning algorithms. With the
linear regression model, the score on the data approaches >98%, with an approximate validation loss of ±2%.
The LR model gives the best prediction score, and fine-tuning some parameters and using an ensemble
model yields the best predicted score. The actual and predicted results show practically no difference.
Apparently, using the SVM model does not affect the score much, with a classification accuracy of 98%.
Both models perform at their best, and the highest accuracy is obtained after fine-tuning some
parameters. The loss function and confusion matrix are not presented here.
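As a hedged illustration of how a classification accuracy and a confusion matrix could be computed for an SVM model, the sketch below uses scikit-learn on randomly generated stand-in data. The data, the labelling rule and the variable names are assumptions made for the example and do not reproduce the report's results.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(7)
X = rng.normal(size=(250, 5))                # synthetic stand-in features
y = (X[:, 0] - X[:, 2] > 0).astype(int)      # assumed 0/1 admission label

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=7, stratify=y
)

clf = SVC(kernel="linear").fit(X_train, y_train)
y_pred = clf.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)

# Rows are actual classes, columns are predicted classes, in the order [0, 1].
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
tn, fp, fn, tp = cm.ravel()
```

Here `tn`, `fp`, `fn` and `tp` are the counts of true negatives, false positives, false negatives and true positives, so the accuracy equals `(tn + tp)` divided by the number of test rows.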
DISCUSSION
This project examines the admission prediction rate of a student in higher studies by analyzing
various machine learning algorithms (SVM, LDA) and measuring their performance on the selected dataset
(various student factors, including marks in competitive exams, class scores, projects and many more).
Given the defined aim, this is essentially a classification problem. Using machine learning, we
predict the result set and critically analyze the outcome. Machine learning techniques help to predict
results so that students get probabilistic insights into their admissions and an idea of the major
features that are the deciding factors for admission.
Using the Linear Regression model, the best accuracy was attained for predicting scores based on
various features for admission to the higher program. ML-based approaches work best when the
data is within its desired limits. ML algorithms tend to be better optimized for real problems.
CONCLUSION
Machine learning algorithms work best with well-defined datasets. In predicting admissions for higher
studies, state-of-the-art machine learning algorithms work well and predict the result with good
accuracy. The built-in support of scikit-learn makes it possible to build models that predict, analyze
and define the result scores.
• The LR model performs at its best and predicts with an accuracy score > 98%.
• The state-of-the-art approach works best with the desired datasets.
• Data cleaning and pre-processing are also a defining factor in model tuning and in achieving a
higher prediction percentage.
• The machine learning model is defined on the given sets, depending on the feature set and the
class to predict.
REFERENCES
H.Brodersen
[5] A Machine Learning Approach for Graduate Admission Prediction. Hanan AlGhamdi, Amal. IVSP
'20: Proceedings of the 2020 2nd International Conference on Image, Video and Signal Processing,
March 2020