WIA1006 Report (OrionX)
ASSIGNMENT REPORT
TABLE OF CONTENTS
REPORT EVALUATION 1
4.0 Methodology
REPORT EVALUATION 2
REPORT EVALUATION 1
To implement the logistic regression model, we obtained our dataset from Kaggle. Because datasets with the specific features we needed are limited, we expect our model to achieve an accuracy of 85% on the training set and 80% on both the validation and test sets. In brief, our depression indicator model's task (T) is to predict the depression level of students, its experience (E) is the dataset with the relevant features it learns from, and its performance (P) is the accuracy with which it predicts the students' depression level.
4.0 Methodology
In our project, we employed a variety of techniques to complete the given tasks. All of the strategies listed below were acknowledged and debated among the members of our group to ensure the best possible outcome.
Firstly, we obtained our dataset from Kaggle, a well-known platform that allows data scientists and machine learning practitioners to explore and publish datasets. We did not opt for manual data collection since we wanted over 800 records for a more accurate model; it would be impossible to gather that much information in a short time through surveys of our friends and families alone.
Next, we cleaned the data and filtered out some features that we felt were unnecessary; all of the features we kept are backed up by evidence. Details of how and what we used to select the features are discussed later in the report. The data obtained from Kaggle was qualitative, so we transformed the categorical data into numbers, since machine learning models can only be trained on numeric data. Our aim was to categorise the output into mild or severe depression, so we applied logistic regression to train the model.
Before analysing, we split the data into training, validation, and test sets. Data splitting was performed to avoid overfitting, a condition where the learned hypothesis fits the training set very well but fails to generalise to new examples. We then used a confusion matrix to evaluate the performance of our classification model. For our model to be considered successful, we aim for an accuracy higher than 80%.
First and foremost, we cleaned the acquired data by removing a few features that were already available in the dataset because we felt they were unnecessary for our project. Besides that, we labelled respondents whose responses give a value of two or less as more likely to be mildly depressed, and those whose responses give a value of three or more as more likely to be severely depressed.
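As a rough illustration, this labelling rule can be sketched as follows (a minimal sketch; the 'SymptomScore' column is a hypothetical stand-in, not one of our actual columns):

import pandas as pd

def label_depression(score: int) -> int:
    # scores of two or less -> mildly depressed (0),
    # three or more -> severely depressed (1)
    return 0 if score <= 2 else 1

# hypothetical example with integer symptom scores
df = pd.DataFrame({'SymptomScore': [1, 2, 3, 4]})
df['DepressionLevel'] = df['SymptomScore'].apply(label_depression)
print(df)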
Moving on, after further data cleaning and arranging the data in an orderly manner, we could finally proceed to train our model using the machine learning algorithm we decided upon: logistic regression. Logistic regression estimates the probability of an event, which in our case is the classification between mild and severe depression. Logistic regression aims to distinguish between classes; unlike a generative algorithm such as Naive Bayes, it cannot generate information such as an image of the class it is trying to predict. Logistic regression also helps us assess which input variable is responsible for the greatest change in the predicted value. What logistic regression offers that many other algorithms do not is well-calibrated probabilities along with the classification results, which is an advantage over models that only give the final classification. If one training example has a 95% probability for a class and another has a 55% probability for the same class, we gain an inference about which examples the model is more confident about for the formulated problem. Due to its simple probabilistic interpretation, the training time of the logistic regression algorithm is far shorter than that of more complex algorithms, such as an artificial neural network.
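For illustration, a minimal sketch of these calibrated probabilities with scikit-learn on toy data, not our dataset:

import numpy as np
from sklearn.linear_model import LogisticRegression

# toy data: one feature, binary labels
X = np.array([[0.1], [0.4], [0.6], [0.9]])
y = np.array([0, 0, 1, 1])

model = LogisticRegression().fit(X, y)

print(model.predict(X))        # hard 0/1 classifications
print(model.predict_proba(X))  # calibrated per-class probabilities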
We now proceed to an explanation of the features that we used. Depression has been found in studies to cause people to think more slowly than others: a person who is depressed may exhibit slow speech or difficulty understanding and registering information. In a recent study, participants were asked to count backwards from one hundred by seven; people with depression were shown to be slower at this and to make more mistakes (Tracy, 2022). People who are suffering from depression are also more likely to have suicidal thoughts. According to a research paper, there is a direct connection between depression and self-harming behaviour among adolescents; it is commonly used by victims as a coping technique, a source of comfort, a way to regulate moods, self-punishment, and sensation seeking.
Furthermore, the feature of whether the participants had part-time or full-time work was used, since juggling full-time study with full-time employment is one of the most difficult scenarios a student may face. Time management can be tough for such students, as they have to concentrate on achieving good grades while performing well at work, which can take a huge toll on their mental health (Santoro, 2021). The amount of time one spends studying can also be an indicator of depression: studies show a correlation between study habits and depression, and the more time one spends studying, the less likely one may be to suffer from it. This is because gaining excellent marks is one of the most pressing concerns for students; when a student's study habits are well planned and disciplined, they will have sufficient time to prepare for upcoming exams.
Next, the number of electronic gadgets and the time spent on them are considered contributing factors to depression among students. Excessive usage of digital gadgets was found to promote depression among users in a 2017 study: teens and adults who spent more than six hours a day staring at screens were substantially more likely to suffer from moderate to severe depression than those who spent less time on them (Madhav et al., 2017). According to another study, adolescents who used electronic devices for more than two hours a day were 1.71 times more likely to have depression (Al Salman et al., 2020). Spending so much time alone in front of a screen can exacerbate feelings of loneliness and disrupt true human interaction. This lack of real human interaction adds to people's feelings of depression, and their attempts to alleviate depression through screen time can create a vicious cycle that only adds fuel to the fire.
In addition, the feature of having little interest or pleasure in doing things, more commonly known as anhedonia, is yet another clear symptom of depression. Social withdrawal, lack of interest in hobbies one used to enjoy, and diminished pleasure derived from daily activities are some of the clear signs that one could be suffering from severe depression. It goes to show how a serious condition like anhedonia can affect a person to the point that nothing brings them joy anymore. What makes this worse is that it is extremely tough to bring them back to their normal state of mind and make them enjoy the little things in life. It is undoubtedly a core symptom of major depressive disorder.
Moreover, trouble falling or staying asleep, or sleeping too much, is another indicator of depression. Depression and sleep problems are closely linked: people with insomnia may have a tenfold higher risk of developing depression than people who get a good night's sleep, and among people with depression, 75% have trouble falling asleep or staying asleep. Sleep issues associated with depression include insomnia, hypersomnia, and obstructive sleep apnea. It is believed that about 20% of people with depression have obstructive sleep apnea and about 15% have hypersomnia, and many people with depression go back and forth between insomnia and hypersomnia during a single period of depression.
Other than that, we can also associate depression with one's inner thoughts, such as feeling bad about oneself or thinking that one is a failure who has let oneself and one's family down. People who feel bad about themselves are usually in an excessively self-blaming state and constantly think negatively about themselves (Zahn et al., 2015). In the long term, people who repeatedly experience this condition suffer a decrease in self-worth and feel hopeless, as they do not ask family and friends for help. A long-term negative mindset affects their mental health and can bring them to severe depression, which disturbs their daily tasks and life because the emotions of depressed people are unstable. Their careers and relationships will also be greatly affected by their negative mindset and low effectiveness. This forms a vicious cycle that makes the patients even more depressed and can finally result in serious psychological disease or suicidal thoughts.
Figure 1
Figure 2
From the training confusion matrix above (Figure 2), we can see that when the actual target is mildly depressed, our model correctly predicts 91% of the cases as mildly depressed but wrongly predicts 8.9% of them as severely depressed. When the actual target is severely depressed, our model correctly predicts 79% of the cases as severely depressed but misclassifies 21% of them as mildly depressed.
Figure 3
From the validation confusion matrix (Figure 3), we can see that when the actual target is mildly depressed, our model correctly predicts 92% of the cases as mildly depressed but incorrectly predicts 8.2% of them as severely depressed. When the actual target is severely depressed, our model correctly predicts 76% of the cases as severely depressed and misclassifies 24% of them as mildly depressed.
Figure 4
From the test confusion matrix (Figure 4), we can see that our model correctly predicts 85% of the mildly depressed cases (actual mildly depressed, predicted mildly depressed) but misclassifies 15% (actual mildly depressed, predicted severely depressed), and correctly predicts 78% of the severely depressed cases (actual severely depressed, predicted severely depressed) but misclassifies 22% (actual severely depressed, predicted mildly depressed).
Figure 5
As for the F1 score, our model achieved 80.30% on the training set, 78.74% on the validation set, and 75.18% on the test set.
Figure 6
Figure 7
To make predictions on new data with our logistic regression model, we created two new records. For the record in Figure 6, we set gender = 0 (Female) and age = 1 (19 to 24 years old), with higher average marks on 'Little interest or pleasure in doing things', 'Trouble falling or staying asleep, or sleeping too much', 'Feeling tired or having little energy', 'Poor appetite or overeating', 'Feeling bad about yourself or that you are a failure or have let yourself or your family down', 'Trouble concentrating on things, such as reading the newspaper or watching television', 'Moving or speaking so slowly that other people could have noticed or being so restless that you have been moving around a lot more than usual', 'Thoughts that you would be better off dead or of hurting yourself in some way', 'CurrentJob', 'StudyhourPerDay', 'Number of gadget', and 'hour spend on social media', together with a lower 'GPA'. After passing this record into the function we defined, our logistic regression model classified it as severely depressed (1) with a probability of about 0.913. For the record with lower average marks and a higher GPA, our model classified it as mildly depressed (0) with a probability of about 0.981. Both predictions match our hypothesis and the evidence we found.
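A minimal sketch of such a helper (the function name, input format, and scaler handling here are our assumptions; the function we actually defined may differ):

import pandas as pd

def predict_input(model, scaler, numeric_cols, new_input: dict):
    # wrap the single record in a one-row dataframe
    input_df = pd.DataFrame([new_input])
    # scale the numeric features with the scaler fitted on the training data
    input_df[numeric_cols] = scaler.transform(input_df[numeric_cols])
    X_input = input_df[numeric_cols]
    # predicted class (0 = mild, 1 = severe) and the probability of that class
    pred = model.predict(X_input)[0]
    prob = model.predict_proba(X_input)[0][list(model.classes_).index(pred)]
    return pred, prob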
This project has undoubtedly improved all of our groupmates' understanding of various machine learning algorithms and of how to implement an algorithm in a project. It has truly been an eye-opener for us, and we all agree that it helped us explore the Python language thoroughly. We would like to thank all the lecturers involved, including Dr Aznul Qalid and Dr Erma Rahayu, for assigning us this project, for always clearing our doubts along the way, and for teaching us this brand-new subject with the utmost patience from the very start. Although the project has been completed successfully, there is still room for improvement. Here are some suggestions for any future machine learning projects that we handle.
Trying Different Algorithms
By trying different algorithms, we can eventually identify which ones work best for our data and then use that information to improve the accuracy of our models. Since there are a great many machine learning algorithms out there, we can also cross-validate multiple algorithms on the same dataset and compare their accuracy against each other.
Hyperparameter Tuning
Hyperparameters are the parameters of a machine learning model that determine how it works. They include things like the number of layers in a deep neural network or the number of trees in an ensemble model, and they are set before training rather than learned from the data. This is where cross-validation is equally helpful: by splitting our data into training and test sets, we can try different combinations of hyperparameters on the training set and then see how well they perform on the test set, which helps us find the best combination for our model. Grid search, a method of determining the best combination of hyperparameters for the data, is another option. Grid search works by trying out all of the candidate parameter combinations until it discovers the one that gives the best result on our chosen metric (e.g., accuracy). After that, we can train our model using that set of hyperparameters.
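As a sketch of how grid search could look with scikit-learn (the parameter grid and toy data here are illustrative assumptions, not the values we actually tuned):

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# toy training data standing in for our prepared inputs/targets
rng = np.random.default_rng(42)
X_train = rng.random((100, 4))
y_train = (X_train.sum(axis=1) > 2).astype(int)

# candidate hyperparameter values to try exhaustively
param_grid = {
    'C': [0.01, 0.1, 1, 10],           # inverse regularisation strength
    'solver': ['liblinear', 'lbfgs'],  # optimisation algorithm
}

# 5-fold cross-validated grid search scored on accuracy
search = GridSearchCV(LogisticRegression(max_iter=1000), param_grid,
                      cv=5, scoring='accuracy')
search.fit(X_train, y_train)
print(search.best_params_, search.best_score_)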
REPORT EVALUATION 2
After that, we dropped unnecessary columns that are not related to depression, namely 'Educational Level' and 'Which of the following best describes your term-time accommodation?'. Apart from that, we also converted categorical (qualitative) data to numeric (quantitative) data, since machine learning models can only be trained on numbers; this ensures that all the data can be analysed straightforwardly and is less prone to error. Additionally, we identified and set 'DepressionLevel' as our target column and created a list of input columns for the other features.
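A minimal sketch of this step, assuming df is the dataframe loaded from the Kaggle dataset as in our code listing:

# drop the two columns that are unrelated to depression
df = df.drop(columns=[
    'Educational Level',
    'Which of the following best describes your term-time accommodation?',
])

# target column and the remaining input columns
target_col = 'DepressionLevel'
input_cols = [c for c in df.columns if c != target_col]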
Figure 8
It is always a good idea to look at the distributions of the various columns and see how they relate to the target column before training a machine learning model. Thus, we explored and visualised the data using histograms and checked for correlations between the different features we obtained.
Figure 9
The histogram above shows that students between 19 and 24 years old suffered from severe depression.
Figure 10
The histogram above shows that students with a GPA between 3.2 and 3.39 are the ones who are severely depressed.
Data Splitting
After that, we split our data into three sets: a training set, a cross-validation set, and a test set. This should be done to avoid the scenario where new features, developed based on the evaluation of the test-set error, end up fitting the test set itself. First, we split the data into a temporary train-and-validation set (80%) and a test set (20%) using the train_test_split function from the scikit-learn library. Then, the temporary train-and-validation set was split into a cross-validation set (25%) and a training set (75%). In the end, the original dataset was split into 60% for training, 20% for cross-validation, and 20% for testing, consisting of 600, 200, and 200 rows respectively. We then created inputs and targets for the training, validation, and test sets for further processing and model training.
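The two-stage split described above can be sketched as follows; the random seed mirrors our code listing, and the toy dataframe stands in for our 1000-row dataset:

import pandas as pd
from sklearn.model_selection import train_test_split

# toy stand-in for the cleaned 1000-row dataset
df = pd.DataFrame({'feature': range(1000), 'DepressionLevel': [0, 1] * 500})

# stage 1: hold out 20% of the rows as the test set
train_val_df, test_df = train_test_split(df, test_size=0.2, random_state=42)

# stage 2: 25% of the remaining 80% becomes the validation set (60/20/20 overall)
train_df, val_df = train_test_split(train_val_df, test_size=0.25, random_state=42)

print(len(train_df), len(val_df), len(test_df))  # 600 200 200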
Feature Scaling
Next, we used feature scaling (min-max normalisation) to bring all values to the same magnitude, so that all the independent features lie within a fixed range between 0 and 1. This improves the accuracy and integrity of our data while making the dataset easier to work with.
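Min-max normalisation rescales each feature x to (x - min) / (max - min). A minimal sketch with scikit-learn:

import numpy as np
from sklearn.preprocessing import MinMaxScaler

# toy feature column with values on different scales
X = np.array([[1.0], [5.0], [10.0]])

scaler = MinMaxScaler()            # maps each feature to the [0, 1] range
X_scaled = scaler.fit_transform(X)
print(X_scaled)                    # [[0.], [0.444...], [1.]]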
Model Evaluation
Evaluating a model is a core part of building an effective machine learning model. Therefore, we checked the trained model against our evaluation dataset, which contains inputs the model has not seen, and verified the correctness of our training.
The scikit-learn library has a few built-in methods that can be used to assess model performance. The first is the accuracy score. Model accuracy is defined as the ratio of true positives and true negatives to all positive and negative observations. The accuracy scores for the train, cross-validation, and test datasets are 86.83%, 86.50%, and 82.50% respectively, out of a maximum of 100%. The final 82.50% accuracy score on the test dataset gives us good confidence in the results that our model produces.
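In scikit-learn this metric is available as accuracy_score; equivalently, accuracy = (TP + TN) / (TP + TN + FP + FN). A toy illustration, not our actual data:

from sklearn.metrics import accuracy_score

y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]
# 3 of the 5 predictions match -> accuracy 0.6
print(accuracy_score(y_true, y_pred))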
Second, our model was also evaluated with the F1 score. The F1 score is a performance metric that gives equal weight to both precision and recall, making it an alternative to the plain accuracy metric. The F1 scores for the train, cross-validation, and test datasets are 80.30%, 78.74%, and 75.18% respectively, out of a maximum of 100%.
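Concretely, the F1 score is the harmonic mean of precision and recall, F1 = 2PR / (P + R). A toy illustration with scikit-learn, again not our actual data:

from sklearn.metrics import f1_score, precision_score, recall_score

y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]

p = precision_score(y_true, y_pred)  # 2/3: of 3 predicted positives, 2 correct
r = recall_score(y_true, y_pred)     # 2/3: of 3 actual positives, 2 found
print(2 * p * r / (p + r))           # harmonic mean computed by hand
print(f1_score(y_true, y_pred))      # same value from scikit-learn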
Confusion Matrix
In addition, a confusion matrix was plotted. It is useful because it gives a direct comparison of values such as true positives, false positives, true negatives, and false negatives, and lets us visualise the breakdown of correctly and incorrectly classified inputs. With it, we can characterise the performance of our algorithm.
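A minimal sketch of producing such a matrix with scikit-learn, using the same toy labels as above:

from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]

# rows are actual classes, columns are predicted classes:
# [[TN, FP],
#  [FN, TP]]
print(confusion_matrix(y_true, y_pred))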
Figure 11
Model Deployment
After making sure all the data is ready for deployment, we prepared for
container deployment to deploy our code to production. We took the validated
features in a staging environment and deployed them into the production environment,
so they are readied for release.
In our project, we used binary logistic regression, a statistical method for predicting the relationship between a dependent variable and independent variables. The dependent variable is a binary variable that takes two values, for example 0 or 1, true or false, yes or no. In the logistic regression model, we first take the linear combination, or weighted sum, of the input features. We then apply the sigmoid function to the result to obtain a number between 0 and 1, which represents the probability of the input being classified as 'yes' rather than 'no'.
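In other words, the model computes z = w·x + b and then p = sigmoid(z) = 1 / (1 + e^(-z)). A minimal sketch with illustrative weights, not our model's fitted values:

import numpy as np

def sigmoid(z: float) -> float:
    # squashes any real number into the (0, 1) interval
    return 1.0 / (1.0 + np.exp(-z))

# illustrative weights w, bias b, and input features x
w = np.array([0.8, -0.5, 1.2])
b = -0.3
x = np.array([0.6, 0.1, 0.9])

z = np.dot(w, x) + b       # weighted sum of the input features
p = sigmoid(z)             # probability of the positive class
print(p, int(p >= 0.5))    # probability and the 0/1 classification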
Figure 12
Figure 13
Since our aim is to categorise the output into mild or severe depression, we picked logistic regression over the other machine learning algorithms to solve the problem. The dependent variable in our indicator takes either 0 or 1, where 0 denotes a mild case of depression and 1 denotes a severe case. The factors that cause depression were the input features we employed. This algorithm is straightforward to comprehend and easy to construct and train. Besides, it is appropriate for small datasets and has a high level of accuracy. Finding correlations between features is also a benefit of this algorithm.
One of the most important phases in data pre-processing, prior to developing a machine learning model, is feature scaling; scaling may make the difference between a bad and a good machine learning model. Why do we need scaling? Machine learning algorithms just look at numbers, and if there is a significant difference in range, such as some features ranging in the thousands against others ranging in the tens, the model assumes that the larger-ranging numbers have some form of superiority, so those numbers start playing a more decisive role while training the model. Feature scaling is needed to put every feature on the same footing without any upfront importance. Therefore, we implemented feature scaling in our project. Due to the differences in the ranges of the features, gradient descent would otherwise take a different step size for each feature. We scaled the data before feeding it to the model to guarantee that gradient descent moves smoothly towards the minimum and that its steps are updated at the same rate for all features. It is advisable to bring every feature into approximately a (0, 1) or (-1, 1) range; optimisation algorithms also work better in practice with smaller numbers.
Hyperparameters are a way to tailor the behaviour of the algorithm to the specific dataset. Hyperparameters are not the same as parameters, which are the internal coefficients or weights found by the learning procedure. In our logistic regression model, we do not have critical hyperparameters to tune. Hence, we configured our algorithm in the context of its scikit-learn implementation (Python) by choosing the solver that gives the best performance for our indicator. Since we have only 1000 records, the 'liblinear' solver is used.
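In scikit-learn this is a single constructor argument, the same call that appears in our code listing:

from sklearn.linear_model import LogisticRegression

# 'liblinear' is a good choice for small datasets such as ours
model = LogisticRegression(solver='liblinear')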
Figure 14
After we gather, clean, pre-process, and wrangle our data, the next step is to feed it to the logistic regression model and obtain outputs as probabilities. It is important for us to use cross-validation to evaluate the machine learning model, validating it on the 20% of our available input data held out for this purpose. This technique is used to detect overfitting: if our model has been trained too well on the training data, it will fail to generalise the pattern and will make inaccurate predictions when given new data. Thus, we split our complete data into three sets (train, validation, and test). Cross-validation allows us to tune hyperparameters using only our training and validation data, keeping the test set as a truly unseen dataset for selecting our final model. Then, we observe how well our model performs on the new test dataset.
Figure 15
Figure 16
Import Libraries
# import libraries
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import seaborn as sns
import plotly.express as px
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.linear_model import LogisticRegression
df = pd.read_csv('depression_dataset.csv')  # filename is a hypothetical stand-in
df
Data Visualization
# visualise feature distributions coloured by depression level
px.histogram(df, x='Gender', title='Gender vs Depression level', color='DepressionLevel')
px.histogram(df, x='GPA', title='GPA vs Depression level', color='DepressionLevel')
# map each categorical feature to numeric codes
df.Gender = df.Gender.map({'Female': 0, 'Male': 1})
df.Age = df.Age.map({'18 years or less': 0, '19 to 24 years': 1, '25 years and above': 2})
df.CurrentJob = df.CurrentJob.map({'No': 0, 'Part Time': 1, 'Full Time': 2})
df.StudyHourPerDay = df.StudyHourPerDay.map({'1 - 2 hours': 0, '2 - 4 hours': 1, 'More than 4 hours': 2})
df.NumOfGadget = df.NumOfGadget.map({'None': 0, '1 - 3': 1, '4 - 6': 2, 'More than 6': 3})
df.SocialMediaSpend = df.SocialMediaSpend.map({'Not Applicable': 0, '1 - 2 Hours': 1, '2 - 4 Hours': 2, 'More than 4 Hours': 3})
# 20% of the data goes into test_df, the remainder into train_val_df
train_val_df, test_df = train_test_split(df, test_size=0.2, random_state=42)
# split the remainder 75/25 into training and validation sets (60/20/20 overall)
train_df, val_df = train_test_split(train_val_df, test_size=0.25, random_state=42)
print(train_df.shape)
print(val_df.shape)
print(test_df.shape)
# input columns are from first column to the second last column
input_cols = list(train_df.columns)[0:-1]
target_cols = 'DepressionLevel'
print(input_cols)
print(target_cols)
# create input and target frames for each split
train_inputs = train_df[input_cols].copy()
train_targets = train_df[target_cols].copy()
val_inputs = val_df[input_cols].copy()
val_targets = val_df[target_cols].copy()
test_inputs = test_df[input_cols].copy()
test_targets = test_df[target_cols].copy()
# check if it is correct
train_targets
# identify which of the columns are numerical and which are categorical
numeric_cols = train_inputs.select_dtypes(include=np.number).columns.tolist()
Feature Scaling
# scale numeric features to a (0, 1) range using min-max normalisation
df[numeric_cols].describe()
scaler = MinMaxScaler()
scaler.fit(df[numeric_cols])
print('Minimum', list(scaler.data_min_))
print('Maximum', list(scaler.data_max_))
# separately scale the training, validation and test sets
# using the transform method of the fitted scaler
train_inputs[numeric_cols] = scaler.transform(train_inputs[numeric_cols])
val_inputs[numeric_cols] = scaler.transform(val_inputs[numeric_cols])
test_inputs[numeric_cols] = scaler.transform(test_inputs[numeric_cols])
# save the processed inputs and targets to parquet files for reuse
train_inputs.to_parquet('train_inputs.parquet')
val_inputs.to_parquet('val_inputs.parquet')
test_inputs.to_parquet('test_inputs.parquet')
%%time
pd.DataFrame(train_targets).to_parquet('train_targets.parquet')
pd.DataFrame(val_targets).to_parquet('val_targets.parquet')
pd.DataFrame(test_targets).to_parquet('test_targets.parquet')
%%time
train_targets = pd.read_parquet('train_targets.parquet')[target_cols]
val_targets = pd.read_parquet('val_targets.parquet')[target_cols]
test_targets = pd.read_parquet('test_targets.parquet')[target_cols]
# train the logistic regression model on the scaled training data
model = LogisticRegression(solver='liblinear')
model.fit(train_inputs[numeric_cols], train_targets)
print(model.coef_.tolist())
print(model.intercept_)
Model Evaluation
# use the trained model to make predictions on the training, validation and test sets
X_train = train_inputs[numeric_cols]
X_val = val_inputs[numeric_cols]
X_test = test_inputs[numeric_cols]
train_preds = model.predict(X_train)
train_preds
train_targets
train_probs = model.predict_proba(X_train)
train_probs
model.classes_
# select the numeric feature columns of the new input before predicting
X_input = input_df[numeric_cols]
12.0 References
Simmons, W. K., Burrows, K., Avery, J. A., Kerr, K. L., Bodurka, J., Savage, C. R., et al. https://doi.org/10.1176/appi.ajp.2015.15020162
Zahn, R., Lythe, K. E., Gethin, J. A., Green, S., Deakin, J. F., Young, A. H., & Moll, J. (2015, November 1). The role of self-blame and worthlessness in the psychopathology of major depressive disorder. https://doi.org/10.1016/j.jad.2015.08.001
Maxwell, M. A., & Cole, D. A. (2009, January 7). Weight change and appetite disturbance as symptoms of adolescent depression. Clinical Psychology Review. https://www.sciencedirect.com/science/article/abs/pii/S0272735809000075
Theobald, M. (2013, January 23). Depression, memory loss, and concentration. Everyday Health. https://www.everydayhealth.com/hs/major-depression/depression-memory-loss-and-concentration/
Griffey, H. (2019, April 11). How do depression and anxiety affect concentration? Welldoing. https://welldoing.org/article/how-do-depression-anxiety-affect-concentration
Al Salman, Z. H., Al Debel, F. A., Al Zakaria, F. M., Shafey, M. M., & Darwish, M. A. (2020). Anxiety and depression and their relation to the use of electronic devices. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6984035/
Madhav, K. C., Prasad, S. P., & Sherchan, S. (2017). Association between screen time and depression among US adults. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5574844/
Santoro, I. (2021, December 6). How being a full-time student and employee affects mental health. The Anchor. https://www.anchorweb.org/post/how-being-a-full-time-student-and-employee-affects-mental-health
Tracy, N. (2022). Depression and slow thinking (reduced processing speed). HealthyPlace. https://www.healthyplace.com/depression/symptoms/depression-and-slow-thinking-reduced-processing-speed
B, H. N. (2020, June 1). Confusion matrix, accuracy, precision, recall, F1 score. Medium. https://medium.com/analytics-vidhya/confusion-matrix-accuracy-precision-recall-f1-score-ade299cf63cd
Scikit-learn documentation. sklearn.linear_model.LogisticRegression. https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html
Stefanovic, S. (2021, April 2). #005 PyTorch - Logistic regression in PyTorch. Master Data Science. https://datahacker.rs/005-pytorch-logistic-regression-in-pytorch/
Tune hyperparameters for classification machine learning algorithms. (2020, August 27). Machine Learning Mastery. https://machinelearningmastery.com/hyperparameters-for-classification-machine-learning-algorithms/
Zeta, C. Z. (2021, March 7). 5 effective ways to improve the accuracy of your machine learning models. Towards Data Science. https://towardsdatascience.com/5-effective-ways-to-improve-the-accuracy-of-your-machine-learning-models-f1ea1f2b5d65
WebMD Editorial Contributors. (2021, March 25). What to know about depression in college students. WebMD. https://www.webmd.com/depression/what-to-know-about-depression-in-college-students
Verma, Y. V. (2022, May 6). How to improve the accuracy of a classification model? Analytics India Magazine. https://analyticsindiamag.com/how-to-improve-the-accuracy-of-a-classification-model/
Shin, T. S. (2020, September 23). How I consistently improve my machine learning models. KDnuggets. https://www.kdnuggets.com/2020/09/improve-machine-learning-models-accuracy.htm
Healthline. Anhedonia. https://www.healthline.com/health/depression/anhedonia
Ahmed, G., Negash, A., Kerebih, H., Alemu, D., & Tesfaye, Y. (2020, July 28). Prevalence and associated factors of depression among Jimma University students. https://ijmhs.biomedcentral.com/articles/10.1186/s13033-020-00384-5
Burry, M. B. (2020, April 13). Why depression makes you tired and how to deal with fatigue. Insider. https://www.insider.com/guides/health/mental-health/why-does-depression-make-you-tired
Johns Hopkins Medicine. Depression and sleep: Understanding the connection. https://www.hopkinsmedicine.org/health/wellness-and-prevention/depression-and-sleep-understanding-the-connection
OpenGenus IQ. Advantages and disadvantages of logistic regression. https://iq.opengenus.org/advantages-and-disadvantages-of-logistic-regression