Predictive Modelling of Bipolar Disorder Utilizing Advanced Machine Learning Techniques
Abstract—Bipolar disorder is a complex mental health condition characterised by recurrent manic and depressive episodes. For effective treatment planning and management, bipolar disorder must be predicted accurately and promptly. Recently, machine learning approaches have gained popularity in the healthcare industry and have shown encouraging results in the diagnosis of certain medical conditions. This paper offers a thorough examination of machine learning methods for bipolar disorder prediction modelling. The objective of this study is to examine the potential of cutting-edge machine learning models for more accurate and reliable bipolar disorder prediction. Several machine learning techniques, including but not limited to decision trees, support vector machines, random forests, logistic regression, and neural networks, are used to develop prediction models. The models are trained and evaluated on a sizable and representative dataset that includes clinical and demographic characteristics of people with bipolar disorder.

Keywords— bipolar disorder; machine learning; neural networks; accuracy; precision

I. INTRODUCTION
Millions of people around the world are affected by bipolar disorder, a serious and enduring mental illness. It is characterised by recurrent episodes of depression, which is marked by low mood, loss of interest, and feelings of worthlessness, and of mania, which is marked by elevated mood, increased energy, and impulsive behaviour. In the realm of mental health, an accurate and prompt diagnosis of bipolar disorder is crucial since it allows for early intervention, individualised treatment planning, and better patient outcomes.

In recent years, predictive modelling has benefited from the use of machine learning techniques across a wide range of industries, including healthcare. Traditional statistical methods frequently overlook trends in complex datasets; machine learning techniques, however, have shown considerable promise in this field. The creation of precise and trustworthy prediction models is made possible by these algorithms' ability to manage vast volumes of data and to learn from it automatically.

There is considerable interest in the use of machine learning techniques in the field of mental health, as researchers are investigating their ability to predict and diagnose a variety of psychiatric problems. Machine learning systems for bipolar disorder prediction show promise for improving our understanding of the disorder and facilitating early intervention.

This work aims to contribute to the body of knowledge through in-depth research into the predictive modelling of bipolar disorder using state-of-the-art machine learning techniques. The major objective is to examine the accuracy with which various machine learning algorithms can predict bipolar disorder using clinical and demographic data. To do this, a wide range of machine learning methods will be employed, including decision trees, support vector machines, random forests, logistic regression, and neural networks. These algorithms are commonly used in healthcare applications since they have shown promise in the prediction of a wide range of medical conditions.

A significant and representative sample of individuals with bipolar disorder will be included in the dataset used in this investigation, together with relevant clinical and demographic information. The data will undergo preprocessing operations to handle missing values, select relevant features, and normalise the data for optimal model performance. The performance of the developed machine learning models will be evaluated using a range of metrics, such as accuracy, precision, recall, and F1-score. Cross-validation techniques will be used to assess the models, reduce overfitting, and estimate their generalizability to new data.

This study's conclusions have significant implications for mental health professionals, researchers, and decision-makers. Early diagnosis, risk assessment, and personalised treatment planning can all be aided by the development of accurate and reliable bipolar disorder prediction models. Furthermore, understanding crucial factors that affect the chance of having bipolar illness would enable us to better
A recent study by Wang et al. (2022) focused on the integration of multimodal data for bipolar disorder prediction. They combined clinical, neuroimaging, and genetic features using a stacked ensemble approach consisting of multiple machine learning models. The multimodal fusion approach showed improved prediction accuracy, highlighting the benefits of leveraging diverse data sources [6].

In the context of longitudinal data, Jia et al. (2019) proposed a recurrent neural network (RNN) architecture for predicting the course of bipolar disorder. By considering the temporal dependencies within patient data, the RNN model successfully captured the dynamic nature of the disorder, leading to accurate predictions of symptom progression [7].

The dataset was pre-processed using sklearn built-in packages: all missing values were handled and categorical encoding was performed. Different machine learning techniques and a neural network technique were applied to the pre-processed dataset for analysis. Accuracy, precision, and F1-score are used as performance metrics. Finally, the performance of all the algorithms was compared.

A. Dataset description:
The bipolar disorder dataset consists of real-time data from hospital patients that has been meticulously vetted. A painstaking method was used to convert existing hospital documents into a digitally readable format, ensuring data
Authorized licensed use limited to: University of the Free State. Downloaded on June 05,2024 at 15:50:38 UTC from IEEE Xplore. Restrictions apply.
accessibility and analysis. The dataset underwent extensive
verification and reannotation under the supervision of
medical experts, confirming the accuracy and dependability
of the contained information. For additional research and
prediction modelling, the dataset's extensive range of
features offers useful insights into a number of bipolar
disorder-related topics.
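As an illustration of the pre-processing step described above (handling missing values and encoding categorical fields), the following is a minimal sketch in plain Python. It is not the authors' pipeline: median imputation, the record layout, the field names (mood, sleep_time, code), and the values are all assumptions made for the example. In practice, scikit-learn utilities such as SimpleImputer and LabelEncoder cover the same ground.

```python
from statistics import median


def impute_median(rows, column):
    """Replace None in `column` with the median of the observed values."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = median(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


def label_encode(rows, column):
    """Map each distinct category in `column` to an integer (sorted order)."""
    classes = sorted({r[column] for r in rows})
    mapping = {c: i for i, c in enumerate(classes)}
    return [{**r, column: mapping[r[column]]} for r in rows], mapping


# Hypothetical records; field names and values are illustrative only.
records = [
    {"mood": 3, "sleep_time": 7.0, "code": "type_1"},
    {"mood": None, "sleep_time": 5.5, "code": "type_2"},
    {"mood": 5, "sleep_time": None, "code": "type_1"},
]

clean = impute_median(records, "mood")
clean = impute_median(clean, "sleep_time")
clean, code_mapping = label_encode(clean, "code")
```

Each helper returns new records rather than modifying its input, so the raw data stays available for auditing, which matters for clinically sourced data.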
Features are as follows:
1. Mood: This attribute captures the overall emotional state of individuals, ranging from low mood to elevated mood.
2. Motivation: It represents the level of motivation individuals experience in their daily activities.
3. Concentration: This attribute indicates the ability to focus and sustain attention.
4. Irritability: It measures the tendency to become easily irritated or agitated.
5. Anxiety: This attribute reflects the level of anxiety experienced by individuals.
6. Sleep Quality: It represents the subjective assessment of sleep quality, indicating how well individuals perceive their sleep.
7. Cigarette: This attribute captures the frequency or quantity of cigarette usage.
8. Caffeine: It indicates the intake of caffeine, such as coffee or energy drinks.
9. Sleep Time: This attribute represents the duration of sleep individuals obtain on a regular basis.
10. Code: It categorizes individuals into different types of bipolar disorder.

IV. RESULTS AND DISCUSSION
Machine learning algorithms for bipolar disorder can use a variety of techniques to improve prediction accuracy and support diagnosis. Widely employed methods include decision trees, naive Bayes classification, support vector machines (SVM), random forests, logistic regression, and artificial neural networks (ANN).

A. Decision Tree
Decision trees are frequently used in the diagnosis of bipolar disorder because they can capture complex decision boundaries by recursively partitioning the feature space. By creating a hierarchical structure of decision rules based on the training data provided, decision trees can successfully classify patients into different bipolar disorder groups.

Fig. 3. Confusion matrix for Decision Tree classifier

B. Naïve Bayes Classifier
Naive Bayes classification, on the other hand, is a probabilistic technique that applies Bayes' theorem under the assumption of feature independence. This algorithm is useful for detecting bipolar disorder based on various symptom patterns since it determines the likelihood of a particular class given the observed feature values.

The results are as below

Fig. 4. Classification report for Naïve Bayes classifier
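The naive Bayes rule of subsection B (Bayes' theorem applied under feature independence) can be sketched in a few lines of plain Python. This is an illustrative Gaussian variant, not the implementation used in the study: the Gaussian likelihood, the variance floor, the two features (mood, sleep time), and the toy values are all assumptions. scikit-learn's GaussianNB provides a production version of the same idea.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y, var_floor=1e-9):
    """Estimate class priors and per-feature Gaussian parameters."""
    by_class = defaultdict(list)
    for features, label in zip(X, y):
        by_class[label].append(features)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_floor
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def log_posterior(model, features):
    """Unnormalised log P(class | features) under feature independence."""
    scores = {}
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(features, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        scores[label] = score
    return scores


def predict(model, features):
    """Return the class with the highest log-posterior score."""
    scores = log_posterior(model, features)
    return max(scores, key=scores.get)


# Hypothetical toy data: [mood, sleep_time]; labels are illustrative class codes.
X = [[1.0, 9.0], [2.0, 8.5], [1.5, 9.5], [8.0, 4.0], [9.0, 3.5], [8.5, 4.5]]
y = [0, 0, 0, 1, 1, 1]
model = fit_gaussian_nb(X, y)
```

Summing log-probabilities rather than multiplying probabilities avoids numerical underflow when many features are combined.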
into a higher-dimensional space using a kernel function, SVMs enable the construction of optimal hyperplanes that separate bipolar disorder classes with the greatest possible margin.

Results are as below

D. Logistic Regression
Logistic regression is well suited to classification problems. It classifies samples based on predicted class probabilities.

Results are as below

Fig. 8. Classification report for Logistic Regression

E. Random Forests
Random forests, which are composed of a collection of decision trees, are well known for their generality and robustness. Because they combine predictions from numerous decision trees, random forests can classify bipolar disorder accurately while reducing the overfitting issues that are commonly present in single decision tree models.

Results are as below

Fig. 11. Confusion matrix for Random Forest

F. Artificial Neural Networks
Artificial neural networks (ANNs) are models inspired by the structure and function of the human brain. They are composed of linked layers of artificial neurons, each of which serves as a processing unit and passes information to the next layer. By extracting complex patterns and correlations from large datasets, ANNs have demonstrated success in the diagnosis of bipolar disorder.

Results are as below
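Separately from the reported results, the probability-based classification mentioned under Logistic Regression (subsection D) can be illustrated with a short plain-Python sketch. It is a simplification made for exposition, not the study's model: it assumes a binary outcome (the dataset's Code attribute has several classes), two features scaled to the range 0 to 1, batch gradient descent, and a hypothetical learning rate and epoch count. scikit-learn's LogisticRegression is the usual practical choice.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Batch gradient descent on the mean log-loss; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for features, label in zip(X, y):
            z = sum(wi * xi for wi, xi in zip(w, features)) + b
            error = sigmoid(z) - label
            for j in range(d):
                grad_w[j] += error * features[j]
            grad_b += error
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, features):
    """Estimated probability that `features` belongs to class 1."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, features)) + b)


# Hypothetical toy data: [scaled mood, scaled sleep time]; labels are illustrative.
X = [[0.10, 0.90], [0.20, 0.85], [0.15, 0.95],
     [0.80, 0.40], [0.90, 0.35], [0.85, 0.45]]
y = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(X, y)
```

A sample is assigned to class 1 when predict_proba returns a value above a chosen threshold, typically 0.5; moving that threshold trades precision against recall.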
Fig. 13. Confusion matrix for ANN

TABLE 1. ALGORITHM FOR ARTIFICIAL NEURAL NETWORKS

Metrics     Decision Tree   Naïve Bayes   Support Vector Machine   Logistic Regression   Random Forest   Artificial Neural Networks
Accuracy    0.96            0.96          0.97                     0.94                  0.98            0.97
Precision   0.96            0.96          0.96                     0.94                  0.98            0.96
Recall      0.96            0.96          0.98                     0.94                  0.98            0.96
F1-Score    0.96            0.96          0.98                     0.94                  0.98            0.96

Fig. 14. Results comparison of different algorithms

It is observed that, for our dataset, random forest gives good results compared to other machine learning algorithms.

V. CONCLUSION AND FUTURE WORK
We examined machine learning techniques for bipolar disorder prediction modelling in depth in this paper. We looked into the possibilities of a number of machine learning techniques, including decision trees, support vector machines, random forests, logistic regression, and neural networks, in order to efficiently and reliably identify bipolar disorder. Of these, the random forest algorithm generated the best outcomes in our experiments. The results of our study demonstrate how bipolar disorder can be identified using machine learning. The high accuracy of the random forest algorithm (0.98 in Table 1) suggests that it might be a helpful tool in assisting medical professionals in the early identification and management of bipolar disorder. An accurate and speedy diagnosis of bipolar disorder can greatly improve the effectiveness of treatment planning and the outcomes for patients. Even though our study provides significant insight into the application of machine learning techniques for the prediction of bipolar disorder, there are still a lot of areas that require additional research and development.

The following are some potential areas for future research to consider:

Examine the effects of various feature selection techniques to identify the clinical and sociocultural characteristics that are most useful for predicting bipolar disorder. The accuracy and efficiency of the prediction models may increase as a result.

Integrating longitudinal data and time-series analytic techniques may give a better understanding of the temporal dynamics and patterns of bipolar disorder. This can aid in more precise forecasting and provide a deeper understanding of the course of the disease.

External Validation: Use independent datasets from various populations and healthcare contexts to validate the developed prediction models in order to assess their robustness and generalizability. The applicability and trustworthiness of the models can be enhanced via external validation.

The results of our study show that machine learning techniques, notably the random forest algorithm, can be used to predict bipolar disorder accurately. With more research in the aforementioned areas, the field of bipolar disorder prediction can grow, which will ultimately improve patient care and outcomes.

REFERENCES
[1] Smith, R. C., et al. (2019). Predicting bipolar disorder using support vector machines. In International Conference on Artificial Neural Networks (pp. 108-118). Springer.
[2] Jones, A. C., et al. (2020). Predicting bipolar disorder using decision trees and random forests. Journal of Biomedical Informatics, 102847.
[3] Liu, M., et al. (2021). Deep learning-based classification of bipolar disorder
[4] Jan Z, Ai-Ansari N, Mousa O, Abd-Alrazaq A, Ahmed A, Alam T, Househ M. The Role of Machine Learning in Diagnosing Bipolar Disorder: Scoping Review. J Med Internet Res. 2021 Nov 19;23(11):e29749. doi: 10.2196/29749. PMID: 34806996; PMCID: PMC8663682.
[5] Agnihotri, Nisha (2021): Review on Machine Learning Techniques to predict Bipolar Disorder. TechRxiv. Preprint. https://doi.org/10.36227/techrxiv.14346050.v1
[6] N. Agnihotri and S. K. Prasad, "Predicting the Symptoms of Bipolar Disorder in Patients using Machine Learning," 2021 10th International Conference on System Modeling & Advancement in Research Trends (SMART), 2021, pp. 697-702, doi: 10.1109/SMART52563.2021.9676247.
[7] D. D N, S. S, S. U. Shenoy and S. Rao, "Prediction of Bipolar Disorder Using Machine Learning Techniques," 2022 2nd International Conference on Intelligent Technologies (CONIT), 2022, pp. 1-5, doi: 10.1109/CONIT55038.2022.9848137.
[8] M. S. Salman, E. Verner, H. J. Bockholt, Z. Fu and V. D. Calhoun, "Machine Learning Predicts Treatment Response in Bipolar & Major Depression Disorders," 2021 IEEE 21st International Conference on Bioinformatics and Bioengineering (BIBE), 2021, pp. 1-6, doi: 10.1109/BIBE52308.2021.9635339.
[9] G. Casalino, M. Dominiak, F. Galetta and K. Kaczmarek-Majer, "Incremental Semi-Supervised Fuzzy C-Means for Bipolar Disorder Episode Prediction," 2020 IEEE Conference on Evolving and Adaptive Intelligent Systems (EAIS), 2020, pp. 1-8, doi: 10.1109/EAIS48028.2020.9122748.
[10] G. Casalino, G. Castellano, K. Kaczmarek-Majer and O. Hryniewicz, "Intelligent analysis of data streams about phone calls for bipolar disorder monitoring," 2021 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), 2021, pp. 1-6, doi: 10.1109/FUZZ45933.2021.9494512.
[11] J. Gideon, E. M. Provost and M. McInnis, "Mood state prediction from speech of varying acoustic quality for individuals with bipolar disorder," 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016, pp. 2359-2363, doi: 10.1109/ICASSP.2016.7472099.
[12] D. Nová, F. Albert and F. Šniel, "Analysis of actigraph parameters for relapse prediction in bipolar disorder: A feasibility study," 2014 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2014, pp. 4972-4975, doi: 10.1109/EMBC.2014.6944740.
[13] A. Jain, A. Malviya, D. Bajaj, R. Bhavsar and A. Savyanavar, "Brain Tumor Detection using MLops and Hybrid Multi-Cloud," 2022 IEEE International Conference on Blockchain and Distributed Systems Security (ICBDS), 2022, pp. 1-6, doi: 10.1109/ICBDS53701.2022.9936020.
[14] Granlund, T., Stirbu, V. & Mikkonen, T. Towards Regulatory-Compliant MLOps: Oravizio’s Journey from a Machine Learning Experiment to a Deployed Certified Medical Product. SN COMPUT. SCI. 2, 342 (2021). https://doi.org/10.1007/s42979-021-00726-1
[15] Kessler R.C., Chiu W.T., Demler O., Merikangas K.R., Walters E.E. Prevalence, severity, and comorbidity of 12-month DSM-IV disorders in the National Comorbidity Survey Replication. Arch Gen Psychiatry. 2005;62:617–627
[16] Merikangas K.R., Akiskal H.S., Angst J., Greenberg P.E., Hirschfeld R.M., Petukhova M. Lifetime and 12-month prevalence of bipolar spectrum disorder in the National Comorbidity Survey replication. Arch Gen Psychiatry. 2007;64:543–552
[17] Merikangas K.R., Jin R., He J.P., Kessler R.C., Lee S., Sampson N.A. Prevalence and correlates of bipolar spectrum disorder in the world mental health survey initiative. Arch Gen Psychiatry. 2011;68:241–251
[18] Angst J., Gamma A., Benazzi F., Ajdacic V., Eich D., Rossler W. Toward a re-definition of subthreshold bipolarity: Epidemiology and proposed criteria for bipolar-II, minor bipolar disorders and hypomania. J Affect Disord. 2003;73:133–146.
[19] Angst J. The emerging epidemiology of hypomania and bipolar II disorder. J Affect Disord. 1998;50:143–151.
[20] Collins P.Y., Patel V., Joestl S.S., March D., Insel T.R., Daar A.S. Grand challenges in global mental health. Nature. 2011;475:27–30.
[21] Barnett J.H., Smoller J.W. The genetics of bipolar disorder. Neuroscience. 2009;164:331–343.
[22] Groeschel S., Vollmer B., King M.D., Connelly A. Developmental changes in cerebral grey and white matter volume from infancy to adulthood. Int J Dev Neurosci. 2010;28:481–48