Cloud Burst Prediction System Using Machine Learning


2024 OPJU International Technology Conference (OTCON) on Smart Computing for Innovation and Advancement in Industry 4.0

Dr. G. Nagappan
Department of CSE
Saveetha Engineering College (Autonomous)
Chennai, India
Nagappan.cse@saveetha.ac.in

Herin Shani.S
Department of CSE
Saveetha Engineering College (Autonomous)
Chennai, India

979-8-3503-7378-3/24/$31.00 ©2024 IEEE

Abstract—This paper presents an approach that combines the Gramian Angular Field (GAF) with
Convolutional Neural Networks (CNN) to improve the accuracy and reliability of cloudburst
prediction systems. Together, GAF and CNN offer a way to model the complexities inherent in
meteorological data. GAF encodes meteorological time series as images that preserve their
temporal correlations. CNN, known for its proficiency in spatial data analysis, is adept at
recognizing intricate patterns within such images. By encoding the data as images and
recognizing the patterns in them, this approach has the potential to significantly improve the
precision and lead time of cloudburst predictions, thereby enhancing early warning systems and
disaster preparedness.

Keywords-Meteorological Data Analysis, Machine Learning in Meteorology, Early Warning Systems,
Disaster Preparedness, Extreme Weather Forecasting, Data-driven Prediction, Meteorological
Image Analysis, Enhanced Prediction Accuracy.

I. INTRODUCTION

This project focuses on using the Gramian Angular Field (GAF) in combination with
Convolutional Neural Networks (CNN) for cloudburst prediction. By combining GAF and CNN, the
approach aims to harness the strengths of time-series image encoding and deep learning to
capture the intricate spatial and temporal patterns inherent in meteorological data. This
coupling aims to improve the robustness and generalization of cloudburst prediction models.
The application of GAF and CNN in cloudburst prediction not only holds promise for increased
forecasting accuracy but also presents an opportunity to handle complex, high-dimensional
meteorological data more effectively. The ability to recognize subtle patterns and
relationships in meteorological data, combined with an image encoding that retains the
temporal structure of the underlying series, has the potential to significantly enhance the
precision and lead time of cloudburst predictions. This introduction sets the stage for
exploring the application of GAF and CNN in cloudburst prediction systems, highlighting the
potential advancements in early warning systems and disaster preparedness. The subsequent
sections delve deeper into the methodologies, algorithms, and outcomes of utilizing GAF and
CNN for more accurate, timely, and reliable cloudburst predictions.

II. RELATED WORKS

Cloudburst, a serverless computing platform, enables stateful Python programming by using
Anna for state sharing and caching, significantly reducing state-management issues and
improving consistency across serverless applications [1]. Utilizing AI and data science, an
advanced system aims to predict destructive cloudbursts in hilly areas by analyzing pressure,
humidity, and temperature, providing early warnings to vulnerable regions; the work was
inspired by the events in Koottickal village [2]. Another study assesses the accuracy of NETRA
model alerts for Western Himalayan cloudbursts in Uttarakhand, which were notably successful
in the Chamoli, Rudraprayag, and Uttarkashi districts. May emerges as the most critical month,
and a co-occurrence analysis emphasizes Pauri and Uttarkashi [3]. Cloudburst, a deployable
scheme that uses forward error correction (FEC) over multiple paths, reduces the latency of
short flows in datacenters. It spreads FEC-coded packets proactively, cutting message
completion time significantly compared to DCTCP and PIAS [4]. Uttarakhand's sparse rain gauge
networks heighten its vulnerability to disasters. Comparing TRMM satellite rainfall with gauge
data reveals good agreement, while highlighting the need for enhanced satellite retrieval
algorithms that incorporate local factors and topography [5]. Another study proposes a
Convolutional Neural Network (CNN) for timely landslide detection. Using features such as
vegetation index, temperature, and precipitation, the research covers model architecture,
feature processing, performance evaluation, and future improvements [6].

Researchers have access to a dataset comprising more than 7,600 disaster-news articles
covering COVID-19, storms, floods, and natural disasters. The resulting datasets aid in
sentence classification, summarization, and the identification of event details, with Random
Forest classification performing successfully [7]. Rising cloudburst occurrences in the
southern region of the Himalayas due to warmer climates require detailed investigation. An
analysis of the 2012 Uttarkashi cloudburst using the high-resolution IMDAA and ERA5 datasets
reveals that IMDAA represents the relevant variables and mechanisms better [8]. Another study
explored constructing a Splash cloudburst-disaster model, relying on rainfall intensity,
individual property values, and municipal gauge network data in
Jönköping. Results suggested potential for simplified, reliable cloudburst catastrophe models
in various urban contexts [9]. Uttarakhand's Upper Ganga Basin faces flash flood risks from
extreme rainfall and cloudbursts. An analysis of 57 events identifies vulnerable areas,
notably during the July-August monsoon, affecting densely populated regions [10]. Kerala faced
devastating floods in 2018 and 2019. The 2019 event, which displayed an unusual convective
nature fueled by warm sea temperatures, met mesoscale cloudburst criteria that are uncommon
for the region. Such events, influenced by global warming, may endanger Western Ghats
ecosystems [11]. Climate-induced storms challenge urban stormwater systems. Passive urban
blue-green infrastructure struggles during cloudbursts, causing downstream flood risks.
Research in Tongzhou, Beijing, proposes adaptive real-time control (RTC), which reduces peak
outflow and offers a scalable solution for urban sustainability [12].

III. ARCHITECTURE DIAGRAM

Fig. 1. "Cloudburst Prediction"

IV. PROPOSED METHODOLOGY

A. Feature Selection and Engineering: Identify the most relevant features that may contribute
   to cloudburst prediction. Additionally, create new features derived from existing ones
   that could improve the model's performance.

B. Model Selection: Experiment with different machine learning models suitable for the
   prediction task. Models such as Random Forest, Gradient Boosting, Support Vector Machines
   (SVM), or neural networks might be considered. Ensemble methods or hybrid models can also
   be tested for better accuracy.

C. Model Training: Split the data into training and testing sets. Train the selected models
   using the training data and validate their performance using the testing data. Techniques
   like cross-validation may be applied to detect overfitting and obtain more reliable
   performance estimates.

D. Hyperparameter Tuning: Fine-tune the models by adjusting their hyperparameters to improve
   performance. This process involves optimizing parameters that affect the model's learning.

E. Deployment and Integration: Once a satisfactory model is developed, integrate it into a
   system that can continuously receive and analyze real-time data to predict the likelihood
   of a cloudburst.

F. Monitoring and Updates: Continuous monitoring of the model's performance is crucial.
   Additionally, the model may need periodic updates to adapt to changing weather patterns
   and improve prediction accuracy.

G. Validation and Improvement: Periodically validate the model's predictions against real
   occurrences of cloudbursts and continuously work on improving the model's accuracy.

The success of the project relies heavily on the quality and quantity of data available for
training, the choice of features, and the robustness of the chosen machine learning model.
Moreover, collaboration with domain experts in meteorology can significantly enhance the
accuracy and relevance of the predictions.

The Convolutional Neural Network (CNN) is a type of deep neural network optimized for
processing structured grid-like data, such as the two-dimensional layouts typical of visual
imagery. This makes CNNs particularly effective for tasks involving photographs, videos, and
other visual media. Because they detect and interpret spatial hierarchies in images,
recognizing patterns from simple to complex, CNNs excel in areas where the identification of
objects, scenes, and activities in visual content is crucial. They consist of various layers,
including convolutional layers that filter inputs for useful information, pooling layers that
reduce data dimensionality, and fully connected layers that classify the extracted features
into outputs. This structure enables CNNs to focus on specific features without being
overwhelmed by data size or complexity, making them well suited to image recognition, video
analysis, and other automated tasks that require a sophisticated understanding of visual
contexts.

Explanation of CNN Algorithm:

CNNs have transformed computer vision by enabling machines to interpret visual information,
and they are also influential in fields such as natural language processing and medical image
analysis, where data often has a grid-like structure. These networks automatically learn
hierarchical representations, which makes them effective at extracting features and
recognizing patterns. The process involves several key layers: convolutional layers detect
features by applying filters to the input; pooling layers reduce dimensionality to simplify
the information; and fully connected layers classify these features into meaningful outputs.
This structured approach enables CNNs to process and analyze complex datasets efficiently
across various domains. Convolutional layers apply convolution operations to the input data
using filters (kernels), scanning the input with these filters to extract specific features.
The remaining components are:

1. Pooling layers: Pooling, such as max pooling or average pooling, reduces the
   dimensionality of the data, preserving the most important information and reducing
   computational requirements.
2. Activation functions: Typically, CNNs use activation functions such as ReLU (Rectified
   Linear Unit) to introduce non-linearity into the network, allowing it to learn complex
   patterns.
3. Fully connected layers: At the end of the network, fully connected layers combine the
   features extracted by earlier layers to make predictions or classifications.
4. Output: The final output should ideally provide actionable information for decision-makers
   or stakeholders to take preventive measures or plan responses in the case of an impending
   cloudburst.

Fig. 2. Graphical representation

In the proposed system, the CNN algorithm can be employed for various purposes:

1. Probability or Confidence Score: The CNN could output a probability or confidence score
   indicating the likelihood of a cloudburst occurrence within a certain timeframe. This
   could be a continuous value representing the estimated probability of a cloudburst event.
2. Binary Classification: The output might be a binary classification result indicating the
   presence or absence of an imminent cloudburst. For instance, the CNN could output a binary
   decision, "cloudburst likely" or "cloudburst not likely", based on the input data and the
   learned patterns.
3. Spatial or Temporal Prediction: Depending on the design, the system might generate a
   spatial map indicating regions with higher potential for a cloudburst event.
   Alternatively, it could forecast the temporal aspect, predicting the time window in which
   a cloudburst is more likely to occur.
4. Risk Assessment or Severity Level: The CNN could output an assessment of the severity or
   risk level associated with the potential cloudburst. It could indicate the potential
   impact or scale of the event based on learned features.
5. Visualization of Weather Patterns: The output might be visual representations, such as
   heat maps or other graphical representations, showing weather patterns or atmospheric
   conditions that contribute to a higher likelihood of a cloudburst.

In summary, these potential outputs depend heavily on the data available, the features
extracted, and the model's ability to learn and predict cloudburst occurrences based on the
identified patterns and relationships within the input data.

Fig. 3. Tight layout

Initially, the project showcases temperature and various weather conditions for a specific
location, serving as a foundational example of how the CNN algorithm can be employed to
predict cloudburst events. The displayed graph, which leverages this algorithm, focuses on
forecasting the occurrence of cloudbursts in a particular area. Specifically, this graph
highlights the safety conditions of the Northern Himalayas, providing critical insights into
potential severe weather patterns. By utilizing CNN, the model processes and analyzes
meteorological data to predict sudden, intense rainfall, thus aiding disaster preparedness
and risk management for the region.

Fig. 4. Temperature in Celsius of the given location
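
The layer types and output forms described in this section (convolution, ReLU activation, pooling, a fully connected layer, then a probability score, a binary decision, and a risk level) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the single-filter network, the function names, the 0.5 decision threshold, the 0.3/0.7 risk bands, and all numeric values are assumptions chosen only to make the data flow concrete.

```python
import math

def conv2d(image, kernel):
    """'Valid' 2-D convolution (cross-correlation, as in most CNN libraries)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def relu(fmap):
    """Element-wise ReLU: negative activations become zero."""
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Non-overlapping max pooling; incomplete trailing windows are dropped."""
    return [
        [
            max(fmap[i + a][j + b] for a in range(size) for b in range(size))
            for j in range(0, len(fmap[0]) - size + 1, size)
        ]
        for i in range(0, len(fmap) - size + 1, size)
    ]

def dense_sigmoid(features, weights, bias):
    """Fully connected layer with one output unit and a sigmoid activation."""
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def predict(image, kernel, weights, bias):
    """Convolution -> ReLU -> max pooling -> flatten -> dense -> probability."""
    pooled = max_pool(relu(conv2d(image, kernel)))
    features = [v for row in pooled for v in row]
    return dense_sigmoid(features, weights, bias)

def interpret(prob, threshold=0.5):
    """Map a probability to a binary label and an assumed three-level risk band."""
    label = "cloudburst likely" if prob >= threshold else "cloudburst not likely"
    if prob < 0.3:
        risk = "low"
    elif prob < 0.7:
        risk = "moderate"
    else:
        risk = "high"
    return label, risk

# Hypothetical 5x5 input image and untrained, hand-picked parameters.
image = [[0.1 * (r + c) for c in range(5)] for r in range(5)]
kernel = [[1.0, 0.0], [0.0, -1.0]]
weights = [0.5, -0.25, 0.75, 0.1]
probability = predict(image, kernel, weights, bias=0.0)
print(probability, interpret(probability))
```

In a trained model the kernel, weights, and bias would be learned by backpropagation rather than chosen by hand, and a real network would stack many filters and layers.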

Fig. 5. Air quality of the given location

Fig. 6. Moon phase for the given location

Fig. 7. Humidity of the given location

The project utilizes data such as Celsius temperature, humidity, UV index, moon illumination,
and wind speed to generate a graph. This graph, created through the application of the CNN
algorithm, illustrates the clustered components of the weather data. The CNN's capacity to
analyze and visualize complex datasets allows for a comprehensive depiction of these
variables in a coherent and interpretable format. By organizing and clustering the data, the
graph provides valuable insights into the various factors that might influence weather
conditions in a specific area, serving as a tool for better understanding and predicting
environmental and atmospheric conditions.

Predictive analysis for cloudburst events, which are characterized by sudden, intense
rainfall that can lead to severe flooding, is crucial for disaster readiness and mitigation.
To develop a predictive model utilizing Convolutional Neural Networks (CNNs) and Gramian
Angular Fields (GAF), one must begin by gathering high-resolution meteorological data,
including rainfall intensity, cloud coverage, temperature, and more. This data must then be
prepared by transforming it into GAF images. GAF is a technique that encodes time series data
into matrix form using polar coordinates, which helps preserve the correlation between
different times through the Gramian Angular Summation Field (GASF) and the relative timing of
events through the Gramian Angular Difference Field (GADF).

Fig. 8. No cloudburst occurs

Following data preparation, a tailored CNN architecture is developed, comprising multiple
layers designed to extract spatial hierarchies of features and reduce computational load.
This model is trained on historical, labeled data, optimized through techniques such as
backpropagation, and evaluated with metrics such as accuracy and precision. Once validated,
the model can be deployed in real-time systems to predict cloudbursts, which requires
continuous monitoring and periodic retraining with new data to maintain accuracy and adapt to
changing patterns. This approach not only leverages the CNN's pattern recognition
capabilities but also uses GAF's effective transformation of time-series data, providing a
robust framework for predictive analysis in meteorological contexts.

Fig. 9. Cloudburst occurs
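
The GAF transformation mentioned in the predictive-analysis discussion can be sketched in a few lines of pure Python. The sketch follows the commonly used formulation: the series is rescaled to [-1, 1], each value is mapped to an angle with arccos, and the image entries are cos(phi_i + phi_j) for GASF or sin(phi_i - phi_j) for GADF. The function names and the sample rainfall values are hypothetical, and nothing here is taken from the authors' code.

```python
import math

def rescale(series):
    """Min-max scale a series into [-1, 1] so that arccos is defined."""
    lo, hi = min(series), max(series)
    if hi == lo:
        # A constant series carries no shape information; map it to zeros.
        return [0.0 for _ in series]
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in series]

def gramian_angular_field(series, method="summation"):
    """Encode a 1-D time series of length n as an n x n GAF image.

    method="summation"  -> GASF, entries cos(phi_i + phi_j)
    method="difference" -> GADF, entries sin(phi_i - phi_j)
    """
    scaled = rescale(series)
    # Clamp before arccos to guard against floating-point overshoot.
    phi = [math.acos(max(-1.0, min(1.0, x))) for x in scaled]
    if method == "summation":
        return [[math.cos(a + b) for b in phi] for a in phi]
    if method == "difference":
        return [[math.sin(a - b) for b in phi] for a in phi]
    raise ValueError("method must be 'summation' or 'difference'")

# Hypothetical hourly rainfall intensities (mm/h) for one location.
rainfall = [0.0, 1.2, 0.4, 8.5, 22.0, 3.1]
gasf = gramian_angular_field(rainfall, "summation")
gadf = gramian_angular_field(rainfall, "difference")
print(len(gasf), len(gasf[0]))  # image side length equals the series length
```

Each resulting matrix can be treated as a single-channel image and passed to a CNN. The GASF is symmetric, while the GADF is antisymmetric with a zero diagonal, which is why the two are often stacked as separate channels.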

TABLE I – EXPERIMENT RESULTS BASED ON F1-SCORE, PRECISION, RECALL, SUPPORT

Algorithm             Category  Precision  Recall  F1-score  Support
CNN                   0         0.89       0.94    0.93      22717
                      1         0.78       0.58    0.66      6375
CatBoost              0         0.88       0.95    0.91      22717
                      1         0.75       0.56    0.64      6375
Random forest         0         0.89       0.91    0.90      22717
                      1         0.66       0.61    0.63      6375
Logistic regression   0         0.92       0.77    0.84      22717
                      1         0.48       0.76    0.59      6375
K-nearest neighbour   0         0.90       0.77    0.83      22717
                      1         0.46       0.71    0.56      6375
XGB classifier        0         0.88       0.94    0.91      22717
                      1         0.72       0.55    0.62      6375
Decision tree         0         0.87       0.83    0.85      22717
                      1         0.48       0.55    0.51      6375

TABLE II – ACCURACY OF DIFFERENT ALGORITHMS

Algorithm             Accuracy (%)
CNN                   86.42
CatBoost              86.18
Random forest         84.45
Logistic regression   76.90
K-nearest neighbour   75.53
Decision tree         76.97

0.85 in F1 score, and for category 1, scores of 0.48 in precision, 0.55 in recall, and 0.51
in F1 score. Lastly, K-nearest neighbours recorded an accuracy of 75.53%, with precision,
recall, and F1 scores of 0.90, 0.77, and 0.83 for category 0, and 0.46, 0.71, and 0.56 for
category 1.

VIII. CONCLUSION

The integration of these advanced techniques demonstrates the potential for more accurate and
spatially nuanced predictions. The ability of GAF to represent complex data and CNN's
proficiency in recognizing patterns within this data contribute to a robust predictive model.
While showing promise, continued research and development are crucial to refine these
systems, aiming for improved accuracy and reliability in cloudburst forecasting. The
integration of GAF and CNN marks a significant step forward in enhancing our capability to
anticipate and potentially mitigate the impact of these extreme weather events.

Implementing more sophisticated GAF-CNN models may improve accuracy by capturing complex
spatial patterns in atmospheric data, potentially leading to more reliable and timely
cloudburst forecasts. Additionally, exploring ensemble methods or incorporating real-time
data streams for more dynamic and adaptable predictions could further enhance the system's
capabilities.
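
The precision, recall, F1-score, and accuracy figures in Tables I and II follow standard definitions, sketched below for reference. The function names are illustrative, and the confusion counts in the first example are hypothetical; only the recall and support values in the final lines are taken from the CNN rows of Table I. Because overall accuracy is the support-weighted average of the per-category recalls, those rounded recalls give roughly 86.1%, which is in line with the 86.42% reported in Table II once the two-decimal rounding of the recall values is taken into account.

```python
def class_metrics(tp, fp, fn):
    """Precision, recall, and F1 for one category from its confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1

def accuracy_from_recalls(recalls, supports):
    """Overall accuracy as the support-weighted average of per-category recall.

    recall_k * support_k is the number of correctly classified samples of
    category k, so summing over categories gives all correct predictions.
    """
    correct = sum(r * s for r, s in zip(recalls, supports))
    return correct / sum(supports)

# Hypothetical confusion counts for a single category.
print(class_metrics(tp=80, fp=20, fn=40))

# Consistency check using the CNN rows of Table I (recall values are rounded).
cnn_accuracy = accuracy_from_recalls([0.94, 0.58], [22717, 6375])
print(f"{100 * cnn_accuracy:.2f}%")
```

The same check can be repeated for the other rows to spot transcription errors between the per-category table and the accuracy table.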
