Search Results (8)

Search Parameters:
Keywords = selective surrogate ensembles

26 pages, 16930 KiB  
Article
A Forest Fire Prediction Model Based on Meteorological Factors and the Multi-Model Ensemble Method
by Seungcheol Choi, Minwoo Son, Changgyun Kim and Byungsik Kim
Forests 2024, 15(11), 1981; https://doi.org/10.3390/f15111981 - 9 Nov 2024
Viewed by 533
Abstract
More than half of South Korea’s land area is covered by forests, which significantly increases the potential for extensive damage in the event of a forest fire. The majority of forest fires in South Korea are caused by humans. Over the past decade, more than half of these types of fires occurred during the spring season. Although human activities are the primary cause of forest fires, the fact that they are concentrated in the spring underscores the strong association between forest fires and meteorological factors. When meteorological conditions favor the occurrence of forest fires, certain triggering factors can lead to their ignition more easily. The purpose of this study is to analyze the meteorological factors influencing forest fires and to develop a machine learning-based prediction model for forest fire occurrence, focusing on meteorological data. The study focuses on four regions within Gangwon province in South Korea, which have experienced substantial damage from forest fires. To construct the model, historical meteorological data were collected, surrogate variables were calculated, and a variable selection process was applied to identify relevant meteorological factors. Five machine learning models were then used to predict forest fire occurrence, and ensemble techniques were employed to enhance the model’s performance. The performance of the developed forest fire prediction model was then evaluated. The results indicate that the ensemble model outperformed the individual models, with a higher F1-score and a notable reduction in false positives. This suggests that the model developed in this study, when combined with meteorological forecast data, can potentially predict forest fire occurrence and provide insights into the expected severity of fires. This information could support decision-making for forest fire management, aiding in the development of more effective fire response plans.
(This article belongs to the Special Issue Forest Fires Prediction and Detection—2nd Edition)
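The abstract above does not say which ensemble technique was used, so the following is only a minimal sketch under the assumption of simple soft voting (averaging per-model probabilities). The toy probabilities, the 0.5 threshold, and the function names `soft_vote` and `f1_and_false_positives` are illustrative and not taken from the article. It shows how an averaged prediction can be scored by F1 and by false-positive count, the two quantities the abstract reports.

```python
from statistics import mean

def soft_vote(prob_lists, threshold=0.5):
    """Average the per-model fire probabilities for each sample, then threshold the mean."""
    return [int(mean(sample_probs) >= threshold) for sample_probs in zip(*prob_lists)]

def f1_and_false_positives(y_true, y_pred):
    """Return (F1-score, number of false positives) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return f1, fp

# Hypothetical toy data: 1 = fire occurred, 0 = no fire.
y_true = [1, 0, 1, 0, 0, 1]
model_probs = [
    [0.9, 0.6, 0.7, 0.2, 0.1, 0.4],  # model A
    [0.8, 0.3, 0.4, 0.6, 0.2, 0.7],  # model B
    [0.7, 0.2, 0.8, 0.3, 0.6, 0.6],  # model C
]

ensemble_pred = soft_vote(model_probs)
single_pred = [int(p >= 0.5) for p in model_probs[0]]

print("model A :", f1_and_false_positives(y_true, single_pred))
print("ensemble:", f1_and_false_positives(y_true, ensemble_pred))
```

On this toy data each individual model makes at least one error, while the averaged vote does not; real data will not be this clean.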

13 pages, 3660 KiB  
Article
A Novel Surrogate-Assisted Multi-Objective Well Control Parameter Optimization Method Based on Selective Ensembles
by Lian Wang, Rui Deng, Liang Zhang, Jianhua Qu, Hehua Wang, Liehui Zhang, Xing Zhao, Bing Xu, Xindong Lv and Caspar Daniel Adenutsi
Processes 2024, 12(10), 2140; https://doi.org/10.3390/pr12102140 - 1 Oct 2024
Viewed by 684
Abstract
Multi-objective optimization algorithms are crucial for addressing real-world problems, particularly with regard to optimizing well control parameters, which are often computationally expensive due to their reliance on numerical simulations. Surrogate-assisted models help to reduce this computational burden, but their effectiveness depends on the quality of the surrogates, which can be affected by candidate dimension and noise. This study proposes a novel surrogate-assisted multi-objective optimization framework (MOO-SESA) that combines selective ensemble support-vector regression with NSGA-II. The framework’s uniqueness lies in its adaptive selection of a diverse subset of surrogates, established prior to iteration, to enhance accuracy, robustness, and computational efficiency. To our knowledge, this is the first instance in which selective ensemble techniques with multi-objective optimization have been applied to reservoir well control problems. Through employing an ensemble strategy for improving the quality of the surrogate model, MOO-SESA demonstrated superior well control scenarios and faster convergence compared to traditional surrogate-assisted models when applied to the SPE10 and Egg reservoir models.
(This article belongs to the Special Issue Advances in Enhancing Unconventional Oil/Gas Recovery, 2nd Edition)
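The abstract names NSGA-II as the optimizer. As a rough illustration of the Pareto-dominance test that NSGA-II's non-dominated sorting is built on, the sketch below extracts only the first non-dominated front. It is not the MOO-SESA framework and not a full NSGA-II; the candidate objective values are made up, and both objectives are assumed to be minimized.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def first_front(points):
    """Return the points that no other point dominates (the first Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical objective vectors (objective 1, objective 2), both to be minimized.
candidates = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 5)]
print(first_front(candidates))
```

In a surrogate-assisted setting, the objective values fed to such a sort would come from the surrogate predictions rather than from the expensive simulator.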

20 pages, 564 KiB  
Article
Unsupervised Insurance Fraud Prediction Based on Anomaly Detector Ensembles
by Alexander Vosseler
Risks 2022, 10(7), 132; https://doi.org/10.3390/risks10070132 - 21 Jun 2022
Cited by 5 | Viewed by 3640
Abstract
The detection of anomalous data patterns is one of the most prominent machine learning use cases in industrial applications. Unfortunately, ground truth labels are very often unavailable, and it is therefore good practice to combine different unsupervised base learners in the hope of improving the overall predictive quality. One challenge here is to combine base learners that are both accurate and diverse; another is to enable model explainability. In this paper we present BHAD, a fast unsupervised Bayesian histogram anomaly detector, which scales linearly with the sample size and the number of attributes and is shown to have very competitive accuracy compared to other analyzed anomaly detectors. For the problem of model explainability in unsupervised outlier ensembles we introduce a generic model explanation approach using a supervised surrogate model. For the problem of ensemble construction we propose a greedy model selection approach using the mutual information of two score distributions as a similarity measure. Finally, we give a detailed description of a real fraud detection application from the corporate insurance domain using an outlier ensemble, share various feature engineering ideas, and discuss practical challenges.
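The abstract describes greedy model selection with mutual information between score distributions as the similarity measure, but gives no procedure. The sketch below is one plausible reading, stated as an assumption: scores are discretized into equal-width bins, the first detector seeds the ensemble, and each step adds the detector whose highest mutual information with any already selected detector is lowest. The detector names, the binning, and the seeding rule are all hypothetical.

```python
from collections import Counter
from math import log

def discretize(scores, bins=2):
    """Map raw anomaly scores to equal-width bin indices in 0..bins-1."""
    lo, hi = min(scores), max(scores)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 when all scores are equal
    return [min(int((s - lo) / width), bins - 1) for s in scores]

def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(c / n * log(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())

def greedy_select(score_table, k, bins=2):
    """Greedily pick up to k detectors that share little information with those already chosen."""
    binned = {name: discretize(s, bins) for name, s in score_table.items()}
    selected = [next(iter(binned))]  # assumption: seed with the first detector
    while len(selected) < min(k, len(binned)):
        rest = [m for m in binned if m not in selected]
        best = min(
            rest,
            key=lambda m: max(mutual_information(binned[m], binned[s]) for s in selected),
        )
        selected.append(best)
    return selected

# Hypothetical detector scores: "a_copy" is nearly redundant with "a"; "b" is not.
scores = {
    "a": [0.1, 0.2, 0.9, 0.8],
    "a_copy": [0.15, 0.25, 0.95, 0.85],
    "b": [0.1, 0.9, 0.2, 0.8],
}
print(greedy_select(scores, k=2))
```

Low mutual information is used here as a proxy for diversity; an actual selection rule would also need to account for the accuracy of each base learner, which this sketch ignores.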

22 pages, 3783 KiB  
Article
Interpretable Machine Learning Models for Malicious Domains Detection Using Explainable Artificial Intelligence (XAI)
by Nida Aslam, Irfan Ullah Khan, Samiha Mirza, Alanoud AlOwayed, Fatima M. Anis, Reef M. Aljuaid and Reham Baageel
Sustainability 2022, 14(12), 7375; https://doi.org/10.3390/su14127375 - 16 Jun 2022
Cited by 31 | Viewed by 5140
Abstract
With the expansion of the internet, a major threat has emerged involving the spread of malicious domains used by attackers to carry out illegal activities such as targeting governments, violating the privacy of organizations, and even manipulating everyday users. Therefore, detecting these harmful domains is necessary to combat the growing network attacks. Machine Learning (ML) models have shown significant outcomes towards the detection of malicious domains. However, the “black box” nature of complex ML models obstructs their wide-ranging acceptance in some fields. The emergence of Explainable Artificial Intelligence (XAI) has successfully incorporated interpretability and explicability into complex models. Furthermore, post hoc XAI models have enabled interpretability without affecting the performance of the models. This study aimed to propose an Explainable Artificial Intelligence (XAI) model to detect malicious domains on a recent dataset containing 45,000 samples of malicious and non-malicious domains. In the current study, several interpretable ML models, such as Decision Tree (DT) and Naïve Bayes (NB), and black box ensemble models, such as Random Forest (RF), Extreme Gradient Boosting (XGB), AdaBoost (AB), and Cat Boost (CB) algorithms, were first implemented, and XGB was found to outperform the other classifiers. Furthermore, the post hoc XAI global surrogate model (Shapley additive explanations) and local surrogate LIME were used to generate the explanation of the XGB prediction. Two sets of experiments were performed: initially the model was executed using a preprocessed dataset, and later with features selected using the Sequential Forward Feature selection algorithm. The results demonstrate that ML algorithms were able to distinguish benign and malicious domains with overall accuracy ranging from 0.8479 to 0.9856. The ensemble classifier XGB achieved the highest result, with an AUC and accuracy of 0.9991 and 0.9856, respectively, before the feature selection algorithm, while there was an AUC of 0.999 and accuracy of 0.9818 after the feature selection algorithm. The proposed model outperformed the benchmark study.

27 pages, 5765 KiB  
Article
Remote Sensing and Meteorological Data Fusion in Predicting Bushfire Severity: A Case Study from Victoria, Australia
by Saroj Kumar Sharma, Jagannath Aryal and Abbas Rajabifard
Remote Sens. 2022, 14(7), 1645; https://doi.org/10.3390/rs14071645 - 29 Mar 2022
Cited by 11 | Viewed by 5223
Abstract
The extent and severity of bushfires in a landscape are largely governed by meteorological conditions. An accurate understanding of the interactions of meteorological variables and fire behaviour in the landscape is very complex, yet possible. In exploring such understanding, we used 2693 high-confidence active fire points recorded by a Moderate Resolution Imaging Spectroradiometer (MODIS) sensor for nine different bushfires that occurred in Victoria between 1 January 2009 and 31 March 2009. These fires include the Black Saturday Bushfires of 7 February 2009, one of the worst bushfires in Australian history. For each fire point, 62 different meteorological parameters of bushfire time were extracted from Bureau of Meteorology Atmospheric high-resolution Regional Reanalysis for Australia (BARRA) data. These remote sensing and meteorological datasets were fused and further processed in assessing their relative importance using four different tree-based ensemble machine learning models, namely, Random Forest (RF), Fuzzy Forest (FF), Boosted Regression Tree (BRT), and Extreme Gradient Boosting (XGBoost). Google Earth Engine (GEE) and Landsat images were used in deriving the response variable, Relative Difference Normalised Burn Ratio (RdNBR), which was selected by comparing its performance against Difference Normalised Burn Ratio (dNBR). Our findings demonstrate that the FF algorithm utilising the Weighted Gene Coexpression Network Analysis (WGCNA) method has the best predictive performance of 96.50%, assessed against 10-fold cross-validation. The result shows that the relative influence of the variables on bushfire severity is in the following order: (1) soil moisture, (2) soil temperature, (3) air pressure, (4) air temperature, (5) vertical wind, and (6) relative humidity. This highlights the importance of soil meteorology in bushfire severity analysis, often excluded in bushfire severity research. Further, this study provides a scientific basis for choosing a subset of meteorological variables for bushfire severity prediction depending on their relative importance. The optimal subset of high-ranked variables is extremely useful in constructing simplified and computationally efficient surrogate models, which can be particularly useful for the rapid assessment of bushfire severity for operational bushfire management and effective mitigation efforts.
(This article belongs to the Section Environmental Remote Sensing)

47 pages, 8936 KiB  
Review
A Powerful Paradigm for Cardiovascular Risk Stratification Using Multiclass, Multi-Label, and Ensemble-Based Machine Learning Paradigms: A Narrative Review
by Jasjit S. Suri, Mrinalini Bhagawati, Sudip Paul, Athanasios D. Protogerou, Petros P. Sfikakis, George D. Kitas, Narendra N. Khanna, Zoltan Ruzsa, Aditya M. Sharma, Sanjay Saxena, Gavino Faa, John R. Laird, Amer M. Johri, Manudeep K. Kalra, Kosmas I. Paraskevas and Luca Saba
Diagnostics 2022, 12(3), 722; https://doi.org/10.3390/diagnostics12030722 - 16 Mar 2022
Cited by 34 | Viewed by 4612
Abstract
Background and Motivation: Cardiovascular disease (CVD) causes the highest mortality globally. With escalating healthcare costs, early non-invasive CVD risk assessment is vital. Conventional methods have shown poor performance compared to more recent and fast-evolving Artificial Intelligence (AI) methods. The proposed study reviews the three most recent paradigms for CVD risk assessment, namely multiclass, multi-label, and ensemble-based methods in (i) office-based and (ii) stress-test laboratories. Methods: A total of 265 CVD-based studies were selected using the preferred reporting items for systematic reviews and meta-analyses (PRISMA) model. Due to their popularity and recent development, the study analyzed the above three paradigms using machine learning (ML) frameworks. We comprehensively review these three methods using attributes such as architecture, applications, pros and cons, scientific validation, clinical evaluation, and AI risk-of-bias (RoB) in the CVD framework. These ML techniques were then extended under mobile and cloud-based infrastructure. Findings: The most popular biomarkers used were office-based, laboratory-based, image-based phenotypes, and medication usage. Surrogate carotid scanning for coronary artery risk prediction had shown promising results. Ground truth (GT) selection for AI-based training along with scientific and clinical validation is very important for CVD stratification to avoid RoB. It was observed that the most popular classification paradigm is multiclass, followed by ensemble and multi-label. The use of deep learning techniques in CVD risk stratification is in a very early stage of development. Mobile and cloud-based AI technologies are more likely to be the future. Conclusions: AI-based methods for CVD risk assessment are most promising and successful. Choice of GT is most vital in AI-based models to prevent RoB. The amalgamation of image-based strategies with conventional risk factors provides the highest stability when using the three CVD paradigms in non-cloud and cloud-based frameworks.
(This article belongs to the Special Issue Lesion Detection and Analysis Using Artificial Intelligence)

19 pages, 2195 KiB  
Article
Selecting the Best Quantity and Variety of Surrogates for an Ensemble Model
by Pengcheng Ye and Guang Pan
Mathematics 2020, 8(10), 1721; https://doi.org/10.3390/math8101721 - 7 Oct 2020
Cited by 6 | Viewed by 1935
Abstract
Surrogate modeling techniques are widely used to replace the computationally expensive black-box functions in engineering. As a combination of individual surrogate models, an ensemble of surrogates is preferred due to its strong robustness. However, how to select the best quantity and variety of surrogates for an ensemble has always been a challenging task. In this work, five popular surrogate modeling techniques including polynomial response surface (PRS), radial basis functions (RBF), kriging (KRG), Gaussian process (GP) and linear Shepard (SHEP) are considered as the basic surrogate models, resulting in twenty-six ensemble models by using a previously presented weights selection method. The best ensemble model is expected to be found by comparative studies on prediction accuracy and robustness. By testing eight mathematical problems and two engineering examples, we found that: (1) in general, using as many accurate surrogates as possible to construct ensemble models will improve the prediction performance and (2) ensemble models can be used as insurance rather than offering significant improvements. Moreover, the ensemble of three surrogates PRS, RBF and KRG is preferred based on the prediction performance. The results provide engineering practitioners with guidance on the superior choice of the quantity and variety of surrogates for an ensemble.
(This article belongs to the Section Mathematics and Computer Science)
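The abstract refers to "a previously presented weights selection method" without describing it. As a stand-in, and explicitly as an assumption, the sketch below uses inverse-error weighting, a common simple scheme in which each surrogate's weight is proportional to the reciprocal of its validation error and the weights sum to one. The three toy surrogates and their error values are invented for illustration.

```python
def inverse_error_weights(errors):
    """Weights proportional to 1/error, normalized to sum to 1."""
    inverse = [1.0 / e for e in errors]
    total = sum(inverse)
    return [v / total for v in inverse]

def ensemble_predict(surrogates, weights, x):
    """Weighted sum of the individual surrogate predictions at x."""
    return sum(w * f(x) for w, f in zip(weights, surrogates))

# Hypothetical surrogates standing in for, e.g., PRS, RBF and KRG models.
surrogates = [lambda x: 2 * x, lambda x: x * x, lambda x: 2 * x + 1]
validation_errors = [0.5, 1.0, 1.0]  # made-up cross-validation errors

weights = inverse_error_weights(validation_errors)
print(weights)
print(ensemble_predict(surrogates, weights, 2.0))
```

With this scheme the most accurate surrogate gets the largest weight but the others still contribute, which is one way to read the abstract's remark that ensembles act as insurance against a poor single choice.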

18 pages, 1701 KiB  
Article
An Effective Surrogate Ensemble Modeling Method for Satellite Coverage Traffic Volume Prediction
by Siyu Ye, Yi Zhang, Wen Yao, Quan Chen and Xiaoqian Chen
Appl. Sci. 2019, 9(18), 3689; https://doi.org/10.3390/app9183689 - 5 Sep 2019
Cited by 5 | Viewed by 2336
Abstract
The satellite constellation network is a powerful tool to provide ground traffic business services for continuous global coverage. For the resource-limited satellite network, it is necessary to predict satellite coverage traffic volume (SCTV) in advance to properly allocate onboard resources for better task fulfillment. Traditionally, a global SCTV distribution data table is first statistically constructed on the ground according to historical data and uploaded to the satellite. Then SCTV is predicted onboard by a data table lookup. However, the cost of transmitting and storing such large data is prohibitive for satellites. To solve these problems, this paper proposes to distill the data into a surrogate model to be uploaded to the satellite, which can both save the valuable communication link resource and improve the SCTV prediction accuracy compared to the table lookup. An effective surrogate ensemble modeling method is proposed in this paper for better prediction. First, according to prior geographical knowledge of the SCTV distribution, the global earth surface domain is split into multiple sub-domains. Second, on each sub-domain, multiple candidate surrogates are built. To fully exploit these surrogates and combine them into a more accurate ensemble, a partial weighted aggregation method (PWTA) is developed. For each sub-domain, PWTA adaptively selects the candidate surrogates with higher accuracy as the contributing models, based on which the ultimate ensemble is constructed for each sub-domain SCTV prediction. The proposed method is demonstrated and verified on an air traffic SCTV engineering problem. The results demonstrate the effectiveness of PWTA regarding good local and global prediction accuracy and modeling robustness.
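The select-then-aggregate idea described above (keep only the more accurate candidates in each sub-domain, then weight them) can be sketched roughly as follows. This is not the paper's PWTA: the cutoff rule (keep surrogates whose error is at most the sub-domain mean), the inverse-error weights, the sub-domain names, and the error values are all assumptions made for illustration.

```python
def select_and_weight(errors):
    """Keep surrogates with error <= mean error, then weight the survivors by inverse error."""
    cutoff = sum(errors.values()) / len(errors)
    kept = {name: e for name, e in errors.items() if e <= cutoff}
    inverse_total = sum(1.0 / e for e in kept.values())
    return {name: (1.0 / e) / inverse_total for name, e in kept.items()}

# Hypothetical validation errors of three candidate surrogates in two sub-domains.
subdomain_errors = {
    "ocean": {"rbf": 0.2, "kriging": 0.2, "poly": 0.8},
    "land": {"rbf": 0.5, "kriging": 0.25, "poly": 0.25},
}

weights = {
    region: select_and_weight(errs) for region, errs in subdomain_errors.items()
}
for region, w in weights.items():
    print(region, w)
```

Because the selection runs per sub-domain, a surrogate dropped in one region can still contribute in another, which is the point of partitioning the domain first.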
