An Approach towards Increasing Prediction Accuracy for the Recovery of Missing IoT Data based on the GRNN-SGTM Ensemble †
Abstract
1. Introduction
2. Related Works
- Lack of training procedures;
- The need to configure a single neural network parameter;
- Generalization properties are the highest among the known neural networks.
Like any neural network, GRNN has a number of disadvantages, including the following:
- Relatively low accuracy;
- Certain time delays in the application mode;
- No extrapolative properties.
- Based on the topology of two sequentially connected GRNN networks and an SGTM neural-like structure, a new ensemble method for solving the prediction problem is devised; introducing the SGTM neural-like structure into the ensemble improves the accuracy of the prediction results by replacing the plain summation of the outputs of the two GRNNs with a weighted summation with displacement;
- The optimal operating parameters of the developed ensemble, which provide the highest accuracy in solving the task, are selected by means of optimization;
- The effectiveness of the developed ensemble is substantiated by comparing its results with the latest existing methods for recovering missing data in a real sample collected by an IoT device.
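The idea in the first item of the list above, replacing a plain sum of two GRNN outputs with a weighted sum plus a displacement term, can be illustrated with a minimal sketch. Several things here are assumptions rather than the authors' method: the second GRNN is assumed to model the residuals of the first, the combiner is fitted with ordinary least squares as a stand-in for the SGTM neural-like structure, and the synthetic data, the function names `grnn` and `fit_combiner`, and the smoothing values are purely illustrative.

```python
import numpy as np

def grnn(X_sup, y_sup, X_query, sigma):
    """GRNN prediction (Gaussian-kernel weighted average) for each row of X_query."""
    # Squared Euclidean distances between every query row and every support row.
    d2 = ((X_query[:, None, :] - X_sup[None, :, :]) ** 2).sum(axis=2)
    # Shifting by the row minimum leaves the ratio unchanged and avoids underflow.
    w = np.exp(-(d2 - d2.min(axis=1, keepdims=True)) / (2.0 * sigma ** 2))
    return (w @ y_sup) / w.sum(axis=1)

def fit_combiner(y1, y2, y_true):
    """Least-squares fit of y ~= a*y1 + b*y2 + c (stand-in for the SGTM stage)."""
    A = np.column_stack([y1, y2, np.ones_like(y1)])
    coef, *_ = np.linalg.lstsq(A, y_true, rcond=None)
    return coef

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
y = np.sin(3.0 * X[:, 0]) + 0.5 * X[:, 1]
X_tr, y_tr, X_te, y_te = X[:150], y[:150], X[150:], y[150:]

sigma1, sigma2 = 0.3, 0.1  # illustrative smoothing parameters, not tuned

# First GRNN predicts the target; second GRNN (assumption) predicts its residuals.
y1_tr = grnn(X_tr, y_tr, X_tr, sigma1)
res_tr = y_tr - y1_tr
y2_tr = grnn(X_tr, res_tr, X_tr, sigma2)

# Weighted summation with displacement, fitted on the training outputs.
coef = fit_combiner(y1_tr, y2_tr, y_tr)

# Training-set comparison: plain sum versus weighted sum with displacement.
sse_plain_tr = float(np.sum((y_tr - (y1_tr + y2_tr)) ** 2))
sse_weighted_tr = float(np.sum((y_tr - (coef[0] * y1_tr + coef[1] * y2_tr + coef[2])) ** 2))

# Test-set predictions of both combination rules.
y1_te = grnn(X_tr, y_tr, X_te, sigma1)
y2_te = grnn(X_tr, res_tr, X_te, sigma2)
plain_sum_te = y1_te + y2_te
weighted_te = coef[0] * y1_te + coef[1] * y2_te + coef[2]
```

Because the plain sum corresponds to the special case a = 1, b = 1, c = 0, the fitted combiner cannot do worse than the plain sum on the training data in the least-squares sense; whether it also helps on unseen data depends on the data and on the smoothing parameters.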
3. Materials and Methods
3.1. Fundamental Statements of GRNN
- Search for Euclidean distances from the input vector with components to available vectors with known output values that are considered to be support ones [34]:
- Calculating the desired value according to a calculation formula of the GRNN method [34]:
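The two steps above can be sketched in a few lines of Python. The equations themselves are not reproduced in this text, so the sketch assumes the standard GRNN form (a Gaussian-kernel weighted average of the known outputs, as introduced by Specht); the function name `grnn_predict` and the parameter name `sigma` for the smoothing factor are illustrative choices.

```python
import numpy as np

def grnn_predict(X_support, y_support, x, sigma):
    """Predict one output value with a GRNN, assuming the standard Gaussian-kernel form."""
    X_support = np.asarray(X_support, dtype=float)
    y_support = np.asarray(y_support, dtype=float)
    x = np.asarray(x, dtype=float)
    # Step 1: squared Euclidean distances from the input vector to each support vector.
    d2 = np.sum((X_support - x) ** 2, axis=1)
    # Step 2: Gaussian weights; shifting by the minimum distance scales numerator
    # and denominator equally, so the result is unchanged but underflow is avoided.
    w = np.exp(-(d2 - d2.min()) / (2.0 * sigma ** 2))
    # Weighted average of the known outputs.
    return float(np.dot(w, y_support) / np.sum(w))
```

For example, `grnn_predict([[0.0], [1.0]], [0.0, 2.0], [0.5], 0.3)` weights both support vectors equally because they are equidistant from the input. The single parameter `sigma` controls how quickly the influence of a support vector decays with distance.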
3.2. Components of GRNN Output Generation Error
3.3. GRNN Ensemble Using Two ANNs
3.4. Linear SGTM Neural-Like Structure
3.5. Proposed GRNN-SGTM Ensemble
4. Modeling and Results
4.1. Data Descriptions
4.2. Performance Evaluation Indicators
- Root Mean Squared Error (RMSE):
- Mean Absolute Percentage Error (MAPE):
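Both indicators can be computed as in the following minimal sketch. It uses the common textbook definitions, which are assumed here because the formulas are not reproduced in this text; the function names `rmse` and `mape` are illustrative.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the mean squared difference."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mape(y_true, y_pred):
    """Mean Absolute Percentage Error in percent; undefined if any true value is zero."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(100.0 * np.mean(np.abs((y_true - y_pred) / y_true)))
```

Note that MAPE divides by the true values, so it is only meaningful when none of them is zero.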
4.3. Choice of Optimal Parameters of Ensemble
5. Comparison and Discussion
6. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
References
- Cuka, M.; Elmazi, D.; Matsuo, K.; Ikeda, M.; Barolli, L.; Takizawa, M. IoT Device Selection in Opportunistic Networks: A Fuzzy Approach Considering IoT Device Failure Rate. In Proceedings of the Advances in Internet, Data and Web Technologies; Barolli, L., Xhafa, F., Khan, Z.A., Odhabi, H., Eds.; Springer International Publishing: Cham, Switzerland, 2019; pp. 39–52.
- Casado-Vara, R.; Prieto-Castrillo, F.; Corchado, J.M. A game theory approach for cooperative control to improve data quality and false data detection in WSN. Int. J. Robust Nonlinear Control 2018, 28, 5087–5102.
- Mary, I.P.S.; Arockiam, L. Imputing the missing data in IoT based on the spatial and temporal correlation. In Proceedings of the 2017 IEEE International Conference on Current Trends in Advanced Computing (ICCTAC), Bangalore, India, 2–3 March 2017; pp. 1–4.
- Yan, X.; Xiong, W.; Hu, L.; Wang, F.; Zhao, K. Missing Value Imputation Based on Gaussian Mixture Model for the Internet of Things. Available online: https://www.hindawi.com/journals/mpe/2015/548605/ (accessed on 21 March 2020).
- Balakrishnan, S.M.; Sangaiah, A.K. Chapter 6 - Aspect Oriented Modeling of Missing Data Imputation for Internet of Things (IoT) Based Healthcare Infrastructure. In Computational Intelligence for Multimedia Big Data on the Cloud with Engineering Applications; Sangaiah, A.K., Sheng, M., Zhang, Z., Eds.; Intelligent Data-Centric Systems; Academic Press: Cambridge, MA, USA, 2018; pp. 135–145. ISBN 978-0-12-813314-9.
- Mary, I.P.S. Imputing the missing values in IoT using ESTCP model. Int. J. Adv. Res. Comput. Sci. 2017, 8.
- Azimi, I.; Pahikkala, T.; Rahmani, A.M.; Niela-Vilén, H.; Axelin, A.; Liljeberg, P. Missing data resilient decision-making for healthcare IoT through personalization: A case study on maternal health. Future Gener. Comput. Syst. 2019, 96, 297–308.
- IoT Analytics Challenges—Analytics for the Internet of Things (IoT); Packt Publishing Ltd.: Birmingham, UK, 2017; ISBN 978-1-78712-073-0.
- Lujic, I.; Maio, V.D.; Brandic, I. Adaptive Recovery of Incomplete Datasets for Edge Analytics. In Proceedings of the 2018 IEEE 2nd International Conference on Fog and Edge Computing (ICFEC), Washington, DC, USA, 1–3 May 2018; pp. 1–10.
- Lee, M.; An, J.; Lee, Y. Missing-Value Imputation of Continuous Missing Based on Deep Imputation Network Using Correlations among Multiple IoT Data Streams in a Smart Space. IEICE Trans. Inf. Syst. 2019, 102, 289–298.
- Ding, Z.; Mei, G.; Cuomo, S.; Li, Y.; Xu, N. Comparison of Estimating Missing Values in IoT Time Series Data Using Different Interpolation Algorithms. Int. J. Parallel Prog. 2018, 1–15.
- Aishwarya, G.; Latha, V. Data Recovery by Fountain Codes in IoT Networks. Int. J. Appl. Eng. Res. 2018, 13, 10419–10423.
- Marcelis, P.J.; Rao, V.S.; Prasad, R.V. DaRe: Data Recovery through Application Layer Coding for LoRaWAN. In Proceedings of the 2017 IEEE/ACM Second International Conference on Internet-of-Things Design and Implementation (IoTDI), Pittsburgh, PA, USA, 18–21 April 2017; pp. 97–108.
- Zhou, J.; Huang, Z. Recover Missing Sensor Data with Iterative Imputing Network. Available online: https://www.semanticscholar.org/paper/Recover-Missing-Sensor-Data-with-Iterative-Imputing-Zhou-Huang/59813bfb77cda27c2c510c2d5b3bbf23f105a293 (accessed on 31 March 2020).
- Abu-Elkheir, M.; Hayajneh, M.; Ali, N.A. Data Management for the Internet of Things: Design Primitives and Solution. Sensors 2013, 13, 15582–15612.
- Guzel, M.; Kok, I.; Akay, D.; Ozdemir, S. ANFIS and Deep Learning based missing sensor data prediction in IoT. Concurr. Comput. Pract. Exp. 2020, 32.
- Babichev, S. An Evaluation of the Information Technology of Gene Expression Profiles Processing Stability for Different Levels of Noise Components. Data 2018, 3, 48.
- Djeziri, M.A.; Benmoussa, S.; Benbouzid, M.E.H. Data-driven approach augmented in simulation for robust fault prognosis. Eng. Appl. Artif. Intell. 2019, 86, 154–164.
- Syerov, Y.; Shakhovska, N.; Fedushko, S. Method of the Data Adequacy Determination of Personal Medical Profiles. In Proceedings of the Advances in Intelligent Systems and Computing II; Hu, Z., Petoukhov, S.V., He, M., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp. 333–343.
- Korobiichuk, I.; Fedushko, S.; Juś, A.; Syerov, Y. Methods of Determining Information Support of Web Community User Personal Data Verification System. In Proceedings of the Automation 2017; Szewczyk, R., Zieliński, C., Kaliczyńska, M., Eds.; Springer International Publishing: Cham, Switzerland, 2017; pp. 144–150.
- Sharath, S.E.; Zamani, N.; Kougias, P.; Kim, S. Missing Data in Surgical Data Sets: A Review of Pertinent Issues and Solutions. J. Surg. Res. 2018, 232, 240–246.
- Ma, S.; Schreiner, P.J.; Seaquist, E.R.; Ugurbil, M.; Zmora, R.; Chow, L.S. Multiple predictively equivalent risk models for handling missing data at time of prediction: With an application in severe hypoglycemia risk prediction for type 2 diabetes. J. Biomed. Inform. 2020, 103, 103379.
- Beretta, L.; Santaniello, A. Nearest neighbor imputation algorithms: A critical evaluation. BMC Med. Inform. Decis. Mak. 2016, 16.
- Jonsson, P.; Wohlin, C. An evaluation of k-nearest neighbour imputation using Likert data. In Proceedings of the 10th International Symposium on Software Metrics, Chicago, IL, USA, 11–17 September 2004; pp. 108–118.
- Jadhav, A.; Pramod, D.; Ramanathan, K. Comparison of Performance of Data Imputation Methods for Numeric Dataset. Appl. Artif. Intell. 2019, 33, 913–933.
- Lee, J.Y.; Styczynski, M.P. NS-kNN: A modified k-nearest neighbors approach for imputing metabolomics data. Metabolomics 2018, 14, 153.
- Mary, I.P.S. Imputing the Missing Values in IoT using FRBIM. IJRTE 2019, 8, 3375–3380.
- Lai, X.; Liu, X.; Zhang, L.; Lin, C.; Obaidat, M.S.; Hsiao, K.-F. Missing Value Imputations by Rule-Based Incomplete Data Fuzzy Modeling. In Proceedings of the ICC 2019 - 2019 IEEE International Conference on Communications (ICC), Shanghai, China, 20–24 May 2019; pp. 1–6.
- Luengo, J.; Sáez, J.A.; Herrera, F. Missing data imputation for fuzzy rule-based classification systems. Soft Comput. 2012, 16, 863–881.
- Mishchuk, O.; Tkachenko, R.; Izonin, I. Missing Data Imputation Through SGTM Neural-Like Structure for Environmental Monitoring Tasks. In Proceedings of the Advances in Computer Science for Engineering and Education II; Hu, Z., Petoukhov, S., Dychka, I., He, M., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp. 142–151.
- Tkachenko, R.; Izonin, I. Model and Principles for the Implementation of Neural-Like Structures Based on Geometric Data Transformations. In Advances in Computer Science for Engineering and Education; Hu, Z., Petoukhov, S., Dychka, I., He, M., Eds.; Springer International Publishing: Cham, Switzerland, 2019; Volume 754, pp. 578–587. ISBN 978-3-319-91007-9.
- Izonin, I.; Tkachenko, R.; Kryvinska, N.; Zub, K.; Mishchuk, O.; Lisovych, T. Recovery of Incomplete IoT Sensed Data using High-Performance Extended-Input Neural-Like Structure. Procedia Comput. Sci. 2019, 160, 521–526.
- Ivakhnenko, A.G. Polynomial Theory of Complex Systems. IEEE Trans. Syst. Man Cybern. 1971, SMC-1, 364–378.
- Izonin, I.; Kryvinska, N.; Vitynskyi, P.; Tkachenko, R.; Zub, K. GRNN Approach Towards Missing Data Recovery Between IoT Systems. In Proceedings of the Advances in Intelligent Networking and Collaborative Systems; Barolli, L., Nishino, H., Miwa, H., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp. 445–453.
- Song, J.; Romero, C.E.; Yao, Z.; He, B. A globally enhanced general regression neural network for on-line multiple emissions prediction of utility boiler. Knowl. Based Syst. 2017, 118, 4–14.
- Izonin, I.; Kryvinska, N.; Tkachenko, R.; Zub, K.; Vitynskyi, P. An Extended-Input GRNN and its Application. Procedia Comput. Sci. 2019, 160, 578–583.
- Alomair, O.A.; Garrouch, A.A. A general regression neural network model offers reliable prediction of CO2 minimum miscibility pressure. J. Pet. Explor. Prod. Technol. 2016, 6, 351–365.
- Vagelis, P. Structural Seismic Design Optimization and Earthquake Engineering: Formulations and Applications; IGI Global: Hershey, PA, USA, 2012; ISBN 978-1-4666-1641-7.
- Huang, D.-S.; Irwin, G.W. Intelligent Computing in Signal Processing and Pattern Recognition: International Conference on Intelligent Computing, ICIC 2006, Kunming, China, August, 2006; Springer: New York, NY, USA, 2006; ISBN 978-3-540-37258-5.
- Bodyanskiy, Y.V.; Deineko, A.O.; Kutsenko, Y.V. On-line kernel clustering based on the general regression neural network and T. Kohonen’s self-organizing map. Aut. Control Comp. Sci. 2017, 51, 55–62.
- Duda, P.; Jaworski, M.; Rutkowski, L. Online GRNN-Based Ensembles for Regression on Evolving Data Streams. In Proceedings of the Advances in Neural Networks - ISNN 2018; Huang, T., Lv, J., Sun, C., Tuzikov, A.V., Eds.; Springer International Publishing: Cham, Switzerland, 2018; pp. 221–228.
- Zhou, J.; Peng, T.; Zhang, C.; Sun, N. Data Pre-Analysis and Ensemble of Various Artificial Neural Networks for Monthly Streamflow Forecasting. Water 2018, 10, 628.
- Vitynskiy, P.B.; Tkachenko, R.O.; Izonin, I.V. GRNN network ensemble for solving regression tasks with increased accuracy. Науковий вісник НЛТУ України 2019, 29, 120–124. (In Ukrainian)
- Specht, D.F. A general regression neural network. IEEE Trans. Neural Netw. 1991, 2, 568–576.
- Dronyuk, I.; Fedevych, O.; Poplavska, Z. The generalized shift operator and non-harmonic signal analysis. In Proceedings of the 2017 14th International Conference The Experience of Designing and Application of CAD Systems in Microelectronics (CADSM), Lviv, Ukraine, 21–25 February 2017; pp. 89–91.
- Nazarkevych, M.; Lotoshynska, N.; Klyujnyk, I.; Voznyi, Y.; Forostyna, S.; Maslanych, I. Complexity Evaluation of the Ateb-Gabor Filtration Algorithm in Biometric Security Systems. In Proceedings of the 2019 IEEE 2nd Ukraine Conference on Electrical and Computer Engineering (UKRCON), Lviv, Ukraine, 2–6 July 2019; pp. 961–964.
- De Vito, S.; Massera, E.; Piga, M.; Martinotto, L.; Di Francia, G. On field calibration of an electronic nose for benzene estimation in an urban pollution monitoring scenario. Sens. Actuators B Chem. 2008, 129, 750–757.
- Kotsovsky, V.; Geche, F.; Batyuk, A. On the Computational Complexity of Learning Bithreshold Neural Units and Networks. In Proceedings of the Lecture Notes in Computational Intelligence and Decision Making; Springer: Cham, Switzerland, 2019; pp. 189–202.
- Teslyuk, T.; Tsmots, I.; Teslyuk, V.; Medykovskyy, M.; Opotyak, Y. Architecture and Models for System-Level Computer-Aided Design of the Management System of Energy Efficiency of Technological Processes at the Enterprise. In Proceedings of the Advances in Intelligent Systems and Computing II; Springer: Cham, Switzerland, 2017; pp. 538–557.
Reasons | Investigations |
---|---|
the unstable network communication, synchronization problems, unreliable sensor devices, environmental factors, and other device malfunctions; | [3,4,5,6] |
the interruption of the data acquisition in long-term monitoring scenarios; | [7] |
the inconsistency of firmware across locations, which could mean differences in reporting frequency or in the formatting of values; | [8] |
the sensor failures, monitoring system failures or network failures; | [9] |
the storage errors, unreliable IoT devices, unstable network status; | [10] |
the incorrect response or nonresponse of the IoT-based sensors; | [11] |
the collision of the nodes when the information passes from sender to receiver; | [12] |
the channel effects and mobility of the end-devices; | [13] |
the errors in data collection and transmission; | [14] |
the data integration from different sources into a unified schema; | [15] |
the lack of battery power, communication errors, and malfunctioning devices. | [16] |
Variable | MEAN Value | MAX Value | MIN Value | Chemical Nomenclature |
---|---|---|---|---|
Tungsten monoxide | 817.0748 | 2683 | 322 | WO |
Tungsten dioxide | 1452.494 | 2775 | 551 | WO2 |
Titanium | 958.2302 | 2214 | 390 | Ti |
Temperature | 17.75942 | 44.6 | 0.1 | T |
Relative humidity | 48.90163 | 88.7 | 9.2 | RH |
Non-methane hydrocarbons | 1119.626 | 2040 | 647 | SnO2 |
Nitrogen monoxide | 250.465 | 1479 | 2 | NO |
Nitrogen dioxide | 113.7894 | 333 | 2 | NO2 |
Indium oxide | 1057.363 | 2523 | 221 | InO |
Carbon monoxide | 2.19059 | 11.9 | 0.1 | CO |
Benzene | 10.54635 | 63.7 | 0.2 | C6H6 |
Absolute humidity | 0.986315 | 2.2345 | 0.1847 | AH |
 | | MAPE, % | RMSE |
---|---|---|---|
0.23 | 0.05 | 20.268 (train mode) | 0.493 (train mode) |
 | | 18.828 (test mode) | 0.458 (test mode) |
Method | Parameters | RMSE | MAPE, % |
---|---|---|---|
GRNN [34] | input neurons = 11, . | 0.464 | 19.856 |
Extended-inputs GRNN [36] | input neurons = 78, . | 0.549 | 19.905 |
SGTM neural-like structure (test mode) [30] | input neurons = 11, hidden neurons = 11 (1 hidden layer). | 0.497 | 20.491 |
Extended-input SGTM neural-like structure (test mode) [32] | input neurons = 78, hidden neurons = 40 (1 hidden layer). | 0.458 | 19.911 |
GRNN-SGTM ensemble (test mode) | parameters are given above in the text | 0.458 | 18.828 |
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Tkachenko, R.; Izonin, I.; Kryvinska, N.; Dronyuk, I.; Zub, K. An Approach towards Increasing Prediction Accuracy for the Recovery of Missing IoT Data based on the GRNN-SGTM Ensemble. Sensors 2020, 20, 2625. https://doi.org/10.3390/s20092625