Machine Learning and Big Data in the Impact Literature. A Bibliometric Review with Scientific Mapping in Web of Science
Abstract
1. Introduction
2. Purpose of Study
3. Materials and Methods
3.1. Research Design
3.2. Procedure and Data Analysis
4. Results
4.1. Performance and Scientific Production
4.2. Structural and Thematic Development
4.3. Thematic Evolution of the Terms
4.4. Authors with a Higher Relevance Index
5. Discussion
6. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
Configuration | Values |
---|---|
Analysis unit | Author keywords, WoS keywords |
Frequency threshold | Keywords: P1 = (3), P2 = (2), P3 = (3), P4 = (6), P5 = (6); Authors: PX = (4) |
Network type | Co-occurrence |
Co-occurrence union value threshold | Keywords: P1 = (2), P2 = (2), P3 = (2), P4 = (2), P5 = (2); Authors: PX = (3) |
Normalization measure | Equivalence index |
Clustering algorithm | Maximum size: 9; Minimum size: 3 |
Evolutionary measure | Jaccard index |
Overlapping measure | Inclusion rate |
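
The normalization, evolution, and overlap measures named in the configuration can be illustrated on a toy corpus. The sketch below assumes the definitions commonly used in science mapping: equivalence index e_ij = c_ij^2 / (c_i * c_j), where c_ij is the number of documents in which keywords i and j co-occur and c_i, c_j are their individual document counts; Jaccard index |A ∩ B| / |A ∪ B|; and inclusion index |A ∩ B| / min(|A|, |B|). The function names and the sample keyword lists are hypothetical and are not taken from the study or from any specific software.

```python
from collections import Counter
from itertools import combinations


def equivalence_index(docs):
    """Return {(kw_i, kw_j): e_ij} with e_ij = c_ij**2 / (c_i * c_j)."""
    occurrences = Counter()     # c_i: number of documents containing keyword i
    co_occurrences = Counter()  # c_ij: number of documents containing both i and j
    for keywords in docs:
        unique = sorted(set(keywords))  # count each keyword once per document
        occurrences.update(unique)
        co_occurrences.update(combinations(unique, 2))
    return {
        pair: c_ij ** 2 / (occurrences[pair[0]] * occurrences[pair[1]])
        for pair, c_ij in co_occurrences.items()
    }


def jaccard_index(a, b):
    """Shared items divided by all distinct items of the two sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def inclusion_index(a, b):
    """Shared items divided by the size of the smaller set."""
    a, b = set(a), set(b)
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0


# Hypothetical keyword lists, one list per document (illustration only).
docs = [
    ["machine-learning", "big-data", "hadoop"],
    ["machine-learning", "big-data"],
    ["machine-learning", "prediction"],
    ["big-data", "hadoop"],
]
links = equivalence_index(docs)
```

In this toy corpus, "big-data" and "machine-learning" each appear in three documents and co-occur in two, so their link strength is 2^2 / (3 * 3) = 4/9. Keyword pairs are stored as alphabetically sorted tuples, so `links[("big-data", "machine-learning")]` retrieves that value.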
Production by language (a) | n |
---|---|
English | 4187 |
Turkish | 16 |
German | 14 |
Spanish | 12 |
Production by research area (b) | n |
Computer Science Theory Methods | 1041 |
Computer Science Information Systems | 1010 |
Engineering Electrical Electronic | 1000 |
Computer Science Artificial Intelligence | 738 |
Production by document types (c) | n |
Article | 1934 |
Proceedings paper | 1856 |
Review | 356 |
Editorial Material | 114 |
Production by institution (d) | n |
University of California System | 122 |
Harvard University | 77 |
Chinese Academy of Sciences | 76 |
University of Texas System | 66 |
State University System of Florida | 63 |
Production by authors (e) | n |
Wang L. | 18 |
Wang Y. | 18 |
Zhang Y. | 15 |
Liu Y. | 14 |
Lee S. | 13 |
Zhang L. | 13 |
Li X. | 12 |
Liu H. | 12 |
Kim Y. | 11 |
Kumar A. | 11 |
Production by source (f) | n |
IEEE International Conference on Big Data | 165 |
IEEE Access | 94 |
Lecture Notes in Computer Science | 77 |
Procedia Computer Science | 36 |
Production by countries (g) | n |
United States | 1479 |
China | 729 |
India | 311 |
England | 302 |
Germany | 213 |

Reference | Citations |
---|---|
Kosinski, M.; Stillwell, D.; Graepel, T. Private traits and attributes are predictable from digital records of human behavior. Proceedings of the National Academy of Sciences of the United States of America 2013, 110, 5802-5805. doi: 10.1073/pnas.1218772110 | 607 |
Muja, M.; Lowe, D.G. Scalable Nearest Neighbor Algorithms for High Dimensional Data. IEEE Transactions on Pattern Analysis and Machine Intelligence 2014, 36, 2227-2240. doi: 10.1109/TPAMI.2014.2321376 | 455 |
Obermeyer, Z.; Emanuel, E.J. Predicting the Future - Big Data, Machine Learning, and Clinical Medicine. New England Journal of Medicine 2016, 375, 1216-1219. doi: 10.1056/NEJMp1606181 | 397 |
Chen, X.W.; Lin, X. Big Data Deep Learning: Challenges and Perspectives. IEEE Access 2014, 2, 514-525. doi: 10.1109/ACCESS.2014.2325029 | 272 |

Interval 2010-2015

Denomination | Works | h-index | g-index | hg-index | q2-index | Citations |
---|---|---|---|---|---|---|
Machine-learning | 165 | 22 | 46 | 31.81 | 31.81 | 2463 |
Algorithm | 14 | 9 | 14 | 11.22 | 13.42 | 318 |
Prediction | 17 | 7 | 15 | 10.25 | 10.91 | 272 |
Big-Data-Analytics | 18 | 7 | 13 | 9.54 | 15.2 | 300 |
Neural-Networks | 7 | 5 | 6 | 5.48 | 15 | 446 |
Hadoop | 8 | 5 | 5 | 5 | 15.49 | 229 |
Data-streams | 5 | 3 | 4 | 3.46 | 3.46 | 20 |
Sparse-Representation | 4 | 3 | 4 | 3.46 | 10.68 | 111 |
Sentiment-Analysis | 6 | 2 | 4 | 2.83 | 8.72 | 45 |
Support-Vector-Machine | 4 | 2 | 4 | 2.83 | 6.32 | 30 |

Interval 2016

Denomination | Works | h-index | g-index | hg-index | q2-index | Citations |
---|---|---|---|---|---|---|
Predictions | 31 | 16 | 27 | 20.78 | 21.54 | 766 |
Data-mining | 39 | 11 | 26 | 16.91 | 17.23 | 706 |
Classification | 22 | 9 | 19 | 13.08 | 17.75 | 397 |
Networks | 12 | 9 | 12 | 10.39 | 14.07 | 533 |
Neural-networks | 13 | 6 | 13 | 8.83 | 12.49 | 320 |
Model | 8 | 6 | 7 | 6.48 | 15.49 | 212 |
Natural-language-processing | 9 | 4 | 7 | 5.29 | 11.31 | 291 |
Mapreduce | 12 | 4 | 6 | 4.9 | 6.63 | 46 |
Dynamics | 4 | 4 | 4 | 4 | 12 | 97 |
Analytics | 5 | 4 | 5 | 4.47 | 9.8 | 105 |
Mass-spectrometry | 3 | 3 | 3 | 3 | 9.8 | 88 |
Smart-Meter | 3 | 3 | 3 | 3 | 3 | 46 |
Language | 4 | 1 | 3 | 1.73 | 5.66 | 34 |
Lung-Cancer | 2 | 2 | 2 | 2 | 11.92 | 88 |
Privacy | 4 | 2 | 2 | 2 | 16.37 | 241 |
ITS | 3 | 1 | 1 | 1 | 2.45 | 6 |
Harnessing-interference | 2 | 1 | 1 | 1 | 3.46 | 12 |

Interval 2017

Denomination | Works | h-index | g-index | hg-index | q2-index | Citations |
---|---|---|---|---|---|---|
Machine-learning | 155 | 22 | 43 | 30.76 | 31.11 | 2022 |
System | 16 | 9 | 16 | 12 | 13.42 | 377 |
Algorithm | 29 | 9 | 22 | 14.07 | 15.87 | 514 |
Support-Vector-Machine | 15 | 8 | 13 | 10.2 | 13.56 | 434 |
Social-Media | 16 | 8 | 12 | 9.8 | 10.58 | 167 |
Framework | 12 | 6 | 9 | 7.35 | 7.75 | 342 |
Mapreduce | 43 | 6 | 8 | 6.93 | 8.12 | 104 |
Surveillance | 5 | 3 | 3 | 3 | 4.24 | 21 |
Features | 4 | 3 | 3 | 3 | 6.48 | 239 |
Cloud | 6 | 3 | 4 | 3.46 | 8.31 | 58 |
Apache-Spark | 13 | 2 | 5 | 3.16 | 7.35 | 42 |
Analytics | 5 | 2 | 4 | 2.83 | 14.83 | 119 |
Prevention | 3 | 1 | 1 | 1 | 1.73 | 3 |
Text-mining | 4 | 1 | 2 | 1.41 | 3.46 | 13 |

Interval 2018

Denomination | Works | h-index | g-index | hg-index | q2-index | Citations |
---|---|---|---|---|---|---|
Machine-learning | 484 | 26 | 39 | 31.84 | 31.84 | 2895 |
Challenges | 22 | 10 | 17 | 13.04 | 17.32 | 378 |
Random-Forest | 28 | 9 | 19 | 13.08 | 15 | 385 |
Precision-Medicine | 23 | 8 | 14 | 10.58 | 12.33 | 213 |
Data-analytics | 22 | 7 | 16 | 10.58 | 12.69 | 267 |
Managements | 16 | 6 | 11 | 8.12 | 11.75 | 142 |
Mortality | 11 | 5 | 8 | 6.32 | 8.37 | 74 |
Diagnosis | 14 | 5 | 9 | 6.71 | 7.42 | 99 |
Networks | 8 | 4 | 5 | 4.47 | 10.39 | 83 |
Technology | 5 | 4 | 4 | 4 | 6 | 39 |
Sentiment-analysis | 24 | 3 | 7 | 4.58 | 6.24 | 63 |
Feature-Selection | 8 | 3 | 6 | 4.24 | 5.48 | 47 |
Spark | 6 | 2 | 3 | 2.45 | 5.1 | 19 |
Cyber-Security | 3 | 1 | 2 | 1.41 | 4.8 | 24 |

Interval 2019

Denomination | Works | h-index | g-index | hg-index | q2-index | Citations |
---|---|---|---|---|---|---|
Machine-learning | 519 | 13 | 19 | 15.72 | 16.52 | 948 |
Internet | 37 | 5 | 8 | 6.32 | 5 | 86 |
Artificial-neural-networks | 19 | 4 | 6 | 4.9 | 6.93 | 54 |
Social-media | 20 | 4 | 7 | 5.29 | 5.29 | 60 |
Framework | 21 | 3 | 3 | 3 | 3.46 | 26 |
Optimization | 23 | 3 | 5 | 3.87 | 3.87 | 35 |
Risk | 12 | 3 | 7 | 4.58 | 7.94 | 88 |
Recognition | 11 | 3 | 4 | 3.46 | 3 | 19 |
Feature-selection | 13 | 2 | 2 | 2 | 2.83 | 10 |
Networks | 15 | 2 | 2 | 2 | 2.83 | 7 |
Mapreduce | 31 | 2 | 5 | 3.16 | 6.48 | 36 |
Health | 11 | 2 | 3 | 2.45 | 3.16 | 15 |
Information | 9 | 2 | 3 | 2.45 | 3.74 | 10 |
Cancer | 13 | 1 | 2 | 1.41 | 2 | 8 |
Unsupervised-learning | 6 | 1 | 2 | 1.41 | 2 | 8 |
Big-Data-Applications | 5 | 1 | 2 | 1.41 | 2 | 6 |
Decision-Making | 4 | 1 | 1 | 1 | 4.9 | 24 |
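
The four impact indicators reported for each theme can be derived from a list of per-document citation counts. The sketch below assumes the usual definitions: h is the largest rank at which a document still has at least that many citations; g is the largest rank at which the top-ranked documents together have at least rank squared citations (capped here at the number of documents); hg = sqrt(h * g); and q2 = sqrt(h * m), where m is the median citation count of the h-core. The function name and the sample citation counts are hypothetical and do not come from the study's data.

```python
from math import sqrt
from statistics import median


def impact_indices(citations):
    """Compute h, g, hg and q2 from per-document citation counts."""
    ranked = sorted(citations, reverse=True)

    # h-index: number of ranks whose document has at least `rank` citations.
    h = sum(1 for rank, cites in enumerate(ranked, start=1) if cites >= rank)

    # g-index: largest rank whose cumulative citations reach rank**2.
    g, running_total = 0, 0
    for rank, cites in enumerate(ranked, start=1):
        running_total += cites
        if running_total >= rank * rank:
            g = rank

    # m: median citation count within the h-core (the top h documents).
    m = median(ranked[:h]) if h else 0

    return {"h": h, "g": g, "hg": sqrt(h * g), "q2": sqrt(h * m)}


# Hypothetical citation counts for five documents (illustration only).
result = impact_indices([10, 8, 5, 4, 3])
```

For the sample list, h = 4 and g = 5, so hg = sqrt(20) and q2 = sqrt(4 * 6.5) = sqrt(26). The same relation can be checked against the tables: the 2010-2015 Machine-learning row has h = 22 and g = 46, and sqrt(22 * 46) ≈ 31.81, which matches the reported hg value.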
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
López Belmonte, J.; Segura-Robles, A.; Moreno-Guerrero, A.-J.; Parra-González, M.E. Machine Learning and Big Data in the Impact Literature. A Bibliometric Review with Scientific Mapping in Web of Science. Symmetry 2020, 12, 495. https://doi.org/10.3390/sym12040495