1. Allamanis, M. and Sutton, C., 2014, November. Mining idioms from source code. In
Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of
Software Engineering (pp. 472-483).
2. Alon, U., Zilberstein, M., Levy, O. and Yahav, E., 2019. code2vec: Learning distributed
representations of code. Proceedings of the ACM on Programming Languages, 3(POPL),
pp. 1-29.
3. Alreshedy, K., Dharmaretnam, D., German, D.M., Srinivasan, V. and Gulliver, T.A.,
2018. SCC: automatic classification of code snippets. arXiv preprint arXiv:1809.07945.
4. Baquero, J.F., Camargo, J.E., Restrepo-Calle, F., Aponte, J.H. and González, F.A., 2017.
Predicting the programming language: Extracting knowledge from stack overflow posts.
In Advances in Computing: 12th Colombian Conference, CCC 2017, Cali, Colombia,
September 19-22, 2017, Proceedings 12 (pp. 199-210). Springer International Publishing.
5. Bengio, Y., Ducharme, R., Vincent, P., & Jauvin, C. (2003). A neural probabilistic
language model. Journal of Machine Learning Research, 3, 1137-1155.
6. Biau, G., & Scornet, E. (2016). A random forest guided tour. Test, 25, 197-227.
7. Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of
Machine Learning Research, 3(Jan), 993-1022.
8. Bojanowski, P., Grave, E., Joulin, A., & Mikolov, T. (2017). Enriching word vectors with
subword information. Transactions of the Association for Computational Linguistics, 5,
135-146.
9. Buratti, L., Pujar, S., Bornea, M., McCarley, S., Zheng, Y., Rossiello, G., Morari, A.,
Laredo, J., Thost, V., Zhuang, Y. and Domeniconi, G., 2020. Exploring software
naturalness through neural language models. arXiv preprint arXiv:2006.12641.
10. Causa, O., Abendschein, M., Luu, N., Soldani, E. and Soriolo, C., 2022. The post-
COVID-19 rise in labour shortages. OECD Economics Department Working Papers, No. 1721.
11. Chawla, N. V., Bowyer, K. W., Hall, L. O., & Kegelmeyer, W. P. (2002). SMOTE:
Synthetic minority over-sampling technique. Journal of Artificial Intelligence Research,
16, 321-357.
12. Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., & Kuksa, P. P.
(2011). Natural language processing (almost) from scratch. CoRR abs/1103.0398.