Compression of Time Series Classification Model MC-MHLF using Knowledge Distillation

A Gengyo, K Tamura - 2021 IEEE International Conference on Systems, Man, and Cybernetics, 2021 - ieeexplore.ieee.org
Classification of time series measured by sensor devices has attracted the attention of researchers and industrial practitioners of deep learning because of its practical applications, such as anomaly detection and state estimation. With the widespread use of deep learning, several deep learning models have been developed for time series classification, and these models achieve satisfactory performance compared with conventional machine learning. For practical use, actual systems should employ small deep learning models that can be embedded in edge devices; therefore, model compression and acceleration for deep neural networks are currently important requirements. Knowledge distillation is the primary approach for compressing deep neural networks. However, it faces limitations in terms of model size and overfitting when applied to time series classification with the Multi-Channel MACD-Histogram-based LSTM-FCN (MC-MHLF) model, a high-accuracy deep learning model for classifying time series. To address this problem, we propose a novel learning model for model compression using knowledge distillation. A comparison experiment on the UCR time series classification archive showed that the proposed learning model yields higher accuracy for a smaller model than knowledge distillation alone.
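The abstract does not describe the proposed learning model in detail, but the knowledge distillation it builds on is typically implemented as a temperature-scaled soft-target loss (Hinton-style distillation). The following is a minimal sketch in PyTorch under that assumption; the teacher/student names, temperature, and mixing weight alpha are illustrative stand-ins, not details taken from the paper.

```python
# A minimal sketch of soft-target knowledge distillation, assuming a
# standard Hinton-style setup. The MC-MHLF teacher and the small
# student model are hypothetical placeholders.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of (a) the KL divergence between the student's and
    teacher's temperature-softened class distributions and (b) the usual
    cross-entropy against the hard labels."""
    # Soften both distributions; the T^2 factor keeps the gradient
    # magnitude comparable across temperatures.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    kd = F.kl_div(log_soft_student, soft_teacher,
                  reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce

# Hypothetical usage inside a training step:
# teacher.eval()
# with torch.no_grad():
#     t_logits = teacher(batch)   # large teacher, e.g. MC-MHLF
# s_logits = student(batch)       # compact model for edge deployment
# loss = distillation_loss(s_logits, t_logits, targets)
# loss.backward()
```

The temperature controls how much of the teacher's inter-class similarity structure ("dark knowledge") the student sees; the paper's contribution concerns how to train the small student without the size and overfitting limitations this plain scheme exhibits on MC-MHLF.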