The Temperature Prediction of Permanent Magnet Synchronous Machines Based on Proximal Policy Optimization

1 School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou 310023, China
2 School of Economics, Zhejiang University of Technology, Hangzhou 310014, China
* Author to whom correspondence should be addressed.
Information 2020, 11(11), 495; https://doi.org/10.3390/info11110495
Submission received: 11 September 2020 / Revised: 17 October 2020 / Accepted: 20 October 2020 / Published: 23 October 2020

Abstract:
Accurate temperature prediction plays an important role in the thermal protection of permanent magnet synchronous motors. A temperature prediction method for permanent magnet synchronous machines (PMSMs) based on proximal policy optimization (PPO) is proposed. In the proposed method, the Actor-Critic framework of reinforcement learning is introduced to model an effective temperature prediction mechanism, and the correlations between the input features are then analyzed to select appropriate input features. Finally, a simplified proximal policy optimization algorithm is introduced to optimize the predicted temperature values of PMSMs. Experimental results reveal the high accuracy and reliability of the proposed method compared with an exponentially weighted moving average (EWMA) method, a recurrent neural network (RNN), and a long short-term memory (LSTM) network.

1. Introduction

Temperature prediction of permanent magnet synchronous machines (PMSMs) has been a research focus in the field of motor protection. In recent years, researchers have made many attempts to predict the temperature of PMSMs [1], since temperature strongly affects the safe operation of PMSMs. Most researchers have focused on thermal models of the motor. For example, a temperature equivalent model based on hardware-in-the-loop (HIL) simulation was proposed to predict the motor temperature effectively [2], but this method requires high computational complexity. An equivalent thermal transfer model with two heat nodes for a permanent magnet synchronous motor was also proposed [3]. The thermal effect of the current and stator frequency was considered, and the predicted results verified the rationality of this transfer model. Mohamed et al. [4] constructed a Lumped Parameter Thermal Network (LPTN) to calculate important component temperatures inside PMSMs. The air temperature between permanent magnets was considered in this model; however, the computational complexity of the model is high. Wallscheid et al. [5] proposed a dynamic measurement method by introducing a magnetic flux observer into the time-dependent dispersion model of PMSMs. However, this approach is not universal because it is strongly correlated with machine speed. Wallscheid et al. [6] examined the prediction performance of flux observers in PMSMs, and the results illustrated that the worst-case Euclidean norm is less than 10 K. Lan et al. [7] established a thermal network with 38 nodes by analyzing the temperature fields of PMSMs, which accurately described the temperature values of each component inside the motor; however, the identification of overheated spots was not optimized. Sciascera et al. [8] built a variable heat model of an LPTN to improve the prediction accuracy of the traditional LPTN while keeping the computational complexity low. In addition, this model provides useful experience for fine-tuning model parameters. Liu et al. [9] investigated a signal injection method for estimating the temperature of the motor stator windings, but temperature estimation results under motor overload were not given. Du et al. [10] established a finite element model of the electromagnetic fields of the motor using finite element analysis. The model obtained the temperature distribution of the major components inside the motor under a rated working condition by calculating the motor loss and the coefficient of thermal conductivity. In summary, the above models aim to establish empirical formulas for motor temperature; however, the modeling process and the factors adopted depend on prior experience. In this work, temperature prediction is treated as a time series problem, and the temperature change of motor components can be fitted dynamically with additional degrees of freedom owing to the dynamic tuning capability of PPO-RL.
The development of artificial intelligence technology has shown great potential in the field of temperature prediction. Xu et al. [11] proposed a novel deep-learning-based indoor temperature prediction method for public buildings, which verified the prediction accuracy in the direction of indoor temperature change and revealed its disadvantage in the horizontal direction. Liu et al. [12] analyzed the time dependence of ocean temperatures at multiple depths and proposed a time-dependent ocean temperature prediction method, and the test results showed a better predictive performance than both support vector regression (SVR) and a multilayer perceptron regressor (MLPR). Wallscheid et al. [13] verified the feasibility of LSTM for temperature prediction. However, the introduction of memory blocks in the LSTM makes the topological relationships complex, thus increasing the computational complexity.
In order to provide an accurate prediction method, we propose an approach based on correlation analysis (CA) and proximal policy optimization (PPO) [14]. It selects the input features by correlation analysis and optimizes the model training process with the PPO algorithm. The remainder of the paper is organized as follows: The dataset and the correlation analysis process are described in Section 2. The rationale of the proposed method is presented in Section 3. The predictive model is validated and compared with other predictive networks in Section 4. Finally, conclusions are given in Section 5.

2. Dataset and Correlation Analysis

In order to improve the prediction accuracy, an effective data processing method is proposed in this paper. The specific process is shown in Figure 1.
The benchmark data are first sampled, and the correlation analysis is then conducted on the sampled data using the Pearson correlation coefficient (PCC) and the p-value. After the correlation analysis, the data features that are significantly negatively correlated with the predicted target are discarded. Meanwhile, some additional features, such as the voltage magnitude u_s, the current magnitude i_s, and the electric apparent power S_el, are added to the processed sampled data to enrich the dataset and improve the prediction accuracy.

2.1. Data Description

The benchmark data used in the experiment came from the Kaggle data science competition platform. The measurement and collection of the data were conducted by the University of Paderborn in Germany, and the benchmark data were normalized. The definitions of the parameters and symbols of the column labels in the benchmark data are shown in Table 1. ϑ_sy, ϑ_st, and ϑ_sw were chosen as the prediction targets for the experiment. The dataset contains 990,000 records collected in 52 measurement sessions, and each measurement session can be distinguished by S_id. All measurement records were obtained at a sampling frequency of 2 Hz on a test bench equipped with a three-phase permanent magnet synchronous motor.
The benchmark data involve the main thermal process of a PMSM. In our work, 30,000 samples were taken from the benchmark data at random. The experiment selected 300 test samples as the test set, and the rest of the samples were used as the training set.
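The sampling and splitting step described above can be sketched as follows (a minimal Python sketch; the file name and use of pandas follow the public Kaggle dataset and are assumptions rather than the authors' code):

```python
# Minimal sketch of the random sampling and train/test split described above.
import pandas as pd

df = pd.read_csv("pmsm_temperature_data.csv")   # normalized benchmark data (assumed file name)

# Draw 30,000 records at random from the ~990,000 available ones.
sample = df.sample(n=30_000, random_state=0)

# Hold out 300 records as the test set; the rest form the training set.
test_set = sample.iloc[:300]
train_set = sample.iloc[300:]

print(len(train_set), len(test_set))  # 29700 300
```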

2.2. Correlation Analysis

Equipment failures often occur during continuous data acquisition, which causes partial distortion in the benchmark data and introduces interference factors into the prediction. Therefore, Pearson correlation coefficient analysis [15] was adopted to observe the correlation between different features, and the p-value [16] was used to measure the significance level.
The general expression of the Pearson correlation coefficient is as follows:
r_{xy} = \frac{\mathrm{cov}(x, y)}{\sigma_x \sigma_y} = \frac{E[(x - \mu_x)(y - \mu_y)]}{\sigma_x \sigma_y}        (1)
where σ_x and σ_y are the standard deviations of the variables x and y, respectively, cov(x, y) is the covariance of the two variables, and μ_x and μ_y are the mean values of x and y, respectively. In general, if the covariance of x and y is larger than 0, the variables x and y are positively correlated. If the covariance of x and y is equal to 0, the variables x and y are uncorrelated. Otherwise, the variables x and y are negatively correlated.
The correlations between data features are discussed, and the significance level p-value is also evaluated. It is generally acknowledged that there is a significant difference between the two groups of data characteristics when the p-value is less than 0.05, and the difference is of particular significance when the p-value is less than 0.01.
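The correlation screening can be illustrated with the following sketch, which computes the Pearson correlation coefficient and p-value of each candidate feature against a prediction target (the column names are assumptions based on the public dataset, not taken from the authors' code):

```python
# Pearson correlation coefficient and p-value for each candidate feature
# against a prediction target (here the stator yoke temperature).
import pandas as pd
from scipy.stats import pearsonr

sample = pd.read_csv("pmsm_temperature_data.csv").sample(n=30_000, random_state=0)
target = sample["stator_yoke"]

for col in ["ambient", "coolant", "u_d", "u_q", "i_d", "i_q", "motor_speed", "torque"]:
    r, p = pearsonr(sample[col], target)
    flag = "significant" if p < 0.05 else "not significant"
    print(f"{col:12s} r = {r:+.3f}  p = {p:.3g}  ({flag})")
```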
In order to evaluate the correlation between the prediction targets and the benchmark data, the Pearson correlation values of each feature are analyzed through the thermal diagram of the sampled data in Figure 2. The correlation coefficients between ϑ_sy and the torque T_m, the current d-component i_d, and the current q-component i_q of the PMSM are all negative. The joint distribution density diagrams between ϑ_sy and the above three features are shown in Figure 3.
The correlation coefficients between ϑ_st and the voltage d-component u_d, the torque T_m, the current d-component i_d, and the current q-component i_q are also shown in Figure 2; their values are all less than 0. Figure 4 shows the joint distributions of ϑ_st with the above features to further illustrate the degree of correlation.
In the same way, it can be seen in Figure 2 that the correlation coefficients of ϑ_sw with the voltage d-component u_d and the current d-component i_d are negative, so there are negative correlations between these features. Meanwhile, the joint distribution density diagrams of the target feature ϑ_sw are shown in Figure 5.
On the basis of the sampled dataset, some additional feature quantities are considered in this paper: the voltage magnitude u_s computed from its dq-components, the current magnitude i_s computed from its dq-components, and the electric apparent power S_el. The specific calculations are defined as follows:
u_s = \sqrt{u_d^2 + u_q^2}        (2)
i_s = \sqrt{i_d^2 + i_q^2}        (3)
S_{el} = u_s \cdot i_s        (4)
where u_d and u_q are the d- and q-axis components of the voltage, i_d and i_q are the d- and q-axis components of the current, respectively, and · denotes the dot product operation.
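A minimal sketch of computing the three derived features of Equations (2)-(4) is given below (the column names are assumptions based on the public dataset):

```python
# Derived features from Equations (2)-(4).
import numpy as np
import pandas as pd

df = pd.read_csv("pmsm_temperature_data.csv")
df["u_s"] = np.sqrt(df["u_d"] ** 2 + df["u_q"] ** 2)    # voltage magnitude
df["i_s"] = np.sqrt(df["i_d"] ** 2 + df["i_q"] ** 2)    # current magnitude
df["S_el"] = df["u_s"] * df["i_s"]                      # electric apparent power
```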

3. The Proposed Method

3.1. Reinforcement Learning

In order to accurately predict the temperature of the main components of the PMSMs, the Actor-Critic framework of reinforcement learning (RL) [17] is introduced into the predictive network. The general structure of the Actor-Critic learning framework is shown in Figure 6.
The Actor network and the Critic network are the main parts of this framework. The interactive state values come from the recorded PMSM datasets in the environment, and the dynamic selection process of the state values is the basis of model training.
The target function of Actor training can be dynamically adjusted by the feedback function. Therefore, the feedback of the Critic network to the Actor network is particularly important in the prediction process. In addition, the Nadam algorithm is used in the gradient optimization process.

3.2. Proximal Policy Optimization

The PPO algorithm is one of the policy gradient methods for RL proposed by OpenAI in 2017, and it is often applied to the control of intelligent agents. The algorithm allows easy adjustment of hyper-parameters during the training of agents. In each iteration, it attempts to minimize the objective function and recalculates the new update strategy. The objective function of the PPO algorithm is defined by Formula (5):
L^{CLIP}(\theta) = \hat{E}_t\left[\min\left(r_t(\theta)\hat{A}_t,\ \mathrm{clip}(r_t(\theta), 1-\varepsilon, 1+\varepsilon)\hat{A}_t\right)\right]        (5)
where ε is a constant, and Â_t is the feedback of the Critic network. Furthermore, r_t(θ) is the ratio of the new strategy to the old strategy, and its calculation is given by Formula (6):
r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{old}}(a_t \mid s_t)}        (6)
where π_θ(a_t|s_t) is the updated new policy, π_{θ_old}(a_t|s_t) is the corresponding old policy, and a_t and s_t are the action and state values at time t, respectively.
As shown in Formula (5), the objective function L^{CLIP}(θ) includes two main parts: the first part is the product of the strategy ratio r_t(θ) and the feedback value Â_t; the second part is the product of the feedback value Â_t and r_t(θ) clipped to the interval [1 − ε, 1 + ε]. Finally, the minimum of the two parts is taken in Formula (5).
The definition of the strategy ratio r_t(θ) used in this work is given in Formula (7), where out_t represents the output value at time t and y_t represents the real output value. The output out_t is given by the Actor network, and the loss function of the Critic network is selected as the feedback Â_t. The strategy ratio r_t(θ) and Â_t are defined as follows:
r_t(\theta) = \frac{out_t}{y_t}        (7)
\hat{A}_t = \frac{1}{N} \sum_{t=1}^{N} (out_t - y_t)^2        (8)
where N denotes the number of all predicted values.
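The following sketch illustrates the simplified clipped objective of Equations (5), (7), and (8), with the ratio taken as out_t/y_t and the critic feedback taken as the mean squared error; the dummy values and the assumption of non-zero normalized targets are ours:

```python
# Sketch of the simplified clipped objective, Eqs. (5), (7), (8).
import numpy as np

def ppo_clip_loss(out, y, epsilon=0.02):
    """Return the L_CLIP value for predictions `out` and targets `y`."""
    r = out / y                              # simplified policy ratio, Eq. (7)
    a_hat = np.mean((out - y) ** 2)          # critic feedback (MSE), Eq. (8)
    unclipped = r * a_hat
    clipped = np.clip(r, 1.0 - epsilon, 1.0 + epsilon) * a_hat
    return np.mean(np.minimum(unclipped, clipped))   # Eq. (5)

# Example call with dummy (non-zero) normalized predictions and targets.
out = np.array([0.52, 0.61, 0.58])
y = np.array([0.50, 0.63, 0.55])
print(ppo_clip_loss(out, y))
```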

3.3. Model Construction and Prediction

The temperature prediction model of the PMSM is shown in Figure 7. The Actor network and the Critic network each include an input layer and an output layer, and h_i (i = 1, 2, ..., 5) denotes the hidden layers.
The definition methods of hidden layers in the model are as follows:
h_1 = \mathrm{relu}(x_t * w_1 + b_1)        (9)
h_i = \mathrm{relu}(h_{i-1} * w_i + b_i)        (10)
out_t = \mathrm{relu}(h_5 * w_{out} + b_{out})        (11)
where x_t is the input data matrix at time t, and * denotes element-wise multiplication. w_i, b_i, and h_i (i = 1, 2, ..., 5) represent the weight, the bias, and the output of each hidden layer, respectively. The corresponding weight and bias of the network output layer are w_out and b_out, respectively, and the final predicted value of the network at time t is out_t. Further, θ_old and θ represent the parameter vectors before and after the policy update, respectively.
After data processing and model construction, the loss objective function of the training model is determined by Formula (5). The Actor network and the Critic network share five hidden layers in this model, with 512, 256, 128, 64, and 32 neurons, respectively. Moreover, the ReLU function is used as the activation function in each hidden layer.
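A minimal numpy sketch of the forward pass of Equations (9)-(11) with the five shared hidden layers is shown below; the input dimension, the use of matrix products for the layer operations, and the scale of the normal initialization are illustrative assumptions:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

rng = np.random.default_rng(0)
layer_sizes = [10, 512, 256, 128, 64, 32, 1]   # 10 input features is an assumption

# Normal-distributed initial weights, as listed in Table 2; zero biases.
weights = [rng.normal(0.0, 0.05, size=(m, n))
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

def forward(x_t):
    """Compute out_t from an input feature vector x_t (Eqs. (9)-(11))."""
    h = x_t
    for w, b in zip(weights, biases):
        h = relu(h @ w + b)    # ReLU in every layer, including the output
    return h

print(forward(rng.normal(size=10)))
```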
The model takes input sequences with a step size of 5 as the input data. During iterative training, the training target L^{CLIP}(θ) is calculated according to the value of Â_t, and r_t(θ) is updated at each step.
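Our reading of the length-5 input sequence can be sketched as a sliding window over the feature matrix, where each window predicts the target at the next time step (dummy data, illustrative only):

```python
import numpy as np

def make_windows(features, targets, window=5):
    X, y = [], []
    for t in range(window, len(features)):
        X.append(features[t - window:t])   # last 5 time steps of features
        y.append(targets[t])               # temperature at the next moment
    return np.array(X), np.array(y)

features = np.random.rand(100, 10)   # 100 time steps, 10 features (dummy data)
targets = np.random.rand(100)
X, y = make_windows(features, targets)
print(X.shape, y.shape)   # (95, 5, 10) (95,)
```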
In order to accelerate the convergence of the objective function and reach the global minimum more quickly, the Nadam algorithm is used to optimize the training process. Compared with Adam, Nadam introduces the correction value ĝ_t of the gradient g_t at time t, as defined by Formula (12). The parameter update Δθ_t is then calculated by Formula (13). Finally, the predicted output values can be obtained from the trained model.
\hat{g}_t = \frac{g_t}{1 - \prod_{i=1}^{t} \mu_i}        (12)
\Delta\theta_t = \frac{\eta\, \bar{m}_t}{\sqrt{\hat{n}_t} + \xi}        (13)
Here, μ_i is the momentum factor of the first-moment estimate at time i, η is the learning rate of the Nadam algorithm, m̄_t is the combined first-moment estimate at time t, n̂_t is the bias-corrected second raw moment estimate of the gradient at time t, and ξ is a small positive number close to but not equal to zero.
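A direct numpy translation of Equations (12) and (13) is sketched below; the way the combined first-moment estimate m̄_t is assembled from the corrected gradient is an illustrative assumption following the standard Nadam algorithm, and all numeric values are dummies:

```python
import numpy as np

def corrected_gradient(g_t, mu_history):
    """Bias-corrected gradient g_hat_t of Eq. (12); mu_history holds mu_1..mu_t."""
    return g_t / (1.0 - np.prod(mu_history))

def nadam_delta(m_bar, n_hat, eta=2e-5, xi=1e-8):
    """Parameter update Delta_theta_t of Eq. (13)."""
    return eta * m_bar / (np.sqrt(n_hat) + xi)

# Dummy example: in the full Nadam algorithm, m_bar combines the corrected
# gradient with the running first-moment estimate (weights are illustrative).
g_hat = corrected_gradient(np.array([0.1, -0.2, 0.05]), mu_history=[0.9, 0.9])
m_bar = 0.1 * g_hat + 0.9 * np.array([0.08, -0.15, 0.04])
n_hat = np.array([0.01, 0.04, 0.0025])
theta = np.zeros(3) - nadam_delta(m_bar, n_hat)
print(theta)
```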

4. Experiments and Results Analysis

4.1. Experimental Environment and Parameter Definition

The experimental environment consisted of an Intel(R) Core(TM) i5-8250U 3.4 GHz quad-core processor with 16 GB of memory. The operating system was 64-bit Windows 10, the programming language was Python 3.7.5, and the deep learning framework was TensorFlow 1.13.1. The hyper-parameters considered during the experiments are listed in Table 2.
Apart from the parameters in the table that are self-explanatory, the hyper-parameters not specifically mentioned should be interpreted as follows: when initializing the weights of the prediction network, the simplest method is to assign random values from the interval [−1, 1]. In addition, more complex and efficient weight initialization schemes can be considered, such as a unit normal distribution or a uniform distribution.
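The initialization options mentioned above can be written as follows (the layer shape and random seed are illustrative assumptions):

```python
# Illustrative weight-initialization choices: plain random values in [-1, 1],
# a unit normal distribution, and a uniform distribution.
import numpy as np

rng = np.random.default_rng(0)
shape = (512, 256)                          # example layer shape

w_simple  = rng.uniform(-1.0, 1.0, shape)   # simplest scheme: random in [-1, 1]
w_normal  = rng.normal(0.0, 1.0, shape)     # unit normal distribution
w_uniform = rng.uniform(0.0, 1.0, shape)    # uniform distribution on [0, 1)
```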

4.2. Model Evaluation

The goal of this paper is to predict the temperature of the PMSM at the next moment. Therefore, the most effective evaluation metrics for the above PMSM temperature prediction are the root mean square error (RMSE) [18] and the mean absolute error (MAE) [19]. As shown in Equations (14) and (15), the RMSE and MAE are calculated as follows:
RMSE = \sqrt{\frac{1}{N} \sum_{j=0}^{N-1} (R_j - P_j)^2}        (14)
MAE = \frac{1}{N} \sum_{i=0}^{N-1} \left| R_i - P_i \right|        (15)
where R j represents the measured temperature of the target, P j represents the predicted temperature of the target, and N denotes the number of test data.
In order to comprehensively evaluate the prediction performance of the different methods, the Euclidean norm L_2 [20] and the worst-case error L_∞ [21] are introduced to measure the degree of approximation to the prediction target. The specific evaluation indexes are defined as follows:
L_2 = \sqrt{\sum_{j=0}^{N-1} (R_j - P_j)^2}        (16)
L_\infty = \max_i \sum_{j=0}^{N-1} \left| e_{ij} \right|        (17)
where R_j, P_j, and N represent the same elements as in the RMSE, and e_{ij} denotes the prediction error in row i and column j, so the inner sum is the sum of the absolute errors in row i.
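The four evaluation indexes of Equations (14)-(17) can be computed with the following numpy sketch (the measured and predicted values are dummies):

```python
import numpy as np

def rmse(R, P):
    return np.sqrt(np.mean((R - P) ** 2))     # Eq. (14)

def mae(R, P):
    return np.mean(np.abs(R - P))             # Eq. (15)

def l2_norm(R, P):
    return np.sqrt(np.sum((R - P) ** 2))      # Eq. (16)

def l_inf(E):
    """Worst-case error of an error matrix E: the maximum over rows of the
    summed absolute errors (Eq. (17))."""
    return np.max(np.sum(np.abs(E), axis=1))

R = np.array([0.50, 0.62, 0.55])   # measured temperatures (dummy values)
P = np.array([0.48, 0.60, 0.58])   # predicted temperatures (dummy values)
print(rmse(R, P), mae(R, P), l2_norm(R, P), l_inf((R - P).reshape(1, -1)))
```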

4.3. Experimental Results and Analysis

In order to evaluate the overall performance of the proposed method and the comparative methods on the sampled dataset, the trend prediction results for ϑ_sy, ϑ_st, and ϑ_sw are presented. As shown in Figure 8, Figure 9, and Figure 10, the prediction curves of the proposed method fit the real curves best over the prediction period. Although the curves of the LSTM and the RNN conform to the real target curves at first, they deviate considerably towards the end. Moreover, the fitted curves given by the EWMA method exhibit a large delay. In Figure 8, Figure 9, and Figure 10, the x-coordinates represent the prediction period of the test data, and the y-coordinates are the prediction targets.
The corresponding evaluation indicators for ϑ_sy, ϑ_st, and ϑ_sw are provided in Table 3, Table 5, and Table 7, respectively, including the RMSE, the MAE, the Euclidean norm L_2, and the infinity norm L_∞.
The quantitative evaluation indicators of the temperature prediction of ϑ_sy with the four prediction methods are given in Table 3. According to Table 3, the prediction errors of the model proposed in this paper are the lowest compared with the other three methods. In the optimal case, the RMSE value and the L_2 value of PPO-RL decreased by 0.1540 and 2.6624, respectively, compared with the LSTM network.
In order to compare the computational complexity of the four methods, the computing time of each method on the training set and the test set after 30 iterations is given. It can be seen in Table 4 that the computational complexity of PPO-RL is relatively high for ϑ_sy on the training set. By contrast, it shows a low complexity for ϑ_sy on the test set, which is 0.38 min lower than that of the LSTM.
The quantitative evaluation indicators of the temperature prediction of the stator tooth ϑ_st with the four prediction methods are given in Table 5. As shown, the PPO-RL method proposed in this paper achieves an excellent performance. The RMSE and MAE values of PPO-RL decreased by 0.0117 and 0.0424, respectively, and its Euclidean norm L_2 reached the minimum value.
As can be seen in Table 6, the LSTM has the lowest computational complexity for ϑ_st on the training set, while PPO-RL has the lowest computational complexity for ϑ_st on the test set compared with the LSTM, the RNN, and the EWMA.
The quantitative evaluation indicators of the temperature prediction of ϑ_sw with the four prediction methods are given in Table 7. It can be seen from the table that the PPO-RL model has a lower prediction error and can obtain a higher prediction accuracy. It is worth noting that the LSTM network has a lower error than the RNN network and the EWMA method in the prediction experiment for ϑ_st, whereas its errors in the prediction of ϑ_sy and ϑ_sw are high.
Table 8 shows the computing time analysis of the four methods on the training set and the test set for ϑ_sw. PPO-RL shows the lowest computational complexity on the test set for ϑ_sw, while the LSTM has the highest computational complexity on the test set, 0.79 min higher than that of PPO-RL.

5. Conclusions

This paper systematically reviews the research status and shortcomings of traditional thermal network and machine learning methods for PMSM temperature prediction. Based on the problems identified in the literature review, a temperature prediction method for PMSMs based on proximal policy optimization is proposed. This method achieves better performance by adjusting the network structure and minimizing the objective function of PPO.
The prediction performance of the proposed method and three other classical machine learning networks was explored to validate the applicability and validity of the method. The results further show that the performance of the LSTM neural network is uncertain with regard to the test samples, which increases the difficulty of finding the global optimum during training. In future research, the improvement of the real-time performance of this method should be considered.

Author Contributions

Y.C.: data reduction, project administration, and experimentation; C.Z. (Chenguang Zhang): methodology, validation, and writing—original draft preparation; G.C.: review and editing; Y.Z.: conceptualization and structuring; C.Z. (Cheng Zhao): translation and editing. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (grant numbers 61803337 and 61902349).

Acknowledgments

The authors would like to thank the editors and anonymous reviewers for their suggestions to improve the quality of the paper. We would also like to thank the Kaggle data science competition platform and the University of Paderborn in Germany for the dataset: https://www.kaggle.com/wkirgsn/electric-motor-temperature.

Conflicts of Interest

The authors declare that there is no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

1. Zhu, Y.; Xiao, M.; Lu, K.; Wu, Z.H.; Tao, B. A Simplified Thermal Model and Online Temperature Estimation Method of Permanent Magnet Synchronous Motors. Appl. Sci. 2019, 9, 3158.
2. Li, Y.Z.; Zhu, S.J.; Li, Y.H.; Lu, Q. Temperature Prediction and Thermal Boundary Simulation Using Hardware-in-Loop Method for Permanent Magnet Synchronous Motors. IEEE/ASME Trans. Mechatron. 2016, 21, 276–287.
3. Kral, C.; Haumer, A.; Lee, S.B. A Practical Thermal Model for the Estimation of Permanent Magnet and Stator Winding Temperatures. IEEE Trans. Power Electron. 2014, 29, 455–464.
4. Mohamed, A.H.; Hemeida, A.; Rashekh, A.; Vansompel, H.; Arkkio, A.; Sergeant, P. A 3D Dynamic Lumped Parameter Thermal Network of Air-Cooled YASA Axial Flux Permanent Magnet Synchronous Machine. Energies 2018, 11, 16.
5. Wallscheid, O.; Specht, A.; Böecker, J. Determination of Rotor Temperature for an Interior Permanent Magnet Synchronous Machine Using a Precise Flux Observer. In Proceedings of the 2014 International Power Electronics Conference (IPEC-Hiroshima 2014—ECCE ASIA), Hiroshima, Japan, 18–21 May 2014; pp. 1501–1507.
6. Wallscheid, O.; Kirchgäessner, W.; Böecker, J. Investigation of Long Short-Term Memory Networks to Temperature Prediction for Permanent Magnet Synchronous Motors. In Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), Anchorage, AK, USA, 14–19 May 2017; pp. 1940–1947.
7. Lan, Z.Y.; Wei, X.H.; Li, H.R.; Liao, K.L.; Chen, L.H. Thermal Analysis of PMSM Based on Lumped Parameter Thermal Network Method. J. Electr. Eng. 2017, 12, 13–16.
8. Sciascera, C.; Giangrande, P.; Papini, L.; Gerada, C.; Galea, M. Analytical Thermal Model for Fast Stator Winding Temperature Prediction. IEEE Trans. Ind. Electron. 2017, 64, 6116–6126.
9. Liu, P.; Wang, X.; Sun, Q.Z.; Huang, S.D.; Tu, C.M.; Yang, W.L. Signal injection strategy optimization of stator winding temperature estimation for permanent magnet synchronous motor. Electr. Mach. Control 2019, 23, 18–26.
10. Du, A.M.; Zhang, D.X.; Sun, M.M.; Yun, Z.Z. Research on Temperature Field of the Permanent Magnet Synchronous Motors for Hybrid Vehicles Cooled by Oil. Automob. Technol. 2019, 4, 34–39.
11. Xu, C.L.; Chen, H.X.; Wang, J.Y.; Guo, Y.B.; Yuan, Y. Improving prediction performance for indoor temperature in public buildings based on a novel deep learning method. Build. Environ. 2019, 148, 128–135.
12. Liu, J.; Zhang, T.; Han, G.J.; Gou, Y. TD-LSTM: Temporal Dependence-Based LSTM Networks for Marine Temperature Prediction. Sensors 2018, 18, 3797.
13. Wallscheid, O.; Specht, A.; Böecker, J. Observing the Permanent-Magnet Temperature of Synchronous Motors Based on Electrical Fundamental Wave Model Quantities. IEEE Trans. Ind. Electron. 2017, 64, 3921–3929.
14. Schulman, J.; Wolski, F.; Dhariwal, P.; Radford, A.; Klimov, O. Proximal Policy Optimization Algorithms. arXiv 2017, arXiv:1707.06347.
15. Berthold, M.; Höppner, F. On Clustering Time Series Using Euclidean Distance and Pearson Correlation. arXiv 2016, arXiv:1601.02213.
16. Wasserstein, R.L.; Lazar, N.A. The ASA’s Statement on p-values: Context, Process, and Purpose. Am. Stat. 2016, 70, 129–133.
17. Wang, Z.Y.; Bapst, V.; Heess, N.; Mnih, V.; Munos, R.; Kavukcuoglu, K.; Freitas, N.D. Sample Efficient Actor-Critic with Experience Replay. arXiv 2016, arXiv:1611.01224.
18. Hietaharju, P.; Ruusunen, M.; Leiviskä, K. A Dynamic Model for Indoor Temperature Prediction in Buildings. Energies 2018, 11, 1477.
19. Wu, D.; Wang, H.; Seidu, R. Smart data driven quality prediction for urban water source management. Future Gener. Comput. Syst. 2020, 107, 418–432.
20. Carlini, N.; Wagner, D. Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods. arXiv 2017, arXiv:1705.07263.
21. Carlini, N.; Wagner, D. Towards Evaluating the Robustness of Neural Networks. arXiv 2016, arXiv:1608.04644.
Figure 1. The process of data correlation analysis.
Figure 2. The feature thermal diagram of sampled data.
Figure 3. The kernel density estimation of ϑ_sy.
Figure 4. The kernel density estimation of ϑ_st.
Figure 5. The kernel density estimation of ϑ_sw.
Figure 6. The Actor-Critic learning framework.
Figure 7. A prediction model of motor temperature based on proximal policy optimization-reinforcement learning (PPO-RL).
Figure 8. Fit curve of stator yoke temperature.
Figure 9. Fit curve of stator tooth temperature.
Figure 10. Fit curve of stator winding temperature.
Table 1. The benchmark data column labels.

Parameter                        Symbol
ambient temperature              ϑ_a
coolant temperature              ϑ_c
voltage d-component              u_d
voltage q-component              u_q
current d-component              i_d
current q-component              i_q
motor speed                      n_mech
torque                           T_m
permanent magnet temperature     ϑ_pm
stator yoke temperature          ϑ_sy
stator tooth temperature         ϑ_st
stator winding temperature       ϑ_sw
unique id                        S_id
Table 2. Hyper-parameter sets of the optima found in experiments.

Hyper-parameter     RNN           LSTM            PPO-RL                     EWMA
hidden layers       2             3               5                          -
units per layer     [40, 20]      [100, 50, 1]    [512, 256, 128, 64, 32]    -
weight              normal        normal          normal                     -
optimizer           adam          rmsprop         nadam                      -
learn rate          0.01          0.01            [2·10⁻⁵, 1·10⁻⁵]           -
gaussian noise      1·10⁻³        1·10⁻³          1·10⁻³                     1·10⁻³
epsilon (ε)         -             1·10⁻⁶          0.02                       -
Table 3. Evaluation results for different predictive methods for ϑ_sy.

Method    RMSE      MAE       L_2       L_∞
RNN       0.1867    0.1480    3.2288    44.2432
LSTM      0.1732    0.1570    2.9943    46.9501
PPO-RL    0.2293    0.2145    3.9644    64.1299
EWMA      0.0753    0.0612    1.3020    18.3111
Table 4. Computing time analysis of train learning and test learning for ϑ_sy.

                        EWMA     RNN       LSTM      PPO-RL
train learning (min)    -        725.28    269.74    827.56
test learning (min)     0.32     0.41      0.62      0.24
Table 5. Evaluation results for different predictive methods for ϑ_st.

Method    RMSE      MAE       L_2       L_∞
RNN       0.3630    0.2843    6.2768    85.0010
LSTM      0.2674    0.2222    4.6240    66.4279
PPO-RL    0.1277    0.1097    2.2075    32.8097
EWMA      0.1160    0.0673    2.0057    20.1257
Table 6. Computing time analysis of train learning and test learning for ϑ_st.

                        EWMA     RNN       LSTM      PPO-RL
train learning (min)    -        779.93    248.00    830.47
test learning (min)     0.36     0.39      0.74      0.19
Table 7. Evaluation results for different predictive methods for ϑ_sw.

Method    RMSE      MAE       L_2       L_∞
RNN       0.2929    0.2348    5.0639    70.2100
LSTM      0.1293    0.1085    2.2364    32.4482
PPO-RL    0.1997    0.1545    3.4537    46.1984
EWMA      0.1006    0.0637    1.7404    19.0599
Table 8. Computing time analysis of train learning and test learning for ϑ_sw.

                        EWMA     RNN       LSTM      PPO-RL
train learning (min)    -        706.23    234.53    831.98
test learning (min)     0.32     0.25      1.00      0.21