Rotor Angle Stability Prediction Using Temporal and Topological Embedding Deep Neural Network Based On Grid-Informed Adjacency Matrix
JOURNAL OF MODERN POWER SYSTEMS AND CLEAN ENERGY, VOL. XX, NO. XX, XX XXXX 1
Abstract: Rotor angle stability (RAS) prediction is critically essential for maintaining normal operation of the interconnected synchronous machines in power systems. The wide deployment of phasor measurement units (PMUs) promotes the development of data-driven methods for RAS prediction. This paper proposes a temporal and topological embedding deep neural network (TTEDNN) model to accurately and efficiently predict RAS by extracting the temporal and topological features from the PMU data. The grid-informed adjacency matrix incorporates the power grid structural and electrical parameter information. Both the small-signal RAS with disturbance on initial operating conditions and the transient RAS with short circuits on transmission lines are considered. Case studies of the IEEE 39-bus and IEEE 300-bus power systems are used to test the performance, scalability, and robustness against measurement uncertainties of the TTEDNN model. Results show that the TTEDNN model performs best among existing deep learning models. Furthermore, the superior transfer learning ability from small-signal RAS conditions to transient RAS conditions is demonstrated.

Index Terms: Rotor angle stability, topological embedding, deep learning, graph convolution network.

I. INTRODUCTION

[3], which refers to the ability of synchronous machines of an interconnected power system to remain in synchronism after being subjected to a disturbance [4]. With the increasing penetration of renewable generation in modern power systems, the RAS requires more accurate and faster prediction due to the emerging low inertia and stochastic characteristics. Currently, the prediction methods of RAS can be classified into two categories, i.e., the model-driven methods and the data-driven methods [5].

One of the commonly used model-driven methods is the laborious time-domain simulation (TDS) based on high-dimensional nonlinear differential-algebraic equations (DAEs) that express the dynamics of power systems [6]. TDS is time-consuming since it demands the whole state trajectories to reveal the system stability. Although different approaches have been proposed to accelerate the TDS process, such as parallel computing [6] and advanced hardware [7], huge computation resources are still required to handle the increasing complexity of power systems and diverse operational scenarios. Model-driven methods based on the Lyapunov function family provide an analytical approach to stability assessment in power systems [8]. Unfortunately, finding a Lyapunov func‐
predicting RAS, such as stacked denoising autoencoder (SDAE) [14] and long short-term memory (LSTM) network [12]. The prediction results from deep learning models can be used to accomplish further operational tasks such as preventive control [15]. However, since power systems are complex dynamical networks, the architectures of the above-mentioned deep learning models need proper interpretability with the spatial correlations of power systems. Therefore, effectively using the important topological information of power network structures in deep learning remains challenging.

The graph neural network (GNN) is a promising deep learning model to extract features of the spatial correlations of power systems since GNN can naturally map the power network structure into its neural network connections. As a member of the GNN family, the graph convolution network (GCN) [16] combines topological structure with the convolution algorithm and has proved to be extremely powerful for complex dynamical network analysis [17]. GCN demonstrates good classification and prediction capability with the graph-structured data in power systems [18]. For example, [19] developed an interpretable GCN to guide cascading failure search efficiently. Nevertheless, the GCN alone is not adept at capturing the sequential characteristics, i.e., the temporal information of time series of power system dynamics. Additional techniques are needed to extract features from the time domain of power system transient dynamics. For sequence modeling [20], the convolutional technique has been developed extensively in recent works and has outperformed the baseline of well-known recurrent network architectures for sequence modeling tasks [21]. As one of the convolution-based sequence modeling architectures, the temporal convolutional network (TCN) has been utilized for time-series predictions in power systems, demonstrating powerful memory ability [22].

Some related methods of GNN family based RAS prediction have been proposed in recent studies. Reference [23] introduced the graph attention network (GAT) for both RAS and short-term voltage instability prediction. Reference [24] proposed the multi-graph attention network with residual structure (ResGAT) for RAS assessment, which is adapted to the power system topology changes. A similar GCN architecture with a residual mechanism was designed to overcome the network degradation phenomenon during model training [25]. Later, an Attention-based Hierarchical Dynamic grAph Pooling nEtwork (AH-DAPE) was proposed to make the deep learning model more robust against system-scale changes [26]. A multi-task recurrent graph convolutional network (RGCN) combined with LSTM was introduced for stability classification as well as critical generator identification [27]. However, to the best of our knowledge, no existing works consider the whole categories of RAS prediction, i.e., they focused on the scenario for either small-signal RAS [25] or transient RAS [23], [24], [26], [27]. The comprehensive prediction performance of a GNN model on both small-signal and transient RAS is still unclear. Meanwhile, few related works discussed the model robustness against practical measurement uncertainty, i.e., the measurement noise and sampling cycle of PMUs.

In this paper, the temporal and topological embedding deep neural network (TTEDNN) model is proposed, combining GCN and TCN to capture the spatio-temporal features of transient dynamics in power systems for RAS prediction. Generally, the main contributions of this paper are as follows:

1) The TTEDNN model is proposed to predict RAS by the temporal and spatial features extracted from the post-disturbed transient dynamics. The grid-informed adjacency matrix is used to incorporate the power grid structural and electrical parameter information.

2) The robustness of the TTEDNN model against different levels of measurement noise and different PMU data cycles is illustrated.

In addition, the transfer learning capability of the TTEDNN model is investigated. It is found that the TTEDNN model trained with the small-signal perturbation dataset can be used as a pre-trained model for predicting the transient RAS.

The rest of this paper is organized as follows. Section II introduces the RAS of power systems. Section III proposes the architecture of the TTEDNN model. Case studies are given in Section IV. The concluding remarks are drawn in Section V.

II. RAS OF POWER SYSTEMS

In this section, the concept of RAS in a power system, the RAS assessment, and disturbances imposed for the study of RAS are described.

A. Concept of RAS in a Power System

Generally, the dynamics of a power system is governed by a set of differential and algebraic equations (DAEs), which can be expressed in the compact form as:

$$
\begin{cases}
\dot{x} = f(x, y, t) \\
0 = g(x, y, t)
\end{cases} \tag{1}
$$

where $x$ and $y$ denote the state and algebraic variables, respectively; $f(\cdot)$ denotes the dynamics of synchronous machines and control systems; and $g(\cdot)$ denotes the load flow of a power system. Given an initial condition of $x$ and $y$, the solution of (1) yields time-varying trajectories of the state variables $x$, i.e., the rotor angles and frequencies, and algebraic variables $y$, i.e., the bus voltages and active power injections. The RAS of a power system is concerned with the ability of the interconnected synchronous machines in a power system to remain in synchronism under normal operating conditions and to regain synchronism after being subjected to a small or large disturbance [28]. According to the nature of stability problems, the RAS can be classified into two subcategories: the small-signal RAS for small disturbances and the transient RAS for large disturbances. The small-signal RAS depends on the initial operating state of the system [4], i.e., the initial condition of $x$ in (1). The transient RAS is concerned with severe disturbances such as the $N-1$ contingency [4], i.e., short circuits on transmission lines, which can be reflected by the change of $y$ in (1).
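As a minimal illustration of how a trajectory of (1) reveals whether synchronism is kept, the sketch below integrates a classical single-machine infinite-bus swing equation. This reduced model, all parameter values, the function names `simulate_smib` and `stays_in_synchronism`, and the pole-slip criterion (rotor angle magnitude below π) are assumptions chosen for illustration only. They are not the multi-machine DAE model, the solver, or the stability index used in this paper.

```python
import math

def simulate_smib(delta0, omega0, M=1.0, D=0.1, p_m=0.5, p_max=1.0,
                  dt=0.01, t_end=20.0):
    """Integrate the (assumed) swing equation with fourth-order Runge-Kutta:
         d(delta)/dt = omega
       M * d(omega)/dt = p_m - p_max * sin(delta) - D * omega
    Returns a list of (time, delta, omega) samples."""

    def f(delta, omega):
        return omega, (p_m - p_max * math.sin(delta) - D * omega) / M

    delta, omega = delta0, omega0
    trajectory = [(0.0, delta, omega)]
    for n in range(1, int(round(t_end / dt)) + 1):
        k1 = f(delta, omega)
        k2 = f(delta + 0.5 * dt * k1[0], omega + 0.5 * dt * k1[1])
        k3 = f(delta + 0.5 * dt * k2[0], omega + 0.5 * dt * k2[1])
        k4 = f(delta + dt * k3[0], omega + dt * k3[1])
        delta += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        omega += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        trajectory.append((n * dt, delta, omega))
    return trajectory

def stays_in_synchronism(trajectory, limit=math.pi):
    """Hypothetical criterion: the rotor angle never slips past +/- pi."""
    return all(abs(delta) < limit for _, delta, _ in trajectory)

# Start at the stable equilibrium sin(delta_s) = p_m / p_max and disturb the speed.
delta_s = math.asin(0.5)
small = simulate_smib(delta_s, omega0=0.5)   # small disturbance of the initial state
large = simulate_smib(delta_s, omega0=3.0)   # large disturbance of the initial state
print(stays_in_synchronism(small), stays_in_synchronism(large))
```

In this toy setting the small speed disturbance keeps the rotor inside its potential well, while the large one drives it past the unstable equilibrium. The sketch needs the whole trajectory to decide stability, which mirrors the cost of TDS that motivates data-driven prediction.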
SUN et al.: ROTOR ANGLE STABILITY PREDICTION USING TEMPORAL AND TOPOLOGICAL EMBEDDING DEEP NEURAL NETWORK... 3
namic trajectories of rotor angles from TDS, i.e., numerically solving the DAEs in (1). For power systems with relatively large scales, the TDS becomes time-consuming, and the demand for fast on-line RAS assessment cannot be satisfied.

stability ranking [6]. The distribution of the frequency in realistic power systems demonstrates the non-Gaussian characteristics of heavy tail and skewness, which can be more accurately described by the Levy-stable distribution [30]. Additionally, the maximum fluctuations of frequency should also be set to an appropriate value; otherwise, the disturbances will be too small to disturb the system or too large to be found in realistic power systems. Normally, the disturbances of frequency $\Delta f$ are bounded to ±1% to ±4% of rated frequency (50 Hz or 60 Hz) [31]. Thus, we set the disturbance limit of angular speed as $\Delta\omega_{\max} = 2\pi\Delta f = 10$ rad/s. Considering a power system with $N$ nodes, we define $m < N$ to be the number of nodes simultaneously disturbed, where $m = 1$ and $m > 1$ refer to the single-node disturbance case and the multiple-node disturbance case, respectively. We focus on the disturbances of small-signal RAS on rotor angle and angular speed $[\delta_i \ \omega_i]$, $i = 1, 2, ..., N$. For the transient RAS, power systems are subjected to more severe disturbances, i.e., the $N-1$ contingency. We consider scenarios by triggering short circuits of transmission lines and predict the transient RAS at post-fault stages. The more severe disturbances, such as $N-s$ contingencies, $s \geq 2$, are rare events in power systems [32] and are not taken into account in this paper.

III. ARCHITECTURE OF TTEDNN MODEL

The TTEDNN model is proposed to predict the RAS in power systems by extracting the temporal and topological features embedded in the time series data of PMUs.

$$
x_i = \begin{bmatrix}
\mathrm{PMU}_1^i(0) & \mathrm{PMU}_2^i(0) & \cdots & \mathrm{PMU}_N^i(0) \\
\mathrm{PMU}_1^i(1) & \mathrm{PMU}_2^i(1) & \cdots & \mathrm{PMU}_N^i(1) \\
\vdots & \vdots & & \vdots \\
\mathrm{PMU}_1^i(l-1) & \mathrm{PMU}_2^i(l-1) & \cdots & \mathrm{PMU}_N^i(l-1)
\end{bmatrix} \tag{3}
$$

$$
y = \begin{cases}
1 & \sigma > 0 \\
0 & \sigma \leq 0
\end{cases} \tag{4}
$$

where $y = 1$ and $y = 0$ correspond to the stable state and unstable state, respectively. The output of the TTEDNN model gives the probability $p$ that the power system will evolve to a stable or unstable state. Numerically, we take $p > 0.5$ for the stable state and $p \leq 0.5$ for the unstable state.

B. Structure of TTEDNN Model

The structure of the TTEDNN model is shown in Fig. 1. The structure has three main parts: the graph convolution (GC) modules, the temporal convolution (TC) module, and the multi-layer perceptron (MLP) prediction layer. Each GC module has five neurons, where $h_i$ and $h'_i$ ($i$ = a to e) represent the input and output states of the five neurons in each GC module, respectively. The operation of the GC module is to update the output state of the focused neurons (red) using the adjacency matrix, which is only relevant to the adjacent neurons (blue) rather than others (gray). The topological features are then extracted by the fully connected (FC) layer and further processed by the TC module, composed of $R$ residual blocks with dilated factors $d_1$ to $d_R$. In the end, the MLP layer generates the prediction probability function.

1) GC Modules

The TTEDNN model starts with $n$ GC modules to extract topological features. Each GC module is sequentially composed of a GCN layer, a batch normalization (BN) layer, and a rectified linear unit (ReLU) activation function. The structure of the GCN layer can be represented as an undirected graph [16] $G = (V, E, B)$, where $V \in R^N$ is the set of neurons, $E \in R^E$ is the set of links between neurons, and

defined as a dilated transformation of a 1D time series data $x$:

$$
F(j) = \sum_{i=0}^{k-1} f(i)\, x_{j - d \cdot i}
$$
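The dilated transformation above can be sketched in a few lines of plain Python. The function names `dilated_causal_conv` and `receptive_field`, the zero padding for indices before the start of the series, and the assumption of one convolution per dilation level are illustrative choices, not details confirmed by the paper.

```python
def dilated_causal_conv(x, f, d):
    """Compute F(j) = sum_{i=0}^{k-1} f[i] * x[j - d*i] for every j.
    Samples before the start of the series are treated as zero (assumed padding)."""
    k = len(f)
    out = []
    for j in range(len(x)):
        acc = 0.0
        for i in range(k):
            idx = j - d * i
            if idx >= 0:          # skip taps that reach before the first sample
                acc += f[i] * x[idx]
        out.append(acc)
    return out

def receptive_field(k, dilations):
    """Number of past samples seen by a stack of dilated convolutions,
    assuming one convolution of kernel size k per dilation level."""
    return 1 + (k - 1) * sum(dilations)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
print(dilated_causal_conv(x, f=[1.0, 1.0], d=2))                   # x[j] + x[j-2]
print(receptive_field(2, [2 ** (r - 1) for r in range(1, 6)]))     # k = 2, d = 1, 2, 4, 8, 16
```

Doubling the dilation at each level lets the receptive field grow exponentially with depth while the number of filter taps per layer stays fixed, which is the usual motivation for stacking dilated convolutions in a TCN.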
$$
B^1_{ij} = \begin{cases}
1 & (i,j) \in E_p^{\mathrm{norm}} \text{ or } i = j \\
0 & (i,j) \notin E_p^{\mathrm{norm}} \text{ or } (i,j) \in E_p^{\mathrm{con}}
\end{cases} \tag{9}
$$

$$
B^2_{ij} = \begin{cases}
K_{ij} \sin(\delta_i - \delta_j) & (i,j) \in E_p^{\mathrm{norm}} \\
0 & (i,j) \notin E_p^{\mathrm{norm}} \text{ or } (i,j) \in E_p^{\mathrm{con}}
\end{cases} \tag{10}
$$

$$
B^3_{ij} = \begin{cases}
K_{ij} & (i,j) \in E_p^{\mathrm{norm}} \\
0 & (i,j) \notin E_p^{\mathrm{norm}} \text{ or } (i,j) \in E_p^{\mathrm{con}} \\
P_i & i = j
\end{cases} \tag{11}
$$

where $E_p^{\mathrm{norm}}$ and $E_p^{\mathrm{con}}$ are the normal and faulty transmission line sets under contingencies, respectively; $(i,j)$ denotes the transmission line between node $i$ and node $j$; $K_{ij}$ is the maximum transmission capability of the transmission line $(i,j)$; and $P_i$ is the active power injection of node $i$. If $(i,j) \notin E_p^{\mathrm{norm}} \cup E_p^{\mathrm{con}}$, no transmission line exists between node $i$ and node $j$. If $(i,j) \in E_p^{\mathrm{con}}$, $(i,j)$ is a faulty transmission line during the contingency.

2) Class-weighted loss function

For training the TTEDNN model, the class-weighted binary cross entropy (BCE) is used as the loss function $Loss$ with the $L_2$ regularization:

$$
Loss = \sum_i \left( \alpha_1 y_i \log_2 p_i + \alpha_0 (1 - y_i) \log_2 (1 - p_i) \right) + \frac{1}{2} \beta \sum_k \left( w_k^2 + b_k^2 \right) \tag{12}
$$

where $y_i$ and $p_i$ denote the label and the model output of the $i$th sample, respectively; $\alpha_1$ and $\alpha_0$ denote the weight factors corresponding to the stable state ($y_i = 1$) and unstable state ($y_i = 0$), respectively; $w_k$ and $b_k$ are the learnable network parameters; and $\beta$ is the regularization weight. Class-weighted BCE is significantly helpful for training datasets with great imbalance. In the training dataset for RAS prediction, there are fewer samples concerning unstable states. The imbalance of the dataset results from the fact that practical power systems are stable most of the time under common disturbances (see the disturbance discussed in Section II-C).

IV. CASE STUDY

In this section, the IEEE 39-bus and IEEE 300-bus power systems are used to test the performance, scalability, effect of PMU data cycles, and robustness against measurement noise of the TTEDNN model. Furthermore, the transfer learning ability of the TTEDNN model trained on the small-signal RAS dataset to predict the transient RAS is discussed.

A. Training Setup

The specific parameters of IEEE 39-bus and IEEE 300-bus power systems for the evaluation and scalability validation of the TTEDNN model are derived from the PST toolbox [35] and Matpower 6.0 toolbox [36]. Following the disturbance discussed in Section II-C, disturbances on the initial states of rotor angle and angular speed $[\delta_i \ \omega_i]$ are considered for the small-signal RAS. The training dataset contains the set under the case of the single-node disturbances ($m = 1$), while the test dataset consists of the set under both the single-node and multiple-node disturbance cases ($m > 1$).

Given a power system with $N$ nodes, procedures for generating the dataset under the single-node disturbance case are described as follows.

1) Solve the power flow and let the solution be the undisturbed initial state.

2) The undisturbed initial state for each node $i = 1, 2, ..., N$ is randomly disturbed $K_i$ times individually according to the distribution of frequency fluctuations.

3) For each disturbed initial state, conduct TDS and use the resulting trajectories to label its TSI.

For each sample in the dataset under the multiple-node disturbance case, $m > 1$ different nodes are simultaneously disturbed. The corresponding data generation processes are as follows.

1) Solve the power flow and let the solution be the undisturbed initial state.

2) Randomly select $M$ groups of nodes, where each group includes $m$ nodes.

3) Within each group of nodes, the undisturbed initial states of $m$ nodes are randomly disturbed $K_m$ times simultaneously according to the distribution of frequency fluctuations.

4) For each group of disturbed initial states, conduct TDS and use the resulting trajectories to label its TSI.

For the single-node disturbance dataset of the IEEE 39-bus power system, given $K_i = 1000$, 39000 samples in total are generated with 33004 samples of stable states and 5996 samples of unstable states. For the single-node disturbance dataset of the IEEE 300-bus system, given $K_i = 441$, 52038 samples in total are generated with 48186 samples of stable states and 3852 samples of unstable states. For the multiple-node disturbance dataset of the IEEE 39-bus power system, given $m = 3$, $M = 60$, and $K_m = 200$, 12000 samples in total are generated with 7377 samples of stable states and 4623 samples of unstable states. For the multiple-node disturbance dataset of the IEEE 300-bus power system, given $m = 3$, $M = 60$, and $K_m = 200$, 12000 samples in total are generated with 11132 samples of stable states and 868 samples of unstable states. The single-node disturbance dataset is used for the training of the TTEDNN model, and 60%, 20%, and 20% of the dataset are used for training, validation, and testing, respectively. The model trained with the single-node disturbance dataset is directly used for predicting RAS under multi-node disturbance. Therefore, 100% of the multi-node disturbance dataset is used for testing. We thus have two test datasets, one for predicting the RAS under single-node disturbance, and the other for predicting the RAS under multi-node disturbance.

The dataset of $N-1$ contingencies for the transient RAS prediction is generated as follows.

1) Randomly vary all loads between 80% and 120% of the basic load levels.

2) Solve the power flow and let the solution be the undisturbed initial state.

3) Conduct the TDS based on the undisturbed initial state, trigger a three-phase short-circuit fault on a randomly selected transmission line, and clear the fault after 0.1 s.
4) Label the TSI with the post-fault state.

Consequently, for the IEEE 39-bus system, 28328 samples are generated with 20986 samples of stable states and 7342 samples of unstable states. For the IEEE 300-bus system, 30850 samples are generated with 21808 samples of stable states and 9042 samples of unstable states.

The confusion matrix is helpful for the evaluation of the prediction model, which defines four values based on actual and predicted results, i.e., TP, FP, TN, and FN, where TP (TN) is the extent to which the model correctly predicts the positive (negative) class, and FP (FN) is the extent to which the model wrongly predicts the negative (positive) class. In this paper, the stable/positive and unstable/negative are interchangeable. Four metrics including accuracy ACC, false positive rate FPR, false negative rate FNR, and F-score Fscore are used to measure the performance of the TTEDNN model.

$$
ACC = \frac{TP + TN}{TP + TN + FP + FN} \tag{13}
$$

$$
FPR = \frac{FP}{FP + TN} \tag{14}
$$

$$
FNR = \frac{FN}{FN + TP} \tag{15}
$$

$$
F_{score} = (1 + \gamma^2) \frac{P_{recision} \, R_{ecall}}{\gamma^2 P_{recision} + R_{ecall}} \tag{16}
$$

where $P_{recision} = TP/(TP + FP)$ denotes the fraction of TP among those the model classified as positive class; and $R_{ecall} = TP/(TP + FN)$ denotes the fraction of TP among the total number of positive samples. While ACC, FPR, and FNR can reveal whether the predictions are good or not, Fscore evaluates the prediction of the model on imbalanced samples more comprehensively, since $\gamma$ indicates how much more important recall is than precision or vice versa. We set $\gamma = 1$ in this paper.

The TTEDNN model is based on Tensorflow 2.3.1 and deployed on a server with Intel(R) Xeon(R) CPU E5-2620 v3. Two groups of GC modules ($n = 2$) with the kernel size of 16 and 8 are used to extract topological features from PMU data input. The TC module has five RBs ($R = 5$) and exponential dilated factors $d = 2^{r-1}$ for $r = 1, 2, ..., R$, with kernel size $k = 2$ and 32 filters. The MLP prediction layer has the dimensions of (16, 1) and (32, 1) for the input layer and the hidden layer, respectively. The learning rate and batch size for the training are set to be $10^{-3}$ and 128. The $L_2$ regularization weight $\beta$ is set to be $5 \times 10^{-4}$. Weight factor $\alpha_0$ is set to 1, and $\alpha_1$ is calculated on each batch as

$$
\alpha_1 = \begin{cases}
\dfrac{256}{\sum_{i=1}^{256} y_i} - 1 & \sum_{i=1}^{256} y_i \neq 0 \\
0 & \sum_{i=1}^{256} y_i = 0
\end{cases} \tag{17}
$$

B. Small-signal RAS Prediction

For small-signal RAS prediction, Fig. 2 shows the validation performance of the trained TTEDNN model in terms of ACC and class-weighted loss at different training epochs in both the IEEE 39-bus and IEEE 300-bus power systems. It can be found that ACC increases sharply to 98% within 20 epochs, and the training of the TTEDNN model converges quickly and smoothly after nearly 150 epochs.

Fig. 2. Validation performance of trained TTEDNN model in terms of ACC and class-weighted loss at different training epochs for small-signal RAS prediction in both IEEE 39-bus and IEEE 300-bus power systems.

Table I shows the performance metrics for small-signal RAS prediction under the single-node disturbance dataset in the IEEE 39-bus and IEEE 300-bus power systems.

TABLE I
PERFORMANCE METRICS FOR SMALL-SIGNAL RAS PREDICTION UNDER SINGLE-NODE DISTURBANCE DATASET IN IEEE 39-BUS AND IEEE 300-BUS POWER SYSTEMS

            IEEE 39-bus system                      IEEE 300-bus system
Model       ACC (%)  FNR (%)  FPR (%)  Fscore       ACC (%)  FNR (%)  FPR (%)  Fscore
SVM         84.27    10.36    12.17    0.8921       87.43    8.98     13.24    0.9135
MLP         98.45    0.97     7.24     0.9930       99.69    0.14     6.64     0.9845
CNN         98.36    0.86     9.33     0.9907       99.62    0.23     5.78     0.9928
LSTM        96.19    3.33     18.53    0.9558       99.25    0.18     2.15     0.9978
GCN         96.19    3.33     18.53    0.9558       99.25    0.18     2.15     0.9978
RGCN        98.15    2.20     8.91     0.9852       99.53    0.18     6.14     0.9921
Proposed    99.63    0.29     0.47     0.9965       99.88    0.17     0.00     0.9989

Six existing models including support vector machine (SVM), MLP, CNN [13], LSTM, GCN [19], and RGCN [27] are used to compare the performance metrics with the proposed TTEDNN model. It can be observed from Table I that the TTEDNN model outperforms the compared deep learning models under almost all performance metrics. Specifically, the TTEDNN model has the best performance in terms of ACC of 99.63% and Fscore of 0.9965 for the IEEE 39-bus power system, and ACC of 99.88% and Fscore of 0.9989 for the IEEE 300-bus power system. The correct prediction of unstable states is critically important in practical implementation, which can be reflected by the FPR, the proportion of false predictions among all unstable samples. The TTEDNN model has the best FPR of only 0.47% for the IEEE 39-bus power system, i.e., among all the unstable samples, only six samples are mistakenly predicted to be stable. The MLP has the best FNR of 0.14% under the single-node test dataset of the IEEE 300-bus power system, slightly better than that of the TTEDNN model with 0.17%.

The performance metrics for small-signal RAS prediction under the multiple-node disturbance dataset in the IEEE 39-bus and IEEE 300-bus power systems are also investigated, since the multiple-node disturbances are more likely to happen in reality and will make the prediction task more complicated. The same six existing models shown in Table I are used to compare the performance metrics with the proposed TTEDNN model. As shown in Table II, the TTEDNN model has the best performance on predicting the small-signal RAS under multiple-node disturbances, i.e., ACC of 98.60% and Fscore of 0.9862 for the IEEE 39-bus power system and ACC of 97.80% and Fscore of 0.9785 for the IEEE 300-bus power system. It is worth noting that the compared existing models show a 3%-18% drop in terms of ACC and Fscore when the condition changes from single-node disturbances to multiple-node disturbances, while the proposed TTEDNN model only has very small changes. Hence, the TTEDNN model is more robust than the existing models for the scenario where the system is subjected to multiple-node disturbances.

TABLE II
PERFORMANCE METRICS FOR SMALL-SIGNAL RAS PREDICTION UNDER MULTIPLE-NODE DISTURBANCE DATASET IN IEEE 39-BUS AND IEEE 300-BUS POWER SYSTEMS

            IEEE 39-bus system                      IEEE 300-bus system
Method      ACC (%)  FNR (%)  FPR (%)  Fscore       ACC (%)  FNR (%)  FPR (%)  Fscore
SVM         81.27    15.56    18.29    0.8130       86.96    10.92    16.43    0.8695
MLP         82.49    16.16    21.27    0.8250       90.46    2.44     20.86    0.9048
CNN         80.73    11.93    39.68    0.8072       95.29    0.41     11.57    0.9530
LSTM        82.49    16.16    21.27    0.8251       90.71    2.37     20.32    0.9072
GCN         93.21    2.31     10.45    0.9321       96.78    1.90     4.86     0.9710
RGCN        90.26    11.66    4.41     0.8999       97.36    0.52     10.34    0.9742
Proposed    98.60    0.98     2.56     0.9862       97.80    0.68     5.95     0.9785

C. Transient RAS Prediction

The TTEDNN model is also trained for transient RAS prediction, the dataset of which is generated under disturbances of $N-1$ contingencies. The validation performance of the trained TTEDNN model in terms of ACC and class-weighted loss for the transient RAS at different training epochs in both the IEEE 39-bus and IEEE 300-bus power systems is shown in Fig. 3. Both the ACC and class-weighted loss converge successfully after 200 epochs. The ACC increases to approximately 98% after 100 epochs.

Fig. 3. Validation performance of trained TTEDNN model in terms of ACC and class-weighted loss for transient RAS at different training epochs in both IEEE 39-bus and 300-bus power systems.

The same six existing models shown in Table I are used to compare the performance metrics with the proposed TTEDNN model under disturbances of $N-1$ contingencies in the IEEE 39-bus and IEEE 300-bus power systems, as shown in Table III and Table IV, respectively. It can be observed that the proposed TTEDNN model outperforms all compared existing models for each performance metric. Specifically, the proposed TTEDNN model obtains ACC of 99.63% and Fscore of 0.9964 for the IEEE 39-bus power system, and ACC of 99.72% and Fscore of 0.9973 for the IEEE 300-bus power system.

TABLE III
PERFORMANCE METRICS FOR TRANSIENT RAS PREDICTION IN IEEE 39-BUS POWER SYSTEM

Model       ACC (%)  FNR (%)  FPR (%)  Fscore
SVM         96.21    2.12     3.21     0.9624
MLP         99.15    0.41     1.32     0.9919
CNN         98.87    0.75     1.54     0.9892
LSTM        99.15    0.58     1.14     0.9919
GCN         98.20    1.19     2.46     0.9828
RGCN        99.28    0.58     0.88     0.9930
Proposed    99.63    0.34     0.40     0.9964

TABLE IV
PERFORMANCE METRICS FOR TRANSIENT RAS PREDICTION IN IEEE 300-BUS POWER SYSTEM

Model       ACC (%)  FNR (%)  FPR (%)  Fscore
SVM         97.24    2.04     4.48     0.9725
MLP         98.98    0.80     1.45     0.9918
CNN         98.33    1.44     2.21     0.9835
LSTM        99.19    0.46     1.66     0.9920
GCN         98.75    1.12     1.55     0.9876
RGCN        99.42    0.34     1.16     0.9941
Proposed    99.72    0.23     0.39     0.9973

Meanwhile, the advanced prediction performances on the IEEE 300-bus power system under both the small-signal RAS and transient RAS also demonstrate the scalability of the proposed TTEDNN model to relatively large power systems. The time for predicting the RAS is also evaluated, which is important for fast online implementation. Based on the Intel(R) Xeon(R) CPU E5-2620 v3, it takes approximately 5 ms for the trained TTEDNN model to predict both the small-signal RAS and transient RAS per batch, which is much faster than the traditional TDS.

D. Effect of PMU Data Cycles

The observation window length of post-fault PMU data affects the ACC and computational training time of the proposed TTEDNN model. A longer observation window provides more information about the system dynamics, which can increase the prediction performance, as shown in Table V, while longer computational training time is required. The observation window length is also called response time and is often measured in cycles [37]. With different cycles, the trade-off between ACC and computational training time per batch for the proposed TTEDNN model trained on the IEEE 39-bus power system is illustrated in Fig. 4. The ACC of both the small-signal RAS and transient RAS scenarios increases to the maximum at 5 cycles while the computational time monotonically increases. Thus, 5 cycles are chosen to be the optimal observation window length for the TTEDNN model, i.e., only the first 5 cycles of post-fault PMU data are needed to achieve the highest ACC with the shortest computational training time. This window is longer than the average response time of around 1.5 cycles achieved by the time-adaptive methods in the existing work [38]. Nevertheless, 5 cycles are acceptable for the RAS prediction task for the following reasons. The control actions will not be executed until a waiting time of 0.15 s to 0.4 s after the fault is cleared, which is still much longer than the first 5 cycles of the post-fault PMU data. Additionally, the superior prediction performances, i.e., the high ACC and low FPR, indicate that the TTEDNN model could be more robust in the unstable sample prediction and can better support the control actions than the time-adaptive methods.

TABLE V
TRANSIENT RAS PREDICTION WITH DIFFERENT CYCLES OF POST-FAULT PMU DATA UNDER IEEE 39-BUS POWER SYSTEM

ization in Fig. 5(a), which indicates that although $B^2$ contains the information of active power flow distribution, other useful information about the power system topology and electrical properties is discarded. The performance measurements of $B^1$ and $B^3$ are almost the same, while $B^3$ is slightly better on true positive rate (TPR). Hence, the grid-informed adjacency matrix $B^3$ is used for the TTEDNN model to predict the small-signal RAS.
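To make the grid-informed adjacency matrix $B^3$ of (11) concrete, the following sketch assembles it for a toy network. The function name `grid_informed_adjacency`, the zero-based node indexing, the dictionary-based line description, the symmetric fill (motivated by the undirected graph), and the three-node example values are all assumptions made for illustration. They are not taken from the IEEE test systems or from the authors' implementation.

```python
def grid_informed_adjacency(n_nodes, capacities, injections, faulted=()):
    """Build B3 following (11):
       B3[i][j] = K_ij  for a normal line (i, j),
       B3[i][j] = 0     for a faulted line or when no line exists,
       B3[i][i] = P_i   on the diagonal.
    capacities: dict {(i, j): K_ij} listing every transmission line once.
    injections: list of active power injections P_i.
    faulted:    iterable of lines (i, j) removed by the contingency."""
    faulted_set = {frozenset(line) for line in faulted}
    B = [[0.0] * n_nodes for _ in range(n_nodes)]
    for (i, j), k_ij in capacities.items():
        if frozenset((i, j)) in faulted_set:
            continue                      # faulty line: entry stays zero
        B[i][j] = k_ij                    # symmetric fill (assumed, undirected graph)
        B[j][i] = k_ij
    for i in range(n_nodes):
        B[i][i] = injections[i]           # diagonal carries the nodal injection
    return B

# Hypothetical three-node system with line (1, 2) tripped by a contingency.
lines = {(0, 1): 2.0, (1, 2): 1.5}
P = [1.0, -0.4, -0.6]
for row in grid_informed_adjacency(3, lines, P, faulted=[(1, 2)]):
    print(row)
```

Compared with the binary matrix $B^1$ of (9), the same sparsity pattern is kept, but each nonzero entry carries an electrical parameter. This is one way to read the statement that $B^3$ incorporates both structural and electrical information.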
diction method [39]. The noise in PMU data has a standard deviation ranging from 0.0005 to 0.01 [40], resulting in a typical signal-to-noise ratio (SNR) of 45 dB. Table VII exhibits the performance of both the small-signal and transient RAS predictions under different noise levels. Performance is at or near its best in the ideal environment without noise. When the SNR is reduced to 40 dB (lower than the typical SNR), the performance remains high, i.e., ACC decreases by only 0.10% and 0.25% for the small-signal and transient RAS predictions, respectively. For strong noise with an SNR of only 20 dB, the prediction performance degrades slightly, i.e., ACC drops by 0.78% and 1.20% for the small-signal and transient RAS predictions, respectively. Besides ACC, the other performance metrics also show only slight degradation as the SNR decreases. Hence, the TTEDNN model is robust against the noise in PMU data.

TABLE VII
PREDICTION PERFORMANCE FOR IEEE 39-BUS POWER SYSTEM UNDER DIFFERENT SNR LEVELS OF PMU DATA

RAS prediction   SNR (dB)   ACC (%)   FNR (%)   FPR (%)   Fscore
Small-signal     No noise   99.63     0.29      0.47      0.9965
Small-signal     60         99.65     0.29      0.41      0.9968
Small-signal     50         99.59     0.33      0.50      0.9962
Small-signal     40         99.53     0.38      0.58      0.9926
Small-signal     30         99.31     0.60      0.80      0.9916
Small-signal     20         98.85     0.91      1.44      0.9893
Transient        No noise   99.63     0.34      0.40      0.9964
Transient        60         99.61     0.34      0.44      0.9963
Transient        50         99.58     0.37      0.51      0.9958
Transient        40         99.38     0.54      0.70      0.9941
Transient        30         99.10     1.02      0.77      0.9913
Transient        20         98.43     1.42      1.72      0.9849

The proposed TTEDNN model and the RGCN model are compared under different noise levels in terms of the SNR. Six existing models shown in Table III are used to compare the performance of transient RAS prediction, and the RGCN model has the best performance among them. Figure 6 shows the comparison of ACC between the proposed TTEDNN model and the RGCN model under different SNR levels. As the SNR decreases, the ACC of the TTEDNN model decreases more slowly than that of the RGCN model. Specifically, when the SNR is 20 dB, the ACC of the RGCN model decreases by 2.31%, nearly twice the 1.20% decrease of the TTEDNN model.

Fig. 6. Comparison of ACC between TTEDNN model and RGCN model under different SNR levels.

G. Transfer Learning Ability

The ability of the proposed TTEDNN model, trained on the small-signal RAS dataset, to transfer to transient RAS prediction is worth investigating. Small disturbances usually occur more often in real power systems than severe N-1 contingencies. Hence, the dataset for small-signal RAS is easier to collect. The small-signal RAS dataset can provide certain information on stable and unstable patterns for the transient RAS prediction task. Learning based on the small-signal RAS dataset can therefore be useful for the few-shot learning of transient RAS prediction.

To investigate the transfer learning ability of the TTEDNN model pre-trained on the small-signal RAS dataset, three re-training tests are introduced and compared: ① training from scratch (TFS): all network parameters are updated without pre-trained initialization; ② full fine-tuning (FT): all network parameters are updated with pre-trained initialization; ③ local fine-tuning (LFT): only the layers close to the output are updated with pre-trained initialization.

The TFS test updates all the parameters of the GC modules, TC modules, and MLP layer of the TTEDNN model starting from a random weight initializer. The FT test updates all the parameters of the TTEDNN model starting from the model pre-trained on the small-signal RAS dataset. For the LFT test, part of the GC modules is frozen to retain the ability of topological feature extraction, while the parameters of the TC module and MLP layer are updated. The performance of the three re-training tests is compared as follows.

Figure 7 shows the validation performance in terms of ACC and loss during the training process of the three re-training tests. Transfer learning with pre-trained initialization is shown to be effective, i.e., the losses of FT and LFT smoothly converge to values 1.5 times smaller than that of TFS. Moreover, it can be observed from Fig. 7(b) that FT and LFT enable earlier stopping at a given acceptable performance, so that the training cost is reduced, i.e., only 30 to 70 epochs are needed for FT and LFT to reach an ACC of 99.5%, while TFS needs more than 200 epochs to reach the same ACC.

Table VIII shows the performance metrics for the three re-training tests in the IEEE 39-bus power system. It is worth noting that the performance metrics of transient RAS prediction with the model pre-trained on the small-signal RAS dataset are even better than those with the model directly trained on the transient RAS dataset, which indicates that the small-signal RAS dataset provides useful information for the transient RAS prediction.

The time consumption of the three re-training tests in the IEEE 39-bus power system is listed in Table IX. With the same data generation time, FT reduces the training time by 5042 s compared with TFS, and LFT further reduces it by about 350 s compared with FT because some of the layers do not need to be updated.

To explain the mechanism of LFT for transfer learning more intuitively, the outputs of the hidden layer in the
TABLE VIII
PERFORMANCE METRICS FOR THREE RE-TRAINING TESTS IN IEEE 39-BUS POWER SYSTEM

TABLE IX
COMPARISON OF TIME CONSUMPTION OF THREE RE-TRAINING TESTS IN IEEE 39-BUS SYSTEM
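
The three re-training tests (TFS, FT, and LFT) differ only in how the model is initialized and which modules are updated. The following is a minimal sketch of that distinction, not the authors' implementation. The module names (gc_1, gc_2, tc, mlp), the ModulePlan type, the plan_retraining function, and the choice of which GC module to freeze are all hypothetical; the text states only that part of the GC modules is frozen in LFT.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModulePlan:
    name: str
    pretrained_init: bool  # start from weights learned on the small-signal dataset?
    trainable: bool        # update this module during re-training?


def plan_retraining(strategy, modules, frozen=()):
    """Describe initialization and trainability per module for TFS, FT, or LFT."""
    strategy = strategy.upper()
    if strategy not in ("TFS", "FT", "LFT"):
        raise ValueError(f"unknown strategy: {strategy}")
    unknown = set(frozen) - set(modules)
    if unknown:
        raise ValueError(f"frozen modules not in model: {sorted(unknown)}")
    # TFS uses random initialization; FT and LFT start from the pre-trained model.
    pretrained = strategy != "TFS"
    # Only LFT keeps some modules fixed; TFS and FT update everything.
    frozen_set = set(frozen) if strategy == "LFT" else set()
    return [ModulePlan(m, pretrained, m not in frozen_set) for m in modules]


# Hypothetical module layout, for illustration only.
modules = ["gc_1", "gc_2", "tc", "mlp"]
tfs_plan = plan_retraining("TFS", modules)
ft_plan = plan_retraining("FT", modules)
lft_plan = plan_retraining("LFT", modules, frozen=["gc_1"])
lft_trainable = [p.name for p in lft_plan if p.trainable]
```

In a framework such as PyTorch, a non-trainable module in this plan would typically correspond to setting requires_grad to False on its parameters, which is why LFT can save training time relative to FT.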
Fig. 8. Visualizations of high-dimensional activations from hidden layer in TTEDNN model. (a) Visualization of TFS at epoch 0. (b) Visualization of TFS at epoch 20. (c) Visualization of TFS at epoch 40. (d) Visualization of TFS at epoch 60. (e) Visualization of LFT at epoch 0. (f) Visualization of LFT at epoch 20. (g) Visualization of LFT at epoch 40. (h) Visualization of LFT at epoch 60.
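
The SNR levels used in the noise tests of Table VII can be related to a noise standard deviation. The following is a minimal sketch, assuming additive zero-mean white Gaussian noise and the usual definition SNR (dB) = 10 log10(P_signal / P_noise). The function names and the constant 1.0 p.u. trace are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def noise_std_for_snr(signal, snr_db):
    """Return the Gaussian noise standard deviation that yields snr_db for this signal."""
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    return math.sqrt(noise_power)


def add_gaussian_noise(signal, snr_db, rng=None):
    """Return a noisy copy of signal with zero-mean white Gaussian noise at snr_db."""
    rng = rng or random.Random()
    sigma = noise_std_for_snr(signal, snr_db)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Estimate the SNR in dB from a clean trace and its noisy version."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)


# Hypothetical per-unit measurement trace (constant 1.0 p.u., illustration only).
clean = [1.0] * 20000
noisy = add_gaussian_noise(clean, snr_db=20.0, rng=random.Random(0))
sigma_40db = noise_std_for_snr(clean, 40.0)
```

Under these assumptions, a unit-power signal at 40 dB corresponds to a noise standard deviation of 0.01, the upper end of the range quoted from [40], and 45 dB corresponds to about 0.0056.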
V. CONCLUSION

We proposed the TTEDNN model for small-signal and transient RAS prediction in power systems. The TTEDNN model maps the spatial information of the power system topology into the GC modules and extracts the temporal features from the PMU data with TC modules. The TTEDNN model has the following advantages. First, it shows the best prediction performance compared with existing deep learning models under both small disturbances and N-1 contingencies. Second, it can make a fast prediction with only the PMU data of the first five post-disturbance cycles, demonstrating its potential for online implementation. Third, it is robust against the measurement noise of PMU data, which is necessary for practical applications. Finally, it provides superior transfer learning ability from small-signal RAS conditions to transient RAS conditions.

REFERENCES

[1] P. Sarajcev, A. Kunac, G. Petrovic et al., “Power system transient stability assessment using stacked autoencoder and voting ensemble,” Energies, vol. 14, no. 11, p. 3148, May 2021.
[2] J. Hou, C. Xie, T. Wang et al., “Power system transient stability assessment based on voltage phasor and convolution neural network,” in Proceedings of 2018 IEEE International Conference on Energy Internet (ICEI), Beijing, China, May 2018, pp. 247-251.
[3] S. K. Azman, Y. J. Isbeih, M. S. E. Moursi et al., “A unified online deep learning prediction model for small signal and transient stability,” IEEE Transactions on Power Systems, vol. 35, no. 6, pp. 4585-4598, Nov. 2020.
[4] P. Kundur, J. Paserba, V. Ajjarapu et al., “Definition and classification of power system stability IEEE/CIGRE joint task force on stability terms and definitions,” IEEE Transactions on Power Systems, vol. 19, no. 3, pp. 1387-1401, Aug. 2004.
[5] S. Zhang, Z. Zhu, and Y. Li, “A critical review of data-driven transient stability assessment of power systems: principles, prospects and challenges,” Energies, vol. 14, no. 21, p. 7238, Nov. 2021.
[6] Z. Liu, X. He, Z. Ding et al., “A basin stability based metric for ranking the transient stability of generators,” IEEE Transactions on Industrial Informatics, vol. 15, no. 3, pp. 1450-1459, Mar. 2019.
[7] B. Zhang, X. Jin, S. Tu et al., “A new FPGA-based real-time digital solver for power system simulation,” Energies, vol. 12, no. 24, p. 4666, Dec. 2019.
[8] T. L. Vu and K. Turitsyn, “Lyapunov functions family approach to transient stability assessment,” IEEE Transactions on Power Systems, vol. 31, no. 2, pp. 1269-1277, Mar. 2016.
[9] M. Anghel, F. Milano, and A. Papachristodoulou, “Algorithmic construction of Lyapunov functions for power system stability analysis,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 60, no. 9, pp. 2533-2546, Sept. 2013.
[10] S. Wei, M. Yang, J. Qi et al., “Model-free MLE estimation for online rotor angle stability assessment with PMU data,” IEEE Transactions on Power Systems, vol. 33, no. 3, pp. 2463-2476, May 2018.
[11] R. Yan, G. Geng, Q. Jiang et al., “Fast transient stability batch assessment using cascaded convolutional neural networks,” IEEE Transactions on Power Systems, vol. 34, no. 4, pp. 2802-2813, July 2019.
[12] J. J. Q. Yu, D. J. Hill, A. Y. S. Lam et al., “Intelligent time-adaptive transient stability assessment system,” IEEE Transactions on Power Systems, vol. 33, no. 1, pp. 1049-1058, Jan. 2018.
[13] M. Sahu and R. Dash, “A survey on deep learning: convolution neural network (CNN),” in Intelligent and Cloud Computing, Berlin: Springer, Jan. 2021, pp. 317-325.
[14] T. Su, Y. Liu, J. Zhao et al., “Probabilistic stacked denoising autoencoder for power system transient stability prediction with wind farms,” IEEE Transactions on Power Systems, vol. 36, no. 4, pp. 3786-3789, July 2021.
[15] T. Su, Y. Liu, J. Zhao et al., “Deep belief network enabled surrogate modeling for fast preventive control of power system transient stability,” IEEE Transactions on Industrial Informatics, vol. 18, no. 1, pp. 315-326, Jan. 2022.
[16] T. N. Kipf and M. Welling, “Semi-supervised classification with graph convolutional networks,” in Proceedings of International Conference on Learning Representations, Toulon, France, Feb. 2017, pp. 1-9.
[17] B. Yu, H. Yin, and Z. Zhu, “Spatio-temporal graph convolutional networks: a deep learning framework for traffic forecasting,” in Proceedings of the 27th International Joint Conference on Artificial Intelligence, Stockholm, Sweden, July 2018, pp. 3634-3640.
[18] Z. Wu, S. Pan, F. Chen et al., “A comprehensive survey on graph neural networks,” IEEE Transactions on Neural Networks and Learning Systems, vol. 32, no. 1, pp. 4-24, Jan. 2021.
[19] Y. Liu, N. Zhang, D. Wu et al., “Searching for critical power system cascading failures with graph convolutional network,” IEEE Transactions on Control of Network Systems, vol. 8, no. 3, pp. 1304-1313, Sept. 2021.
[20] S. Bai, J. Z. Kolter, and V. Koltun. (2018, Apr.). An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. [EB/Online]. Available: https://arxiv.org/abs/1803.01271
[21] A. van den Oord, S. Dieleman, H. Zen et al. (2016, Sept.). Wavenet: a generative model for raw audio. [EB/Online]. Available: https://arxiv.org/abs/1609.03499
[22] X. Tang, H. Chen, W. Xiang et al., “Short-term load forecasting using channel and temporal attention based temporal convolutional network,” Electric Power Systems Research, vol. 205, p. 107761, Apr. 2022.
[23] R. Zhang, W. Yao, Z. Shi et al., “A graph attention networks-based model to distinguish the transient rotor angle instability and short-term voltage instability in power systems,” International Journal of Electrical Power & Energy Systems, vol. 137, p. 107783, May 2022.
[24] J. Huang, L. Guan, Y. Su et al., “A topology adaptive high-speed transient stability assessment scheme based on multi-graph attention network with residual structure,” International Journal of Electrical Power & Energy Systems, vol. 130, p. 106948, Sept. 2021.
[25] Y. Su, M. Guo, H. Yao et al., “Power system small-signal stability assessment model based on residual graph convolutional networks,” Journal of Physics: Conference Series, vol. 2095, p. 012011, Sept. 2021.
[26] J. Huang, L. Guan, Y. Chen et al., “A deep learning scheme for transient stability assessment in power system with a hierarchical dynamic graph pooling method,” International Journal of Electrical Power & Energy Systems, vol. 141, p. 108044, Oct. 2022.
[27] J. Huang, L. Guan, Y. Su et al., “Recurrent graph convolutional network-based multi-task transient stability assessment framework in power system,” IEEE Access, vol. 8, pp. 93283-93296, Apr. 2020.
[28] N. Hatziargyriou, J. Milanovic, C. Rahmann et al., “Definition and classification of power system stability – revisited & extended,” IEEE Transactions on Power Systems, vol. 36, no. 4, pp. 3271-3281, July 2021.
[29] A. Gupta, G. Gurrala, and P. S. Sastry, “An online power system stability monitoring system using convolutional neural networks,” IEEE Transactions on Power Systems, vol. 34, no. 2, pp. 864-872, Mar. 2019.
[30] B. Schäfer, C. Beck, K. Aihara et al., “Non-Gaussian power grid frequency fluctuations characterized by Lévy-stable laws and superstatistics,” Nature Energy, vol. 3, pp. 119-126, Jan. 2018.
[31] J. Dong, J. Zuo, L. Wang et al., “Analysis of power system disturbances based on wide-area frequency measurements,” in Proceedings of 2007 IEEE PES General Meeting, Tampa, USA, July 2007, pp. 1-8.
[32] Q. Chen, The Probability, Identification, and Prevention of Rare Events in Power Systems. Ames: Iowa State University, 2004.
[33] J. Zhao, M. Netto, and L. Mili, “A robust iterated extended Kalman filter for power system dynamic state estimation,” IEEE Transactions on Power Systems, vol. 32, no. 4, pp. 3205-3216, July 2017.
[34] E. Ghahremani and I. Kamwa, “Local and wide-area PMU-based decentralized dynamic state estimation in multi-machine power systems,” IEEE Transactions on Power Systems, vol. 31, no. 1, pp. 547-562, Jan. 2016.
[35] J. Chow and K. Cheung, “A toolbox for power system dynamics and control engineering education and research,” IEEE Transactions on Power Systems, vol. 7, no. 4, pp. 1559-1564, Nov. 1992.
[36] R. D. Zimmerman, C. E. Murillo-Sánchez, and R. J. Thomas, “Matpower: steady-state operations, planning, and analysis tools for power systems research and education,” IEEE Transactions on Power Systems, vol. 26, no. 1, pp. 12-19, Feb. 2011.
[37] R. Zhang, Y. Xu, Z. Y. Dong et al., “Post-disturbance transient stability assessment of power systems by a self-adaptive intelligent system,” IET Generation, Transmission & Distribution, vol. 9, no. 3, pp. 296-305, Feb. 2015.
[38] H. Wang and S. Wu, “Transient stability assessment with time-adaptive method based on spatial distribution,” International Journal of Electrical Power & Energy Systems, vol. 143, p. 108464, Dec. 2022.
[39] Z. Rafique, H. M. Khalid, S. Muyeen et al., “Bibliographic review on power system oscillations damping: an era of conventional grids and renewable energy integration,” International Journal of Electrical Power & Energy Systems, vol. 136, p. 107556, Mar. 2022.
[40] T. Chen, H. Ren, Y. Sun et al., “Optimal placement of phasor measurement unit in smart grids considering multiple constraints,” Journal of Modern Power Systems and Clean Energy, vol. 11, no. 2, pp. 479-488, Mar. 2023.
[41] L. van der Maaten and G. Hinton, “Visualizing data using t-SNE,” Journal of Machine Learning Research, vol. 9, no. 86, pp. 2579-2605, 2008.
Peiyuan Sun received his B.S. degree in electrical engineering and automation from Xi'an Jiaotong University, Xi'an, China, in 2022. He is currently pursuing his M.S. degree in electrical engineering at Xi'an Jiaotong University. His research interests include power system stability assessment and control and deep learning in power systems.

Long Huo received his B.Sc. degree at the College of Electrical and Information Engineering at Hunan University, Changsha, China, in 2015. He completed his M.Sc. degree at the School of Automation and Information Engineering at Xi'an University of Technology, Xi'an, China, in 2018. He is currently pursuing his Ph.D. degree at the School of Electrical Engineering at Xi'an Jiaotong University. His research interests include cascading failures and frequency synchronization stability in power grids.

Xin Chen received his B.Sc. degree from Nanjing University, Nanjing, China, in 1999, and his M.Sc. and Ph.D. degrees in Chemical Physics from the University of British Columbia, Vancouver, Canada, and Yale University, New Haven, USA, in 2003 and 2008, respectively. He worked as a Postdoc Fellow in the Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, USA, from 2008 to 2014. He is currently working as an Associate Professor in the School of Electrical Engineering at Xi'an Jiaotong University, Xi'an, China. His research interests include artificial intelligence, feature engineering, deep reinforcement learning, stability analysis of complex dynamical systems, cascading failure in cyber-physical power systems, and power system dynamics and control.

Siyuan Liang received his B.E. degree in electrical engineering & automation from Xi'an Jiaotong University, Xi'an, China, in 2022. He is currently pursuing his Ph.D. degree in computer science and engineering at the Chinese University of Hong Kong, Hong Kong, China. His research interests include machine learning-based control solutions for power system dynamics, IC packaging technologies, and electronic design automation algorithms.