
Comput. Methods Appl. Mech. Engrg. 364 (2020) 112989

Artificial neural networks in structural dynamics: A new modular radial basis function approach vs. convolutional and feedforward topologies

Marcus Stoffel*, Rutwik Gulakala, Franz Bamer, Bernd Markert
Institute of General Mechanics, RWTH Aachen University, Templergraben 64, D-52056 Aachen, Germany
* Corresponding author. E-mail address: stoffel@iam.rwth-aachen.de (M. Stoffel).

Received 20 December 2019; received in revised form 6 March 2020; accepted 6 March 2020
https://doi.org/10.1016/j.cma.2020.112989
0045-7825/© 2020 Elsevier B.V. All rights reserved.

Abstract
The aim of the present study is to develop a series of artificial neural networks (ANN) and to determine, by comparison to experiments, which type of neural network is able to predict the measured structural deformations most accurately. For this purpose, three different ANNs are proposed. The first is the classical form of an ANN, a feedforward neural network (FFNN). In the second approach a new modular radial basis function neural network (RBFNN) is proposed, and the third network consists of a deep convolutional neural network (DCNN). By means of comparative calculations between neural-network-enhanced numerical predictions and measurements, the applicability of each type of network is studied.
© 2020 Elsevier B.V. All rights reserved.

MSC: 74K20; 74-05


Keywords: Artificial neural network; Radial basis functions; Convolutional networks; Structural mechanics; Shock tube experiments

1. Introduction
As an alternative to classical approaches by means of continuum mechanics, artificial neural networks (ANN) have been applied to engineering problems in recent years. This trend includes one-dimensional stress states in tension tests of metal specimens at high temperature [1] and the design of steel structures [2]. Vibrations of structures calculated by ANNs were described in [3,4] and stability problems of structures using ANNs in [5]. Reliability studies of structures were reported in [6], and influences of welding on material properties were investigated in [7]. An ANN was also proposed for the embrittlement of steel pressure vessels with application to nuclear reactors [8]. Moreover, multiscale problems were presented in [9]. An ANN can lead to a much lower computational time and can replace the mechanical model completely. It can be trained by experimental data only and, therefore, needs no identification of material parameters [10]. Consequently, a mathematical model in the form of a neural network is generated, which is able to approximate an arbitrary function [11]. Following this approach, a supervised learning algorithm approximates the desired output data by training all parameters such as weights and biases [12]. Consequently, the learning procedure of the ANN is based on examples and experiences provided by the user [13]. However, weaknesses of ANNs can occur [14] due to difficulties in interpreting parameters in neural networks, e.g. the number of hidden layers or neurons. Moreover, the components of the synapse matrices of a trained ANN are hard to interpret in the way material parameters in a material law can be. For this reason, problems with so-called black boxes are described [15,16]. This affects the attempt to explain differences between ANN predictions and measurements. The classical ANN topology consists of FFNNs with summation units. Once the ANN has been trained well with input and output data sets, it can recalculate the provided data very accurately. However, predictions beyond that data can lead to uncertain results, which is documented in the literature [17]. In the present study, a structural problem in the form of shock wave-loaded metal plates is chosen. In this way, a complex structural deformation together with strain rate dependent inelastic material behaviour is captured in the measurement. The aim is to predict plate deformations and vibrations most accurately.
Another approach reported in the literature is function approximation by means of RBFNNs. Interference effects of wind loads in civil engineering were studied in [18]. The application of an RBFNN to structural reliability was investigated in [19]. In [20], the RBFNN is proposed as an alternative to multilayer perceptrons and to multi-variable regressions with application to tunnel convergence. The prediction of backwater was investigated in [21] by artificial neural networks, including an RBFNN, and by multilayer perceptrons, which led to the best results in that investigation. In studies about rock strength deformations [22], calculations of deformation moduli using RBFNNs led to more accurate results than classical empirical methods. Another study about the determination of compressive rock strength [23] obtained the most accurate results using FFNNs rather than RBFNNs. In [24], it was pointed out that both multilayer perceptrons and RBFNNs were statistically acceptable for predicting pressure–volume–temperature relations in oil; however, the RBFNN results led to a higher accuracy than the classical neural network. Further developments of RBFNNs for structural applications give rise to promising results in the prediction of structural deformations. This perspective is additionally motivated by studies about radial basis functions applied to beam and plate deformation and stability problems [25,26]. Damage assessment of structures was addressed in [27], where an RBFNN achieved better damage detection than a classical FFNN. A clustering algorithm, which calculated the output error of an RBFNN in each cluster, was described in [28]; in this way, a minimal number of RBFs was obtained for a given error value. An application of an RBFNN to an oscillating system was carried out in [29]. In [30], the RBFNN turned out to exhibit the best prediction capability in structural health monitoring of vibrating beams. The authors described a better interpolation capability in multi-dimensional space compared to multilayer perceptrons; however, the RBFNN requires a longer training time. Already in [4], the authors described that regularisation neural networks using radial basis functions can be suitable for structural problems with noisy data. Motivated by the promising results of the mentioned studies in structural dynamics, RBFNNs are also used in the present study.
Convolutional neural networks have so far mostly been used for pattern or speech recognition [31], but not for mechanical problems. Nevertheless, a DCNN can enhance the prediction of the dynamic response of structures significantly [32], since this network type builds up a set of hidden layers in the numerical prediction by convolving and pooling feature maps. Moreover, the use of kernels or filters reduces input parameters which are not essential for the final output signal, e.g. by reducing noise. This incorporates the important characteristic of DCNNs in the form of shared weights for different input signals [33], leading to a reduction of parameters in large network architectures. Based on biological network models, such as partially overlapping receptive regions in cortex neurons of monkeys [34], the principle of weight sharing offers a powerful tool in DCNNs. So far, these techniques have been successfully applied in image recognition [35]. Following the approach of shared weights, DCNNs are transferred to structural dynamics in the present study, since complex structural deformations can require large network architectures with a high number of free parameters. Hereby, even networks with more than one hidden layer can be regarded as a step towards deep learning [36]. In the present study, the DCNN serves as a more general neural network than the FFNN, because the FFNN with an arbitrary number of hidden layers and neurons turns out to be a special case of a fully-connected layer in the convolutional network. Furthermore, the DCNN allows the usage of filters, which are introduced before establishing the final set of fully-connected hidden and output layers. A further important advantage of combining the DCNN with structural mechanics is that the complete load and deformation history of a structural problem can be captured in the network. Here, this effect is exploited by correlating not only single input and output signals with each other, but by integrating complete sets of loading and deformation evolutions over the considered deformation period into the DCNN.

Fig. 1. Deflection and pressure evolution in the shock tube for steel and copper plates with 138 mm diameter with picture of the set-up.

In order to combine dynamical problems of thin-walled structures with viscoplastic material behaviour, shock
wave-loadings on circular metal plates are considered in this paper. The neural network is trained by short time
measurements only. By comparing the described three types of neural networks among themselves and against the measurements, the applicability and accuracy of the investigated ANNs will be determined.

2. Experiments
In the present study, shock tube experiments are chosen to cover a wide range of strain rates in the dynamic
response of structures. Following this approach, complex strain rate dependent evolutions of structural deformations
including geometrical and physical nonlinearities can be measured and provided to the developed neural networks.
In order to account for experiments with different diameters and materials, two experimental set-ups are shown in
Figs. 1 and 2. Steel, copper, and aluminium plates are subjected to impulsive loadings in these tubes leading to
inelastic deformations and vibrations. The training data is obtained by measuring the mid-point displacements and
the pressure acting on the plates. The signals are recorded by means of short-time measurement techniques during
the impulse period of several milliseconds. Details of these experimental studies can be found in [37]. The plate specimens used in this study are aluminium plates of 553 mm diameter with 2 mm thickness and steel plates of 138 mm diameter and 2 mm thickness. The shock tubes consist of a high pressure chamber (HPC) and a low
pressure chamber (LPC), separated from each other by a membrane. Due to the pressure difference between HPC
and LPC, the membrane bursts, causing a shock wave, which moves into the LPC and strikes the plate specimen at
the end of the tube. The pressure evolution on the plate specimen during the impulsive loading can be varied using
different gases in the HPC. The lighter the gas, such as helium, the faster the shock wave and, hence, the higher
the pressure on the plate specimen.
In Figs. 1 and 2, the shock tubes with two examples of plate deformations with nitrogen in the HPC are shown.
The time-dependent pressure, the stiffness of the specimen, and the shock wave propagation belong to the input
data used in the neural networks. The plate deflection denotes the output signal.

3. Artificial neural networks


For all three types of networks, the input and output values are normalised to obtain better convergence and numerical stability [1,38]. Each physical value $x_i$ is transformed into a normalised value
$$x_{ni} = 0.1 + 0.8 \cdot \frac{x_i - x_i^{\min}}{x_i^{\max} - x_i^{\min}} \tag{1}$$
leading to unified values $x_{ni}$, with $x_i^{\min}$ and $x_i^{\max}$ as minimum and maximum values, respectively. The index $i$ stands for the $i$th input neuron. In the following, the mathematical model of all three types of networks will be derived precisely, because the internal network structure has significant influence on the computational time and on the accuracy of the output signals compared to the measurements.
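For illustration, the normalisation of Eq. (1) can be sketched in Python as follows; this is a minimal example, and the function and variable names are chosen here for illustration only and are not taken from the original implementation:

```python
import numpy as np

def normalise(x):
    """Map each input column to the range [0.1, 0.9] according to Eq. (1)."""
    x = np.asarray(x, dtype=float)
    x_min = x.min(axis=0)
    x_max = x.max(axis=0)
    return 0.1 + 0.8 * (x - x_min) / (x_max - x_min)

# Example: columns are time t, pressure p, stiffness relation s, wave speed v
samples = np.array([[0.0, 1.2, 0.4, 300.0],
                    [1.0, 3.4, 0.5, 340.0],
                    [2.0, 2.1, 0.6, 380.0]])
print(normalise(samples))
```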

Fig. 2. Deflection and pressure evolution in the shock tube for aluminium plates with 553 mm diameter with picture of the set-up.

3.1. Feedforward neural network (FFNN)

The FFNN proposed in this study is shown in Fig. 3. The input layer consists of time t after the shock wave has
hit the plate, pressure p acting on the plate during the time, stiffness relation s of Young’s Modulus divided by plate
diameter, and wave propagation velocity v. Besides the pressure, the time is considered here as a path dependent
variable. The stiffness relation is taken into account due to different plate materials and geometries, and the wave
propagation has significant influence on the deformation rate of the shock wave-loaded structures. Following this
approach with four input variables, ordered pairs of values are used for each output, such as the plate's mid-point displacement. Consequently, an input vector with normalised components $x_{ni}$ is defined by
$$\mathbf{x}_n^T = \begin{bmatrix} t_n & p_n & s_n & v_n \end{bmatrix}. \tag{2}$$
This FFNN type, shown in Fig. 3, is well established in the literature [39]. It allows the use of several hidden
layers to increase the complexity of the function approximation. Instead of working with differential equations in
finite element codes, e.g. evolution equations in material laws, the neural network approximates the output values
by means of algebraic systems of equations including nonlinear activation functions. So we end up with a matrix multiplication, as shown in Fig. 4. The components of the input vector (2) are multiplied with weights $w_{ji}$ between the input layer and the hidden layer. This leads to a propagation function with the weighted sum and biases of all normalised input values in the first forward pass,
$$X_{1j} = \sum_{i=1}^{N} w_{ji}\, x_{ni} - b_i = \sum_{i=1}^{N+1} w_{ji}\, x_{ni} \tag{3}$$

with $N$ as the number of input neurons. Bias terms $b_i$ can be added in order to shift the activation functions and are included as an additional weight with $x_{n,N+1} = -1$, as shown in Eq. (3). These weights denote the free parameters in this supervised learning approach to be determined by a learning algorithm with back propagation. For training the weights in the neural network, a total number of $z$ input vectors is available together with $z$ output values. This leads to a matrix multiplication with a weight matrix $\mathbf{w}$ and a matrix $\mathbf{A}$ containing $z$ columns of input vectors $\mathbf{x}_n$. A matrix $\mathbf{B}$ is obtained in the form
$$\mathbf{B} = \mathbf{A}\,\mathbf{w} \tag{4}$$
with $z$ vectors
$$\mathbf{X}_1^T = \begin{bmatrix} X_{11} & \cdots & X_{1L_1} \end{bmatrix} \tag{5}$$
denoting the input for the neurons in the first hidden layer. In Fig. 4, the $z$th vectors of $\mathbf{x}_n$ and $\mathbf{X}_1$ are indicated by $(z)$. Each component $X_{1j}$ is placed on the left-hand side of the hidden neuron in Fig. 3.

On its right-hand side, the activated neuron is firing a signal, calculated by means of the sigmoid function
$$A_{1j} = \frac{1}{1 + e^{-X_{1j}}}, \tag{6}$$
towards the neurons in the second hidden layer. A total number of $m$ hidden layers with $L_m$ neurons plus one bias term in each layer can be implemented. The weights and biases between the following hidden layers are denoted with additional asterisks, $w^{(*..*)}_{kj}$ and $b^{(*..*)}_k$, respectively, one asterisk for each subsequent hidden layer, see Fig. 3. Consequently, the neurons in the second hidden layer obtain the weighted sum
$$X_{2k} = \sum_{j=1}^{L_1+1} w^{*}_{kj}\, A_{1j} \tag{7}$$

with $L_1 + 1$ as the number of neurons in the first hidden layer including the bias. With all $z$ vectors of activation components $A_{11}$ to $A_{1L_1}$, a matrix multiplication is obtained by
$$\mathbf{D} = \mathbf{w}^{*}\,\mathbf{C} \tag{8}$$
leading to a matrix $\mathbf{D}$ including $z$ times the input vector $\mathbf{X}_2$ for the neurons in the second hidden layer, each with $L_2$ components. Inserting these values into the sigmoid function, the output signals for the neurons in the second hidden layer are obtained by the activation equation analogous to Eq. (6), only with $j = k$. This procedure is carried out for all hidden layers, with the weighted sum for the $m$th neuron in the $s$th hidden layer:
$$X_{sm} = \sum_{n=1}^{L_{s-1}+1} w^{(*..*)}_{mn}\, A_{(s-1)n}. \tag{9}$$
Finally, the weighted sum for activating the neurons in the output layer is calculated by
$$Y_k = \sum_{j=1}^{L_m+1} v_{kj}\, A_{mj} \tag{10}$$
with weights $v_{kj}$ between the last hidden layer and the output layer. To keep the formulation general, the weighted sum is written here for an arbitrary number of neurons $o$ in the output layer, which is achieved by the matrix multiplication
$$\mathbf{G} = \mathbf{v}\,\mathbf{F}. \tag{11}$$
For one output neuron in the form of midpoint deflections, $z$ output signals are obtained with component $Y_1$ and activation function $A_{(m+1)1}$ from matrix $\mathbf{G}$, which denote the normalised plate deflection $d_n$:
$$A_{(m+1)1} = \frac{1}{1 + e^{-Y_1}} = d_n. \tag{12}$$
This describes the forward pass in the proposed FFNN topology. For error back propagation, a gradient descent
algorithm is adopted, leading to optimal weights with the least square error between desired and calculated
output [40].
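A minimal NumPy sketch of the forward pass of Eqs. (3)–(12) is given below; the layer sizes and the random initialisation are illustrative assumptions and do not reproduce the trained networks of Table 1:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ffnn_forward(xn, weights):
    """Forward pass of an FFNN with the bias handled as an extra input of -1 (Eq. (3))."""
    a = xn
    for w in weights:
        a_ext = np.append(a, -1.0)   # append bias input x_{N+1} = -1
        a = sigmoid(w @ a_ext)       # weighted sum + sigmoid activation, Eqs. (3), (6)
    return a                         # last layer: normalised deflection d_n, Eq. (12)

rng = np.random.default_rng(0)
layer_sizes = [4, 8, 8, 1]           # inputs t, p, s, v -> two hidden layers -> deflection
weights = [rng.normal(size=(layer_sizes[i + 1], layer_sizes[i] + 1))
           for i in range(len(layer_sizes) - 1)]
xn = np.array([0.2, 0.7, 0.4, 0.6])  # normalised input vector of Eq. (2)
print(ffnn_forward(xn, weights))
```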

3.2. Radial basis function neural network (RBFNN)

The expectation in this study for applying RBFNNs to structural dynamics is that the oscillating deformation behaviour can be approximated better, with a smaller number of neurons, than in the case of the FFNN. This assumption is supported by the fact that the RBFNN develops a linear combination of Gaussian functions, which are akin to the vibrating deflections of structures occurring in most of the desired outputs in this study. In addition to the RBFNNs in the literature, a new modular modification of this approach is proposed here, leading to a network with a significantly increased number of activation possibilities with Gaussian functions.
The classical RBFNN is shown in Fig. 5, composed of three layers and with radial basis activation functions in
the hidden layer. The inputs for the hidden layer are determined by the net-function in the form
$$\mathrm{net}_j = \lVert \mathbf{x} - \boldsymbol{\mu}_j \rVert = \sqrt{\sum_{i=1}^{n} \left( x_i - \mu_{ji} \right)^2} \tag{13}$$

Fig. 3. Feed forward neural network (FFNN) with input layer, arbitrary number of hidden layers, and output layer.

Fig. 4. Algebraic system of equation with FFNN.

with $\mu_{ji}$ denoting the centre locations of the activation functions, which lead to captured areas in the functional relationship between activation function and input values. Here, from a variety of radial basis functions [41], the Gaussian function is chosen, which provides a smooth transition between activated and non-activated areas. Hence, the activation function for the hidden neurons is expressed by
$$A_j = e^{-\frac{\mathrm{net}_j^2}{2\sigma^2}} \tag{14}$$
with $\sigma$ as the radius of a captured region for activation, which can also be regarded as a bias in an RBFNN, see [42]. Between the hidden and the output layer, the weighted sum is applied and forwarded to the output neurons, where the identity is used as activation function $A'$.

Fig. 5. Radial basis function neural network (RBFNN) with input layer, one hidden layer, and output layer.

This leads to an output in the form of a linear combination of radial basis functions, expressed by
$$A' = \sum_{j=1}^{m} w_j\, e^{-\frac{\sum_{i=1}^{n} \left( x_i - \mu_{ji} \right)^2}{2\sigma^2}}. \tag{15}$$

This function approximation by means of the RBFNN does not exhibit a weighted sum between the input and hidden layer. For this reason, a matrix multiplication, as shown in Fig. 4 for the FFNN, is not conducted. Instead, the input values for each hidden neuron have to be calculated separately. For the implementation, this means a loop over all hidden neurons for each input value, causing a distinct increase in computing time. One hidden neuron denotes one centre, in which the neuron can be activated. In order to choose appropriate centre locations, the input vectors are divided into a certain number of centres, equal to the number of hidden neurons. All other input vectors are assigned to the centre with the smallest distance. The radius $\sigma$ in Eq. (14) arises from this method and is the same for all centres.
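A small sketch of the classical RBFNN output of Eq. (15), including a simple centre assignment of the kind described above, could look as follows; the centres, radius, and weights are assumed values, not the trained ones:

```python
import numpy as np

def rbfnn_output(x, centres, sigma, w):
    """Classical RBFNN: Gaussian activation per hidden neuron (Eqs. (13)-(15))."""
    net = np.linalg.norm(x - centres, axis=1)       # Eq. (13), one value per centre
    hidden = np.exp(-net**2 / (2.0 * sigma**2))     # Eq. (14)
    return hidden @ w                               # weighted sum with identity output, Eq. (15)

rng = np.random.default_rng(1)
training_inputs = rng.random((200, 4))              # 200 normalised input vectors (t, p, s, v)
m = 10                                              # number of hidden neurons / centres
centres = training_inputs[:: len(training_inputs) // m][:m]   # centres chosen from the inputs
sigma = 0.3                                         # common radius for all centres (assumed)
w = rng.normal(size=m)                              # output weights, normally found by training

x = rng.random(4)
print(rbfnn_output(x, centres, sigma, w))
```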
Even though RBFNNs exhibit advantages in approximating complex functions [28], the areas of activation in the hidden neurons are restricted to the captured regions. The choice of the centres is up to the user, and in the literature an error back propagation is mostly not conducted for the centres $\mu_{ji}$ [43], but only for the weighted sum between the hidden and the output layer. The reason is a numerically cumbersome back tracking of errors. The consequence of this approach is that the centre locations $\mu_{ji}$ as well as the radius $\sigma$ are not treated as parameters to be optimised. That means, on the one hand, RBFNNs can be very beneficial if the choice of centre locations $\mu_{ji}$ leads to the desired output; on the other hand, these locations lead to a very restrictive activation field, which can cause insufficient results. For this reason, a modular RBFNN is proposed in the present study to overcome too small activation fields and to allow larger activation areas. Modular neural networks exhibit independent sub-networks for particular tasks, which contribute their results to the final output [44]. Following this intention, the RBFNN in Fig. 5 is transformed into the modular RBFNN topology shown in Fig. 6. Instead of the $m$ hidden neurons in Fig. 5, $m \cdot n$ hidden neurons are introduced in Fig. 6, each receiving only one input signal instead of $n$ input signals. According to the $n$ input neurons, $n$ modules are also introduced in the network. After activation in the hidden layers, all modules contribute their signals to one output neuron. Following this new network architecture, a modified result of the activated output neuron is obtained as
$$A'_{\mathrm{mod}} = \sum_{j=1}^{m} \sum_{i=1}^{n} w_j\, e^{-\frac{\left( x_i - \mu_{ji} \right)^2}{2\sigma^2}} \tag{16}$$

with a net function in the hidden neurons of
$$\mathrm{net}_j = x_i - \mu_{ji} \tag{17}$$

Fig. 6. Modular radial basis function neural network (RBFNN) with input layer, hidden layer with m · n neurons, and output layer.

Fig. 7. Activation functions for the radial basis function neural network (RBFNN) and its modular extension (MRBFNN).

still leading to a linear combination of radial basis functions, but with a different summation order. However, as indicated in Fig. 6, if centre locations $\mu_{ji}$ are used between the input and hidden neurons, then a weight $w_j$ from Fig. 5 is applied repeatedly. That means, due to the splitting, only $m$ weights are necessary for $m \cdot n$ connections. The concept of shared weights in artificial neural networks is established in the literature and will be discussed in the next section for DCNNs. The extended activation areas provided by Eq. (16) are shown in Fig. 7 for an example with $n = 2$ and $m = 2$. The activation signals in Fig. 7 denote the values of the output neurons in Figs. 5 and 6. The classical approach of Fig. 5 leads to two peaks in the form of Gaussian functions with respect to the two input signals. In the case of the modular variant of the RBFNN, four peaks with Gaussian functions arise, which are interconnected by corridors of exponential functions. This leads to a wide spectrum of activation opportunities.
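The modular output of Eq. (16) differs from the classical case in that each hidden neuron receives a single input component and the m weights w_j are shared over the n modules. A possible sketch, again with assumed centres and weights, is:

```python
import numpy as np

def modular_rbfnn_output(x, centres, sigma, w):
    """Modular RBFNN of Eq. (16): m*n hidden neurons, one input component each,
    with the m weights w_j shared over all n modules."""
    # centres has shape (m, n): centre mu_ji for weight j and input component i
    gauss = np.exp(-(x[None, :] - centres)**2 / (2.0 * sigma**2))   # shape (m, n)
    return np.sum(w[:, None] * gauss)                               # double sum of Eq. (16)

rng = np.random.default_rng(2)
n, m = 2, 2                          # the example of Fig. 7
centres = rng.random((m, n))         # assumed centre locations mu_ji
sigma = 0.25                         # assumed common radius
w = rng.normal(size=m)               # only m weights for m*n connections (shared weights)
x = rng.random(n)
print(modular_rbfnn_output(x, centres, sigma, w))
```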

Fig. 8. Principle of a deep convolutional neural network with input and output data.

However, a method for fixing the centre locations of each hidden neuron still has to be established. Following studies about centre estimation in the literature [43], a clustering algorithm is developed here. The sum of input signals per input neuron is divided by the number of hidden neurons. In this way, one centre can be located for each hidden neuron.

3.3. Deep convolutional neural network (DCNN)

In the DCNN developed in this study, ordered pairs of values are generated by correlating input data, such as time, pressure, stiffness, and wave propagation velocity, to the output data in the form of the plate deflection. This was carried out for the FFNN and RBFNN as well; however, in the case of the DCNN, the complete evolution of all four input variables over the entire loading period is read into the DCNN and correlated to the entire deformation history. This is an important difference between the DCNN and FFNN adopted in the present investigation. Following this approach, the input and output data sets can be regarded as images of loading and deformation evolutions, comparable to figures for pattern recognition, which are also composed of numbers. The authors are aware of the fact that, here, the input values are composed of sequential measurement data. However, in the proposed way, all path-dependent deformations are included in feature maps in the network.
The algorithm of the developed DCNN is shown in Fig. 8. Input data maps from shock tube measurements are read into the network in the form of a matrix of size $m \times n$ by means of ASCII data. This is in accordance with studies for pattern recognition, wherein pictures are converted to numbers to make them understandable for a convolutional neural network. For each experiment, one sample matrix is provided to the network. Following the principle of convolving matrices in a DCNN, different kernels of size $i \times k = p$ are applied to the input matrix, leading to submatrices, see Fig. 8. The kernels are chosen to filter the normalised input data by means of the dot product between input matrix and kernel. According to [45], the weighted sum for a neuron at position $e, f$ in the activation map can be expressed by
$$\mathrm{net}_j = \sum_{\ell=0}^{i-1} \sum_{h=0}^{k-1} X_{e+\ell,\, f+h}\, w_{\ell,h} + b_j \tag{18}$$

Fig. 9. Different filters for deep convolutional neural networks.

with weights $w_{\ell,h}$ in each kernel $j$ of size $i \times k$ and one bias $b_j$ per kernel. The values $X_{e+\ell,\, f+h}$ denote the components of the incoming layer at the positions $e + \ell,\ f + h$. By filtering the signals in this way, activation maps of size
$$\mathrm{size}_a = \left[ (m - i)/s_m + 1,\; (n - k)/s_n + 1 \right] \tag{19}$$
are obtained with strides $s_m$ and $s_n$, which denote the steps the kernel takes in each direction while screening the input data map. An arbitrary number of convolutions can be carried out, as will be shown in the examples in the next section. The scanning is also carried out through the entire depth of a set of activation maps. However, the principle of shared weights in the kernels means that their components remain constant during the entire screening of the input data, which is an essential characteristic of DCNNs. The rectified linear unit (ReLU function) is applied
as activation function during convolution due to positive evaluations in the literature for DCNNs [32]. The function
is expressed by
$$A(\mathrm{net}_j) = \max(0, \mathrm{net}_j) = \begin{cases} 0 & \text{if } \mathrm{net}_j < 0 \\ \mathrm{net}_j & \text{if } \mathrm{net}_j \geq 0. \end{cases} \tag{20}$$
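As an illustration of Eqs. (18)–(20), a single convolution step with ReLU activation can be sketched in plain NumPy as follows; the input map and the edge-detection-like kernel are assumed example values:

```python
import numpy as np

def convolve(inp, kernel, bias=0.0, s_m=1, s_n=1):
    """Valid convolution of Eq. (18) followed by the ReLU activation of Eq. (20)."""
    i, k = kernel.shape
    m, n = inp.shape
    rows = (m - i) // s_m + 1                         # activation-map size, Eq. (19)
    cols = (n - k) // s_n + 1
    out = np.empty((rows, cols))
    for e in range(rows):
        for f in range(cols):
            patch = inp[e * s_m:e * s_m + i, f * s_n:f * s_n + k]
            out[e, f] = np.sum(patch * kernel) + bias  # weighted sum, Eq. (18)
    return np.maximum(out, 0.0)                        # ReLU, Eq. (20)

inp = np.arange(20.0).reshape(5, 4)                    # small example input map
kernel = np.array([[1.0, -1.0],                        # vertical edge-detection-like filter (assumed)
                   [1.0, -1.0]])
print(convolve(inp, kernel))
```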
After activation, additional convolutions can be carried out, as shown in Fig. 9 for one of the presented examples. In Fig. 8, pooling kernels are applied after activation, leading to further reductions of the layer sizes. It depends on the type of pooling, such as maximum or average pooling, whether the components in the pooling layers represent the highest or an average value of a screened part of the previous layer. With an activation map of $q \times r$ components and a pooling kernel of size $w \times z$, the size of the corresponding pooling layer is reduced to
$$\mathrm{size}_p = \left[ (q - w)/s_q + 1,\; (r - z)/s_r + 1 \right] \tag{21}$$
with strides $s_q$ and $s_r$. In a subsequent step, the components of the pooling layers are summarised into one vector by flattening the layers. This is the starting point of a fully-connected layer. Additional hidden layers can be introduced, leading finally to fully-connected layers. The last layer represents the output data, denoting here the plate deflection.
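Correspondingly, a maximum-pooling step with the layer-size reduction of Eq. (21) could be sketched as follows, again with assumed example values; flattening the pooled map starts the fully-connected part:

```python
import numpy as np

def max_pool(act_map, w=2, z=2, s_q=2, s_r=2):
    """Maximum pooling of an activation map of size q x r with a w x z kernel (Eq. (21))."""
    q, r = act_map.shape
    rows = (q - w) // s_q + 1                 # pooling-layer size, Eq. (21)
    cols = (r - z) // s_r + 1
    out = np.empty((rows, cols))
    for a in range(rows):
        for b in range(cols):
            out[a, b] = act_map[a * s_q:a * s_q + w, b * s_r:b * s_r + z].max()
    return out

act_map = np.arange(24.0).reshape(6, 4)       # example activation map
pooled = max_pool(act_map)                    # -> 3 x 2 pooled layer
print(pooled, pooled.flatten())               # flattening starts the fully-connected part
```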
In this part of the DCNN, the propagation function is expressed as
$$\mathrm{net}_j = \sum_{i=1}^{n} X_i\, w_{ij} + b_j \tag{22}$$

with weights $w_{ij}$ for the interconnections between neurons in consecutive hidden and output layers, $b_j$ for the biases, and $n$ as the number of neurons in the previous layer. Here, $j$ denotes the number of a neuron in the fully-connected layer, in contrast to the number of a kernel as in Eq. (18). As activation function in the fully-connected layer, the sigmoid function
$$A_j = \frac{1}{1 + e^{-\mathrm{net}_j}} \tag{23}$$
is chosen. In Fig. 9, the convolution process for one example, described in the results, is shown, wherein several kernels are used, based on the principle of pattern recognition. In order to introduce a systematic procedure of convolving the input maps, filters for edge detection [46], see Fig. 9, are implemented here. The authors are aware of the fact that the present input maps are composed of measurements. However, this method is applied to extract essential characteristics of the input data.
For all three types of networks, the applied cost function uses the root mean square error (RMSE) in the form
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{k=1}^{p} \left( d_m - d_c \right)^2}{p}} \tag{24}$$
with $d_m$, $d_c$, and $p$ denoting the measured deflection, the calculated deflection, and the number of displacement values in one sample, respectively. The resulting RMSE values are shown in Table 1. The entire algorithm is implemented in Python using Keras and TensorFlow functions. The backpropagation method, which is known in the literature for determining weights and biases [47], is used. In order to account for overfitting problems, the dropout regularisation method is applied [48]. Furthermore, the convergence of each of the following examples is investigated with different numbers of epochs. As a stopping criterion, an RMSE between desired and simulated mid-point deflections of at least $1 \cdot 10^{-1}$ is defined.
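Since the networks are implemented in Python with Keras and TensorFlow, a DCNN along the lines of Fig. 8 could be sketched as follows. The layer sizes, kernel counts, input-map dimensions, and the optimiser are assumptions made here for illustration and do not reproduce the architectures of Table 1:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

m, n = 100, 4          # assumed size of one input map (time steps x input variables)
p = 100                # assumed number of deflection values in one output sample

model = models.Sequential([
    layers.Input(shape=(m, n, 1)),                                    # one input map per experiment
    layers.Conv2D(filters=2, kernel_size=(3, 2), activation="relu"),  # convolution + ReLU, Eqs. (18), (20)
    layers.MaxPooling2D(pool_size=(2, 1)),                            # pooling, Eq. (21)
    layers.Flatten(),                                                 # flattening into one vector
    layers.Dense(32, activation="sigmoid"),                           # fully-connected layer, Eqs. (22), (23)
    layers.Dropout(0.2),                                              # dropout regularisation against overfitting
    layers.Dense(p, activation="sigmoid"),                            # normalised deflection history as output
])

# Root mean square error of Eq. (24) as the cost function
def rmse(y_true, y_pred):
    return tf.sqrt(tf.reduce_mean(tf.square(y_true - y_pred)))

model.compile(optimizer="adam", loss=rmse)   # optimiser choice assumed for this sketch
model.summary()
```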

4. Results and discussion


All values in the graphs shown in this section are normalised quantities and are therefore dimensionless. In the experiments with steel plates, the recorded time range is 7 ms, for the aluminium plates 5 ms, see Figs. 1 and 2. In the following graphs, these time periods are expressed by data points to be read into the neural networks.
In Fig. 10, the measured midpoint displacement of a steel plate under impulsive loading in the small shock tube, see Fig. 1, is shown together with three neural network predictions based on the FFNN, RBFNN, and DCNN. In the experiment, helium was used in the HPC to generate a fast shock wave, leading to a peak pressure at the very beginning of the shock wave loading. The FFNN leads to a good prediction up to 120 data points. The RBFNN simulation is nearly identical to the measurement, and the DCNN prediction is not distinguishable from the measured curve. The pressure is also indicated in the diagram. From Table 1 it can be seen that the DCNN leads to the smallest difference between desired and predicted output signal, even with a smaller number of weights and biases and fewer epochs than the FFNN. This effect is due to the shared weights principle. The RBFNN seems to be a good compromise due to an accurate result, see Fig. 10, but with a clearly smaller number of parameters compared to the DCNN. However, the DCNN uses the entire evolution of all input and output signals for training. That means the DCNN accounts for the path-dependency of the deformation process; in other words, the entire input map is correlated to one output map. In the FFNN and RBFNN, only single ordered pairs of values from the input data are mapped to output data.
Another steel plate deformation is presented in Fig. 11. Nitrogen was used in the HPC of the experimental set-up in order to produce a pressure evolution which reaches its maximum during the shock wave loading. Here, the FFNN diverges from the experiment after 80 data points. The DCNN, with a higher number of parameters, see Table 1, again matches the measurement precisely. The RBFNN fails to predict this midpoint deflection. For this reason, the modular form of the RBFNN was applied, leading to the very precise midpoint displacement shown in Fig. 11. Obviously, the wider spectrum of possible activation areas allowed a better prediction of the desired output than in the classical case of the RBFNN. In this example, the modular RBFNN yields a high accuracy in the output neuron compared to the measurement with a comparably small number of parameters.
A set of steel plate deflections and oscillations is shown in Fig. 12. Here, different gases in the HPC, such
as nitrogen and helium, together with different membrane thicknesses between HPC and LPC were used. Hence,
different wave propagation speeds and different burst pressures lead to varied pressure loadings at the specimen. Four

Fig. 10. Measurements and simulations using FFNN, RBFNN, and DCNN for a steel plate.

Fig. 11. Measurements and simulations using FFNN, modular RBFNN, and DCNN for a steel plate.

measurements are connected together and are fed in this order into the three types of neural networks. The FFNN was not able to predict these deflections. With the RBFNN, a curve similar to the measured one is obtained. However, only with the modular RBFNN were the oscillating mid-point deflections simulated more accurately. Both types of RBFNNs lead to similar cost function errors. For the DCNN, three different kernels are necessary to obtain the precise prediction shown in Fig. 12, accompanied by a large number of parameters, see Table 1. Here, the problem of choosing appropriate filters for convolving the input feature maps occurred. Following the previously described strategy of pattern recognition, horizontal and vertical filters are applied, as shown in Fig. 9. Even though this detection principle is normally used for images, it is applied here, since the input maps in the present DCNN are composed of numerical values.
In Fig. 13, four aluminium plate deformations, measured in the large shock tube from Fig. 2, are shown. Here,
the pressure loadings are also varied by using different materials for the membranes between the HPC and LPC.
For small pressures, Hostaphan membranes with nitrogen in the HPC are used, and for large pressures aluminium membranes with helium in the HPC.
The desired deformation does not exhibit oscillations but larger deflections than in the case of the steel plates.
The FFNN and the DCNN are able to calculate these deformations. Again, the DCNN turned out to be the most

Fig. 12. Measurements and simulations using FFNN, RBFNN and DCNN for a steel plate under repeated loads.

Table 1
Kernel, layers, weights/biases to be optimised, and RMS error between measurements and calculations.

Neural network | Kernels | Hidden layers | Centres | Weights and biases | Learning rate | Epochs    | RMSE
FF Fig. 10     | –       | 4             | –       | 360                | 1             | 2 000 000 | 4.36·10⁻²
RBF Fig. 10    | –       | 1             | 60      | 61                 | 0.001         | 100 000   | 2.45·10⁻²
DC Fig. 10     | 2       | 2             | –       | 354                | 1             | 3000      | 3.00·10⁻⁶
FF Fig. 11     | –       | 3             | –       | 168                | 1             | 2 000 000 | 9.13·10⁻³
MRBF Fig. 11   | –       | 1             | 60      | 61                 | 0.001         | 100 000   | 5.13·10⁻³
DC Fig. 11     | 2       | 2             | –       | 354                | 1             | 3000      | 2.88·10⁻⁶
RBF Fig. 12    | –       | 1             | 300     | 301                | 0.01          | 100 000   | 5.59·10⁻²
MRBF Fig. 12   | –       | 1             | 200     | 201                | 0.001         | 100 000   | 5.62·10⁻²
DC Fig. 12     | 12      | 13            | –       | 93 777             | 0.01          | 300 000   | 6.99·10⁻³
FF Fig. 13     | –       | 3             | –       | 556                | 1             | 10 000    | 1.27·10⁻²
DC Fig. 13     | 3       | 5             | –       | 525                | 1             | 20 000    | 2.63·10⁻⁴

precise one, even with a smaller number of weights and biases compared to the FFNN case. The FFNN shows the highest error compared to the measurement, but stays within an acceptable range, see Table 1. The RBFNN fails here to predict these monotonic mid-point deflections. A reason for this can be that the strength of the RBFNN lies in simulating signals composed of Gaussian-like functions, which do not occur in this example.
In order to investigate the prediction capability of the trained networks, a validation is shown in Fig. 14. Here, an additional experiment, similar to the one in Fig. 10, is used, but the pressure and displacements are smaller than in Fig. 10. The pressure evolution shown in Fig. 14 for validation is applied to the already trained networks, leading to predicted mid-point deflections. The networks behave stably with respect to changes in the input parameters, e.g. a varied pressure. Consequently, the predictions by means of the FFNN and the RBFNN change only slightly compared to Fig. 10, and the DCNN calculation stays the same and is therefore not shown again in Fig. 14.

Fig. 13. Measurements and simulations using FFNN, RBFNN, and DCNN for an aluminium plate.

Fig. 14. Measurements and simulations using FFNN, RBFNN, and DCNN for a steel plate with cross-validation.

5. Conclusions
In the present study, three types of artificial neural networks were proposed for an application in structural dynamics. The classical FFNN had its limits in the case of complex oscillating viscoplastic structural deformations under repeated loads. Even for single loads, divergences were visible after a certain number of data points. The RBFNN was able to predict the desired output values in each example with oscillating structural behaviour. However, it was necessary to extend this theory to a modular network, extending the activation range of the hidden neurons and, hence, of the output neuron.
literature. The limits of the RBF approach were shown in the case of monotonic structural deflections. The DCNN
turned out to be the most powerful network type, generating a set of layers by filtering the input feature maps.
However, in the case of RBFNN less parameters were necessary for obtaining very similar results as in the case
of DCNN. An additional advantage of the DCNN is that it uses the entire evolution of input and output values
during training, which is essential for path-dependent deformations. In a validation with additional test-data, the
simulations using all three types of networks behaved stable, if changes in the input data occurred.

Acknowledgement

This research was funded by the Excellence Initiative of the German federal and state governments.

References
[1] M. Shakiba, N. Parson, X.-G. Chen, Modeling the effects of Cu content and deformation variables on the high-temperature flow
behavior of dilute Al-Fe-Si alloys using an artificial neural network, Materials 9 (536) (2016) 1–13.
[2] A.-A. Chojaczyk, A.-P. Teixeira, C. Luìs, J.-B. Cardosa, C.-G. Soares, Review and application of artificial neural networks models in
reliability analysis of steel structures, Struct. Saf. 52 (A) (2015) 78–89.
[3] G.-R. Liu, Y.-G. Xu, Z.-P. Wu, Total solution for structural mechanics problems, Comput. Methods Appl. Mech. Eng. 191 (2001)
989–1012.
[4] Z. Waszczyszyn, L. Ziemiański, Neural networks in mechanics of structures and materials - new results and prospects of applications,
Comput. Struct. 79 (2001) 2261–2276.
[5] Z.-R. Tahir, P. Mandal, Artificial neural networks prediction of buckling load of thin cylindrical shells under axial compression, Eng.
Struct. 152 (2017) 843–855.
[6] J. Cheng, Q.-S. Li, Reliability analysis of structures using artificial neural network based genetic algorithms, Comput. Methods Appl.
Mech. Eng. 197 (2008) 3742–3750.
[7] D. Zhao, D. Ren, K. Zhao, S. Pan, X. Guo, Effect of welding parameters on tensile strength of ultrasonic spot welded joints of
aluminum to steel – By experimentation and artificial neural network, J. Manuf. Process. 30 (2017) 63–74.
[8] J. Mathew, D. Parfitt, K. Wilford, N. Riddle, M. Alamaniotis, A. Chroneos, M.-E. Fitzpatrick, Reactor pressure vessel embrittlement:
Insights from neural network modelling, J. Nucl. Mater. 502 (2018) 311–322.
[9] G. Capuano, J. Rimoli, Smart finite elements: A novel machine learning application, Comput. Methods Appl. Mech. Engrg. 345 (2019)
363–381.
[10] M. Lefik, D.-P. Boso, B.-A. Schrefler, Artificial neural networks in numerical modeling of composites, Comput. Methods Appl. Mech.
Eng. 198 (2009) 1785–1804.
[11] T. Chen, H. Chen, Universal approximation to non-linear operators by neural networks with arbitrary activation functions and its
application to dynamical systems, IEEE Trans. Neural Netw. 6 (4) (1995) 911–917.
[12] A. Ajmani, D. Kamthania, M.-N. Hoda, A comparative study on constructive and non-constructive supervised learning algorithms for
artificial neural networks, in: Proceedings of the 2nd National Conference, Bharati Vidyapeeth’s Institute of Computer Applications
and Management, INDIACom, New Delhi, 2008.
[13] S. Mandal, P.-V. Sivaprasad, S. Venugopal, K.-P.-N. Murthy, Artificial neural network modeling to evaluate and predict the deformation
behavior of stainless steel type AISI 304L during hot torsion, Appl. Soft Comput. 9 (2009) 237–244.
[14] A.-A. Javadi, M. Rezania, Applications of artificial intelligence and data mining techniques in soil modeling, Geomech. Eng. 1 (1)
(2009) 53–74.
[15] M. Shojaeefard, M. Akbari, M. Tahani, F. Farhani, Sensitivity analysis of the artificial neural network outputs in friction stir lap joining
of aluminium to brass, Adv. Mater. Sci. Eng. 2013, ID 574914 (2013) 1–7.
[16] M. Lu, S.-M. AbouRizk, U.-H. Hermann, Sensitivity analysis of neural networks in spool fabrication productivity studies, J. Comput.
Civ. Eng. 15 (4) (2001) 299–308.
[17] R. Kaunda, New artificial neural networks for true triaxial stress state analysis and demonstration of intermediate principal stress effects
on intact rock strength, J. Rock Mech. Geotech. Eng. 6 (2014) 338–347.
[18] A. Zhang, L. Zhang, RBF neural networks for the prediction of building interference effects, Comput. Struct. 82 (2004) 2333–2339.
[19] J. Deng, Structural reliability analysis for implicit performance function using radial basis function network, Int. J. Solids Struct. 43
(2006) 3255–3291.
[20] S. Mahdevari, S. Torabi, Prediction of tunnel convergence using artificial neural networks, Tunn. Undergr. Space Technol. 28 (2012)
218–228.
[21] E. Pinar, K. Paydas, G. Seckin, H. Akilli, B. Sahin, M. Cobaner, S. Kocaman, M. Atakan Atar, Artificial neural network approaches
for prediction of backwater through arched bridge constrictions, Adv. Eng. Softw. 41 (2010) 627–635.
[22] H. Mohammadi, R. Rahmannejad, The estimation of rock mass deformation modulus using regression and artificial neural networks
analysis, Arab. J. Sci. Eng. 35 (1A) (2010) 205–217.
[23] M. Hassanvand, S. Moradi, M. Fattahi, G. Zargar, M. Kamari, Estimation of rock uniaxial compressive strength for an iranian carbonate
oil reservoir: Modeling vs. artificial neural network application, Pet. Res. 3 (2018) 336–345.
[24] A. Fath, F. Madanifar, M. Abbasi, Implementation of multilayer perceptron (MLP) and radial basis function (RBF) neural networks to
predict solution gas-oil ratio of crude oil systems, Petroleum (2018) http://dx.doi.org/10.1016/j.petlm.2018.12.002.
[25] D. Bettebghor, F.-H. Leroy, Overlapping radial basis function interpolants for spectrally accurate approximation of functions of
eigenvalues with application to buckling of composite plates, Comput. Math. Appl. 67 (2014) 1816–1836.
[26] X. Li, C. Gong, L. Gu, W. Gao, Z. Jing, H. Su, A sequential surrogate method for reliability analysis based on radial basis function,
Struct. Saf. 73 (2018) 42–53.
[27] V. Vallabhaneni, D. Maity, Application of radial basis neural network on damage assessment of structures, Procedia Eng. 14 (2011)
3104–3110.
[28] H. Pomares, I. Rojas, M. Awad, O. Valenzuela, An enhanced clustering function approximation technique for a radial basis function
neural network, Math. Comput. Modelling 55 (2012) 286–302.
[29] A. Ulasyar, H. Zad, A. Zohaib, S. Hussain, Adaptive radial basis function neural network based tracking control of van der Pol
oscillator, in: 2017 International Conference on Communication Technologies (ComTech), Rawalpindi, 2017, pp. 111–115.
[30] K. Aydin, O. Kisi, Damage diagnosis in beam-like structures by artificial neural networks, J. Civ. Eng. Manage. 21 (5) (2015) 591–604.
[31] J. Shao, Y. Qian, Three convolutional neural network models for facial expression recognition in the wild, Neurocomputing 355 (2019)
82–92.

[32] R.-T. Wu, R. Jahanshahi, Deep convolutional neural networks for structural dynamic response estimation and system identification, J.
Eng. Mech. 145 (1) (2019) 1–25.
[33] Z. Wang, C. Li, P. Lin, M. Rao, Y. Nie, W. Song, Q. Qiu, Y. Li, P. Yan, J. Strachan, N. Ge, N. McDonald, Q. Wu, M. Hu, H. Wu,
R. Williams, Q. Xia, J. Yang, In situ training of feed-forward and recurrent convolutional memristor networks, Nat. Mach. Intell. 1
(2019) 434–442.
[34] D. Hubel, T. Wiesel, Receptive fields and functional architecture of monkey striate cortex, J. Physiol. 195 (1968) 215–243.
[35] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proc. IEEE Conference on Computer Vision and
Pattern Recognition, 2016, pp. 770–778.
[36] J. Schmidhuber, Deep learning in neural networks: An overview, Neural Netw. 61 (2015) 85–117.
[37] M. Stoffel, Evolution of plastic zones in dynamically loaded plates using different elastic-viscoplastic laws, Int. J. Solids Struct. 41
(2004) 6813–6830.
[38] N. Kiliç, E. Bülent, S. Hartomacioğlu, Determination of penetration depth at high velocity impact using finite element method and
artificial neural network tools, Def. Technol. 11 (2015) 110–122.
[39] V.-K. Ojha, A. Abraham, V. Snášel, Metaheuristic design of feedforward neural networks: A review of two decades of research, Eng.
Appl. Artif. Intell. 60 (2017) 97–116.
[40] M. Stoffel, F. Bamer, B. Markert, Artificial neural networks and intelligent finite elements in non-linear structural mechanics,
Thin-Walled Struct. 131 (2018) 102–106.
[41] Z. Zainuddin, O. Pauline, Function approximation using artificial neural networks, Int. J. Syst. Appl. Eng. Dev. 1 (49) (2007) 173–178.
[42] A. Engelbrecht, Computational Intelligence, An Introduction, John Wiley & Sons, Ltd, 2007.
[43] K. Mehrotra, C.K. Mohan, S. Ranka, Elements of Artificial Neural Networks, MIT Press, 1997.
[44] R. Chandra, A. Gupta, Y.-S. Ong, C.-K. Goh, Evolutionary multi-task learning for modular knowledge representation in neural networks,
Neural Process. Lett. 47 (2018) 993–1009.
[45] M. Nielsen, Neural networks and deep learning, Determination Press, 2015.
[46] M. Jaderberg, A. Vedaldi, A. Zisserman, Speeding up convolutional neural networks with low rank expansions, 2014, CoRR
abs/1405.3866. arXiv:1405.3866.
[47] M. Huk, Backpropagation generalized delta rule for the selective attention sigma-if artificial neural network, Int. J. Appl. Math. Comput.
Sci. 22 (2) (2012) 449–459.
[48] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, R. Salakhutdinov, Dropout: A simple way to prevent neural networks from
overfitting, J. Mach. Learn. Res. 15 (2014) 1929–1958.
