(2022), Bararnia and Esmaeilpour, On The Application of Physics Informed Neural Networks (PINN) To Solve Boundary Layer Thermal-Fluid Problems
International Communications in Heat and Mass Transfer 132 (2022) 105890, https://doi.org/10.1016/j.icheatmasstransfer.2022.105890
Corresponding author: M. Esmaeilpour ([email protected]). These authors contributed equally to this work.
Keywords: Physics-informed neural networks; Boundary layer flow; Heat transfer; Machine learning; Nonlinear ODEs

Abstract: Deep neural networks are a powerful technique for discovering the hidden physics behind transport phenomena through big-data training. In this study, the application of physics-informed neural networks is extended to solve viscous and thermal boundary layer problems. Three benchmark problems, namely Blasius-Pohlhausen, Falkner-Skan, and natural convection, are selected to investigate the effects of the nonlinearity of the equations and of the unbounded boundary conditions on the adjustment of the network structure's width and depth needed to reach reasonable solutions. TensorFlow is used to build and train the models, and the resulting predictions are compared with those obtained by a finite difference technique with Richardson extrapolation. The results reveal that the Prandtl number in the heat equation is a key factor whose value drastically changes the number of neurons and layers required to achieve the desired solutions. Also, setting the unbounded boundary at a larger distance from the origin demands an adequate number of layers, and correspondingly neurons, to deal with the infinity boundary condition. Finally, the trained models are successfully applied to unseen data to evaluate the boundary layer thicknesses.
1. Introduction

Machine learning is a promising branch of statistical methods with diverse applications in forecasting, pattern/speech recognition, optimization, time-series analysis, and image processing [19,33,41]. A neural network is a machine learning concept that mimics the human brain's biological computing algorithm. The neural network field has gained significant interest in various areas of science. The applications include, but are not limited to, computational materials science and chemistry [2,18], the pharmaceutical industry [9], biomedical engineering [25], forecasting sciences [38], robotics [42], and aerospace and mechanical engineering [23,37]. The motivation behind using neural networks emanates from their ability to provide a universal approximation for continuous functions even with a single hidden layer in their architecture [21], at least for simple functions. The complexity of the functions brings the necessity of adding more layers (i.e., "deep learning") and neurons (i.e., sufficiently wide layers) to the network [11,15]. For instance, in fluid mechanics, the high dimensionality and nonlinearity of the problems present a significant challenge, especially in modeling and extracting fluid features from big data [8]. Therefore, several efforts have been dedicated to employing neural networks in the fluid mechanics and heat transfer field in recent years. It is interesting to note that the application of shallow networks (networks with zero or one hidden layer) in fluid mechanics emerged in the early 1990s [52] to track low-density particles' velocities through analyzing images (PIV), which was further improved by multi-layer networks [20]. In the context of computational fluid dynamics (CFD), the multi-scale spatiotemporal nature of turbulent flow and the stochastic formation of coherent structures drew the attention of scientists to reduced-order models aimed at capturing the dominant physics of a phenomenon of interest. Milano and Koumoutsakos [34] were among the first to explore the capability of a multi-layer network for reconstructing near-wall turbulent flow. Further investigations have explored neural networks' applications in turbulence modeling [32,49,51]. Neural networks have also been used in classification problems such as the identification of different regimes in multiphase flow (bubbly, slug, plug, etc.) [1,54] or the detection of the various modes of vortices forming behind an airfoil [12].

Neural networks have also been employed to obtain the solutions of partial differential equations by reducing the problem to an optimization process rather than numerically solving the equations. Among the early efforts on this topic, one can refer to Dissanayake and Phan-Thien [13] or Lagaris et al. [27]. In the method proposed by Lagaris et al. [27],
a trial solution is assumed to satisfy the ordinary/partial differential equation. The trial solution is written as the sum of two terms, such that the first term is responsible for dealing only with the boundary/initial conditions, while the second term is a neural network that does not contribute to satisfying the initial/boundary conditions: $y_t = A(x) + F(x, N(x, p))$. Here, $x$ indicates the independent variable(s) of the equation, and the parameter $p$ corresponds to the biases and weights of the network, which must be adjusted to minimize the residuals. Although the solution procedure seems straightforward, and it has been shown that even a shallow network is capable of solving the equations, finding the trial solution might be arduous, particularly in time-dependent and multi-coordinate problems with complicated geometries. Recently, researchers at Brown University [46-48] developed a deep neural network method for solving nonlinear partial differential equations (PDEs) without requiring a trial function. The method is based on introducing a general neural network solution that is required to satisfy both the initial/boundary conditions and the corresponding PDE. Therefore, the necessity of decomposing the answer into two terms has been eliminated. The motivation stems from the fact that, contrary to conventional supervised machine learning problems, which require given labeled output data, here, in the context of solving equations, the actual outputs, which are the solutions of the equations, are unknown. Hence, a valid question that can be posed here is how the corresponding loss function can be defined in that scenario. The method is termed a "Physics-Informed Neural Network (PINN)" since the dilemma of defining a proper loss function is resolved by prior knowledge of the underlying physics of the problem, elucidated by the governing equations. In other words, the ultimate result of the output layer is expected to satisfy the governing equations at every node located inside the domain (collocation nodes) in addition to the boundary nodes. Therefore, the loss function is defined based on two terms: 1) the residual resulting from substituting the predicted output of the neural network into the corresponding equations, and 2) the discrepancy between the expected value and the estimated one at the boundaries. The summation of these losses, as a total loss, undergoes a minimization process by optimizing the weights. The details of the method are discussed in the following sections. Since the introduction of the technique, it has been widely employed in different branches of science. For instance, in the domain of CFD and heat transfer, Rad et al. [45] successfully simulated a solidification problem in a two-dimensional domain. Zobeiry and Humfeld [56] solved a heat conduction problem with a convective boundary condition and compared their results with the finite element method (FEM), and Wang et al. [53] reconstructed the velocity and temperature distributions induced by a natural convection mechanism within an enclosure. Moreover, Chen et al. [10] showed the application to inverse problems in the optics field.

The theory of boundary layer flow is a classical problem in the thermal-fluid field, having numerous applications in engineering and industrial processes. For example, determining the boundary layer thickness (thermal and hydraulic), besides the gradients of flow quantities (temperature and velocity), enables engineers to reduce the drag force on airfoils, turbine blades, and ship hulls [6,14,28,35,44], leading to a substantial decrease in fuel consumption and improved efficiency. The celebrated boundary layer theory was first introduced by Ludwig Prandtl [43] in a paper entitled "On the motion of a fluid with very small viscosity" in 1904, in which he presented the mathematical basis of flows at high Reynolds numbers and simplified the two-dimensional Navier-Stokes equations into the boundary layer equations. This paper has been considered the onset of modern fluid mechanics and opened the gates to understanding the physics of fluid motion. Later, Blasius [7] presented a power series solution for the boundary layer equation of flow over a flat plate. Subsequently, the generalization of the Prandtl-Blasius equation to non-zero pressure gradient flow over a fixed and impermeable wedge was introduced by Falkner and Skan [17]. Another example of classical boundary layer problems is the natural convection boundary layer flow over a vertical flat plate, which has numerous industrial applications in the cooling systems of electronic devices and many manufacturing processes. In the natural convection scenario, both viscous and thermal boundary layers develop simultaneously, induced by the velocity and temperature gradients, respectively, in the presence of wall and gravity effects. The primary studies on this topic were done by Schmidt and Beckmann [50] and Ostrach [39], who experimentally and theoretically studied the free convection flow of air subject to the gravitational force about a vertical flat plate. Introducing the boundary layer theory and obtaining a closed-form solution for the governing equations has always been a challenge. The difficulties lie in two aspects: 1) the two-dimensional coupled nonlinear partial differential equations, and 2) the unbounded boundary conditions (a zero gradient of the quantities occurring at the infinity limit). Although the first issue was successfully addressed by introducing similarity variables and converting the PDEs into nonlinear ODEs in one coordinate, the second challenge required special techniques to deal with the unbounded boundary conditions. While numerical methods like the Runge-Kutta method [22] could reasonably estimate the infinity value, this was not the case for exact solutions. Therefore, finding an approximate or analytical solution has been the subject of many studies. Most of these techniques rely on polynomial expansions of the solution, e.g., the Adomian Decomposition Method [5] and the Homotopy Perturbation Method [16]. In these methods, the primary step of the solution procedure starts with selecting the linear part of the equation as an invertible linear operator; a polynomial solution is therefore obtained, and one is required to consider a specific value for infinity, since taking the derivative at infinity is impractical when a function is expanded in polynomial form. On the other hand, a Padé approximation can convert the final polynomial solution into a ratio that enables taking the limit as the independent parameter approaches infinity [3]. Liao [30,31], for the first time, introduced an analytic technique termed the Homotopy Analysis Method, in which an arbitrary linear operator can be employed, allowing the solution to take exponential rather than polynomial form. At the same time, its convergence can be checked by adjusting appropriate auxiliary parameters. As a result, the method automatically recognizes the final infinity value rather than forcefully setting an arbitrary one.

In the context of neural networks, the benchmark problem of Blasius (the viscous flow boundary layer) has been solved [36] by the method proposed by Lagaris [27] (trial function) or by a hybrid approach [4]. It should be noted here that the Blasius equation is a single equation representing only the viscous boundary layer. The main objective of this work is to explore the application of PINN in solving the broader aspects of boundary layer flow equations with unbounded domains, which, to the best of our knowledge, is the first study employing PINN in this field. The PINN technique will be used to solve systems of coupled nonlinear thermal and viscous boundary layer equations derived by similarity solutions for three well-known cases (Blasius-Pohlhausen, Falkner-Skan, and natural convection). In addition, the effect of the width and depth of the proposed neural network on correctly estimating the infinity value will be discussed. Moreover, it will be demonstrated how the learning rate, as one of the hyper-parameters, can result in unreasonable results when it is not set sufficiently low. Unlike the Blasius equation, the other two problems involve non-dimensional parameters whose values highly affect the problem's nonlinearity. Finally, a thorough investigation of the network architecture will be performed to address the challenge of nonlinearity and to find results that are in reasonably good agreement with those obtained by the numerical method. The paper is structured as follows: Section 2 presents the governing equations corresponding to the three cases. Then, Section 3 outlines the essential elements of PINN and the solution procedure, and finally Section 4 details the discussion of the results.
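As a concrete illustration of the trial-solution construction of Lagaris et al. [27] discussed above, consider the stand-in problem $y' = -y$, $y(0) = 1$ (a minimal sketch; the toy ODE, the one-hidden-layer network, and the finite-difference derivative below are illustrative assumptions, not part of the present study):

```python
# Sketch of a Lagaris-style trial solution y_t(x) = A(x) + F(x, N(x, p))
# for dy/dx = -y with y(0) = 1: here A(x) = 1 meets the initial condition
# and F = x * N(x) vanishes at x = 0, so only the ODE residual is trained.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(10, 1)), np.zeros((10, 1))  # p: weights and biases of N
W2 = rng.normal(size=(1, 10))
x = np.linspace(0.0, 2.0, 50).reshape(1, -1)

def trial(x):
    N = W2 @ np.tanh(W1 @ x + b1)   # one-hidden-layer network N(x, p)
    return 1.0 + x * N              # boundary term + network term

def ode_residual(x, h=1e-5):
    dy = (trial(x + h) - trial(x - h)) / (2.0 * h)  # central-difference y_t'
    return dy + trial(x)                            # residual of y' + y = 0

loss = np.mean(ode_residual(x) ** 2)  # minimized over (W1, b1, W2) by an optimizer
```

In contrast, the PINN formulation described in Section 3 trains a single network against both the equation residual and the boundary terms, with no trial function to construct.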
Fig. 1. The schematic of (A) the Blasius-Pohlhausen problem, (B) the Falkner-Skan problem, and (C) natural convective flow over a vertical flat plate. δ(x) and δt(x) indicate the viscous and thermal boundary layers, respectively. Note that the other parameters in the schematics are introduced in the corresponding equations.
2. Governing equations

2.1. Convective flow over a flat plate: Blasius-Pohlhausen equations

Consider the steady, laminar flow of a viscous fluid with constant free-stream velocity, U∞, and temperature, T∞, over a flat plate held at a constant temperature Ts (Fig. 1A). The mass, momentum, and energy equations can be expressed as:

$$ \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = 0 \tag{1} $$

$$ u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y} = \nu\frac{\partial^2 u}{\partial y^2} + g\beta(T - T_\infty) \tag{2} $$

$$ u\frac{\partial T}{\partial x} + v\frac{\partial T}{\partial y} = \alpha\frac{\partial^2 T}{\partial y^2} \tag{3} $$

with the boundary conditions

$$ u(x,0) = 0,\quad v(x,0) = 0,\quad u(x,\infty) = U_\infty \tag{4} $$

$$ T(x,0) = T_s,\quad T(x,\infty) = T_\infty,\quad T(0,y) = T_\infty \tag{5} $$

The first two boundary conditions in Eq. (4) refer to the no-slip velocity condition, while the last boundary condition in Eq. (4) refers to the constant free-stream velocity outside the boundary layer. According to Eqs. (2)-(3), the momentum and energy equations are coupled. However, the buoyancy force in the momentum equation can be neglected if the pressure gradient is perpendicular to the gravitational force. Therefore, in the case of forced convection over a horizontal flat plate, the solution of the momentum equation is decoupled from the energy equation, whereas the energy equation solution remains linked to the momentum solution [24]. Introducing the stream function ψ(x, y) as

$$ u = \frac{\partial \psi(x,y)}{\partial y},\quad v = -\frac{\partial \psi(x,y)}{\partial x} \tag{6} $$

automatically satisfies the continuity Eq. (1). The stream function ψ(x, y) and the similarity variable η are defined as

$$ \psi(x,y) = \sqrt{\nu U_\infty x}\, f(\eta),\qquad \eta(x,y) = y\sqrt{\frac{U_\infty}{\nu x}} \tag{7} $$

together with the dimensionless temperature θ(η) = (T − T∞)/(Ts − T∞). Substituting these relations into Eqs. (2)-(3) reduces the boundary layer equations to the coupled ordinary differential equations

$$ f''' + \frac{1}{2}\, f f'' = 0 \tag{10} $$

$$ \theta'' + \frac{1}{2}\,\mathrm{Pr}\, f\, \theta' = 0 \tag{11} $$

subject to the boundary conditions

$$ f(0) = 0,\quad f'(0) = 0,\quad f'(\infty) = 1 \tag{13} $$

$$ \theta(0) = 1,\quad \theta(\infty) = 0 \tag{14} $$

The governing Eqs. (10)-(11) along with the boundary conditions (13)-(14) define the two-dimensional convective fluid flow over a flat plate, which is equivalent to the classical Blasius-Pohlhausen equations for viscous flow. The velocity in the y-direction and the local friction on the flat plate can be calculated as:

$$ v = \frac{1}{2}\sqrt{\frac{\nu U_\infty}{x}}\left[\eta f'(\eta) - f(\eta)\right] \tag{15} $$

$$ \tau = \mu\left.\frac{\partial u}{\partial y}\right|_{y=0} = \mu\sqrt{\frac{U_\infty^3}{\nu x}}\, f''(0) = -\mu\,\omega(0) \tag{16} $$

where ω is the vorticity generated by the boundary layer flow. From the expression for the temperature, the heat flux into the fluid, qw, the heat transfer coefficient from the wall to the fluid, h, and the local Nusselt number, Nux, can be expressed as:

$$ q_w = -k_f\left.\frac{\partial T}{\partial y}\right|_{y=0} = -k_f (T_s - T_\infty)\sqrt{\frac{U_\infty}{\nu x}}\,\theta'(0) \tag{17} $$
$$ h = \left.\frac{q_w}{T_s - T_\infty}\right|_{y=0} = -k_f\sqrt{\frac{U_\infty}{\nu x}}\,\theta'(0) \tag{18} $$

$$ Nu_x = \frac{h x}{k_f} = -\sqrt{\frac{U_\infty x}{\nu}}\,\theta'(0) = -\sqrt{Re_x}\,\theta'(0) \tag{19} $$

2.2. Convective flow over a wedge: Falkner-Skan equation

Consider the viscous fluid flow with constant and uniform free-stream velocity, U∞, and temperature, T∞, over a wedge. The wall of the wedge has a constant temperature Ts, greater than the free stream's temperature. It is assumed that the flow in the laminar boundary layer is two-dimensional (Fig. 1B). Assuming that the temperature changes due to viscous dissipation are small, the simplified Navier-Stokes and energy equations may be expressed as:

$$ \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = 0 \tag{20} $$

$$ u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y} = U\frac{dU}{dx} + \nu\frac{\partial^2 u}{\partial y^2} \tag{21} $$

$$ u\frac{\partial T}{\partial x} + v\frac{\partial T}{\partial y} = \alpha\frac{\partial^2 T}{\partial y^2} \tag{22} $$

where the free-stream velocity above the wedge varies as U(x) = U∞(x/L)^m, m is the Falkner-Skan power-law parameter, and L is the length of the wedge. Introducing the similarity variable η, the function f(η), and the non-dimensional form of the temperature θ(η) as

$$ \eta(x,y) = \sqrt{\frac{(m+1)U_\infty}{2\nu L^m}}\;\frac{y}{x^{(1-m)/2}} \tag{25} $$

$$ f(\eta) = \sqrt{\frac{(m+1)L^m}{2\nu U_\infty}}\;\frac{\psi}{x^{(1+m)/2}} \tag{26} $$

$$ \theta(\eta) = \frac{T - T_\infty}{T_s - T_\infty} \tag{27} $$

and substituting relations (25)-(27) into the mass, momentum, and energy Eqs. (20)-(22) will result in a set of coupled ordinary differential equations:

$$ f''' + f f'' + \beta\left(1 - f'^2\right) = 0 \tag{28} $$

$$ \theta'' + \mathrm{Pr}\, f\, \theta' = 0 \tag{29} $$

It should be noted that m and β are related through the following equation:

$$ \beta = \frac{2m}{m+1} \tag{30} $$

The corresponding boundary conditions can also be obtained as:

$$ f(0) = 0,\quad f'(0) = 0,\quad f'(\infty) = 1 \tag{31} $$

$$ \theta(0) = 1,\quad \theta(\infty) = 0 \tag{32} $$

The governing Eqs. (28)-(29) along with the boundary conditions (31)-(32) define the two-dimensional convective fluid flow over a wedge, which is equivalent to the classical Falkner-Skan boundary layer problem.

2.3. Free convection boundary layer: flow over a vertical flat plate

Consider the laminar natural convection flow of an incompressible viscous fluid over a vertical flat plate. The flow is parallel to the direction of the buoyancy force, and the flat plate has a constant temperature Ts, which is greater than the temperature of the surrounding fluid, T∞. It is assumed that the flow is two-dimensional (Fig. 1C). The mass, momentum, and energy equations for such a fluid flow can be expressed as:

$$ \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = 0 \tag{33} $$

$$ u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y} = \nu\frac{\partial^2 u}{\partial y^2} + g_x\beta(T - T_\infty) \tag{34} $$

$$ u\frac{\partial T}{\partial x} + v\frac{\partial T}{\partial y} = \alpha\frac{\partial^2 T}{\partial y^2} \tag{35} $$

The boundary conditions for such a flow are given by:

$$ u(x,0) = 0,\quad v(x,0) = 0,\quad u(x,\infty) = 0 \tag{36} $$

$$ T(x,0) = T_s,\quad T(x,\infty) = T_\infty \tag{37} $$

Introducing the similarity variable η, the stream function ψ, and the non-dimensional temperature θ as

$$ \eta(x,y) = \frac{y}{x}\left[\frac{g\beta(T_s - T_\infty)x^3}{4\nu^2}\right]^{1/4} \tag{38} $$

$$ \psi(x,y) = 4\nu f(\eta)\left[\frac{g\beta(T_s - T_\infty)x^3}{4\nu^2}\right]^{1/4} \tag{39} $$

$$ \theta(\eta, \mathrm{Pr}) = \frac{T - T_\infty}{T_s - T_\infty} \tag{40} $$

and substituting Eqs. (38)-(40) into the mass, momentum, and energy Eqs. (33)-(35) will result in a set of coupled ordinary differential equations:

$$ f''' - 2f'^2 + 3 f f'' + \theta = 0 \tag{41} $$

$$ \theta'' + 3\,\mathrm{Pr}\, f\, \theta' = 0 \tag{42} $$

The corresponding boundary conditions can also be obtained from η, f, and θ as:

$$ f(0) = 0,\quad f'(0) = 0,\quad f'(\infty) = 0 \tag{43} $$

$$ \theta(0) = 1,\quad \theta(\infty) = 0 \tag{44} $$

The governing Eqs. (41)-(42) along with the boundary conditions (43)-(44) define the two-dimensional free convection flow over a vertical flat plate, which is another example of a boundary layer flow problem. From the expression for the temperature, the shear stress on the plate and the local Nusselt number can be expressed as [26]:

$$ \tau_w = \mu\left.\frac{\partial u}{\partial y}\right|_{y=0} = \frac{\sqrt{2}\,\mu\nu}{x^2}\left[\frac{g\beta(T_s - T_\infty)x^3}{\nu^2}\right]^{3/4} f''(0) = \frac{\sqrt{2}\,\mu\nu}{x^2}\,Gr_x^{3/4}\, f''(0) \tag{45} $$

$$ Nu_x = \frac{h x}{k_f} = -\frac{1}{\sqrt{2}}\left[\frac{g\beta(T_s - T_\infty)x^3}{\nu^2}\right]^{1/4}\theta'(0) = -\frac{1}{\sqrt{2}}\,Gr_x^{1/4}\,\theta'(0) \tag{46} $$

where Grx is the dimensionless Grashof number.
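For reference, systems like Eqs. (10)-(11) with (13)-(14) can be solved to high accuracy with classical ODE machinery. The sketch below uses a shooting method in SciPy, a stand-in for the finite-difference scheme with Richardson extrapolation used for the comparisons in this study; η∞ = 8.0 and the bracketing interval for f″(0) are example settings:

```python
# Reference solution of the Blasius-Pohlhausen system, Eqs. (10)-(11) with
# boundary conditions (13)-(14), via a classical shooting method.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

ETA_INF = 8.0  # finite stand-in for the unbounded boundary

def blasius_rhs(eta, y):
    # y = [f, f', f'']; Eq. (10): f''' = -0.5 * f * f''
    f, fp, fpp = y
    return [fp, fpp, -0.5 * f * fpp]

def farfield_residual(fpp0):
    # Residual of the far-field condition f'(inf) = 1 for a guessed f''(0).
    sol = solve_ivp(blasius_rhs, [0.0, ETA_INF], [0.0, 0.0, fpp0],
                    rtol=1e-10, atol=1e-10)
    return sol.y[1, -1] - 1.0

fpp0 = brentq(farfield_residual, 0.1, 1.0)   # classical value ~0.332 for this scaling
flow = solve_ivp(blasius_rhs, [0.0, ETA_INF], [0.0, 0.0, fpp0],
                 dense_output=True, rtol=1e-10, atol=1e-10)

def theta_rhs(eta, y, Pr):
    # y = [theta, theta']; Eq. (11): theta'' = -0.5 * Pr * f * theta'
    return [y[1], -0.5 * Pr * flow.sol(eta)[0] * y[1]]

def theta_profile(eta_grid, Pr=1.0):
    # Eq. (11) is linear in theta, so one trial integration with theta'(0) = -1
    # can be rescaled to meet theta(0) = 1 and theta(inf) = 0, Eq. (14).
    trial = solve_ivp(theta_rhs, [0.0, ETA_INF], [1.0, -1.0], args=(Pr,),
                      dense_output=True, rtol=1e-10, atol=1e-10)
    scale = 1.0 / (1.0 - trial.sol(ETA_INF)[0])
    return 1.0 + scale * (trial.sol(eta_grid)[0] - 1.0)

eta_grid = np.linspace(0.0, ETA_INF, 161)
theta_pr1 = theta_profile(eta_grid, Pr=1.0)
print(f"f''(0) = {fpp0:.4f}")
```

The same pattern applies to the Falkner-Skan system (28)-(29) and, with two shooting parameters, to the fully coupled natural convection system (41)-(42).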
Fig. 2. The sketch of the essential elements of a feedforward neural network: (A) discretization of the one-dimensional domain; (B) a feedforward neural network structure, the corresponding computational algorithm, and the important parameters used to solve the ODEs.
3. Physics-informed neural networks (PINN)

The main structure of a neural network consists of a series of parallel, distributed layers, termed input, hidden, and output layers, organized in a row, receiving the input data and producing the output results. We start elucidating the essential steps of applying PINN to solve the aforementioned equations by dividing the one-dimensional domain in the direction η (Fig. 2A) into discrete points (nodes). These nodes represent the independent variable of the equations (input data), varying from zero to η∞. The major difference between PINN and the method proposed by Lagaris et al. [27] is that in the latter, before proceeding to solve the equation, one needs to suggest a trial function that satisfies the boundary conditions plus a second term that only satisfies the equation (trained by the network) with no contribution to the boundary or initial conditions. If a trial function is chosen such that the quantity or its derivative yields zero or one when the independent parameter η approaches infinity (depending on the boundary conditions), then the whole solution is likely to be obtained. Finding these two terms before solving the equations is noticeably challenging, especially in scenarios with complicated boundary conditions. In PINN, on the other hand, there is no such requirement, and the network trains the whole solution by considering penalties to satisfy both the boundary/initial conditions and the equation. Therefore, in PINN, we only need to consider a finite value for infinity. In the following section, we discuss in more detail the critical point that should be considered regarding the reasonable guess made for estimating η∞.

Fig. 2A shows two types of nodes: red colored (collocation nodes) and blue colored (boundary nodes). Note that we are dealing with steady-state equations. Therefore, the only boundary conditions required to close the solutions are Dirichlet and Neumann boundary conditions, shown in Fig. 2B by (D) and (N), respectively. The η-axis is discretized into m equal divisions (ηi, i = 0, 1, 2, ⋯, m) such that ηi = i × Δ and Δ = η∞/m. Unlike the conventional process of node generation in PINN, we do not generate the nodes randomly. There are m + 1 nodes along the η-axis. The first and the last nodes are labeled η0 and ηm and are considered boundary nodes, where the boundary conditions at zero and infinity are applied, while the nodes in between, ηi with i = 1, 2, ⋯, m − 1, are collocation nodes distributed along the η-axis and are responsible for satisfying the ODEs. The leftmost layer (the single input neuron in Fig. 2B) is passive and responsible for holding the raw input data (ηi).

The next step is to build a function that maps the input data xi = ηi to the output data ŷi = [fi(η), θi(η)], which are the solutions of the ODEs. The function ŷi = F(xi, w, b) contains learning parameters termed weights (w) and biases (b), which are required to be trained to result in an acceptable function approximation, such that ŷi (the estimated value at each xi) reasonably predicts the desired value yi (the sample output data). A loss function quantifies the difference between yi and ŷi in a mean squared error format, defined as:

$$ Loss = \frac{1}{m}\sum_{i=0}^{m}\left(y_i - \hat{y}_i\right)^2 \tag{47} $$

The function F is a series of computations performed on all the neurons that construct the network. Before elucidating how the function works, we should take a look at Fig. 2B, which shows a network architecture in which the input (ηi) is fed to the network. The neural network sketched in Fig. 2B depicts the first layer with only one neuron, which stores the input data (samples). After that, there are three hidden layers with an arbitrary number of neurons (only the first two nodes and the last node are shown for the sake of simplicity; the rest are indicated by dots). Finally, the output layer has two nodes. The number of neurons in the output layer corresponds to the number of unknown functions one intends to solve for. The network is constructed from layers (k = 0, 1, 2, ⋯, l), and each layer involves several neurons (j = 1, 2, ⋯, n). The neurons are fully connected via connections (links: black arrows). The role of a link is to transfer information between two neurons. To better understand the essential steps and computations performed by a single neuron, a representative neuron is shown in Fig. 2B (inset). The links from the neurons of the previous layer (k − 1), shown by green arrows, carry the weighted input data. A weight is assigned to each link (wi,j^k), which dampens or amplifies the data coming from the source node (j) before entering the destination node (the representative node (i)). Thereby, the weights determine the importance of the signal sent to the node or, in other words, how much each node contributes to the output result of the representative node. Then, to consider the contribution of all the nodes to the target node (i), the sum of the weighted inputs is calculated, and finally, a bias is added to the resulting sum:

$$ z_i = b_i + \sum_{j=1}^{n} a_j^{k}\, w_{i,j} \tag{48} $$

The bias term (bi) is an extra neuron with a value of one, added to the input and hidden layers only and not to the output layer; its role is to decrease the bias of the estimated values compared to the desired ones. In Eq. (48), zi might be thought of as a conventional regression analysis expressed by f(x) = αx + β, such that α and β are analogous to the weight and bias terms in the neural network computations. The bias neuron in a layer (e.g., k − 1) is linked to all the neurons of the next layer (k) except its bias neuron. The computed value zi then passes through an activation function to be restricted to a specific range. The function used in this study is the hyperbolic tangent activation function:

$$ Output(i) = \tanh(z_i) \tag{49} $$
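The architecture just described (a passive input neuron holding η, fully connected tanh hidden layers, and a linear two-neuron output for f and θ) maps directly onto a few lines of TensorFlow. The sketch below is a minimal illustration, not the authors' code; the width w and depth D are the hyper-parameters examined in Section 4:

```python
# Minimal TensorFlow sketch of the network in Fig. 2B: one input neuron (eta),
# D fully connected tanh hidden layers of width w (Eqs. (48)-(49)), and a
# linear output layer producing the two unknowns [f(eta), theta(eta)].
import tensorflow as tf

def build_pinn(width: int = 20, depth: int = 2) -> tf.keras.Model:
    inputs = tf.keras.Input(shape=(1,))            # passive input layer holding eta
    x = inputs
    for _ in range(depth):                         # D hidden layers of width w
        x = tf.keras.layers.Dense(width, activation="tanh")(x)
    outputs = tf.keras.layers.Dense(2)(x)          # no activation: [f(eta), theta(eta)]
    return tf.keras.Model(inputs, outputs)

model = build_pinn(width=20, depth=2)  # the (w, D) = (20, 2) structure tried first in Section 4
```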
Fig. 3. Results corresponding to Case (I): convective flow over a flat plate. The estimated results obtained by PINN (300, 6) for (A) the flow field (Eq. (10)), and (B) the temperature and temperature gradient distributions for different Pr numbers (Eq. (11)). (C) presents the effect of the learning rate on decreasing the loss function during the training process, and (D) highlights the role of the width of the network in addressing the extended infinity boundary condition.
The information is processed in a forward direction (the connections between neurons do not form a loop), and at each neuron (unit), the computations are performed to obtain the output zi. The procedure continues layer by layer (the forward pass) until the last layer, which does not use any activation function; the value of zi obtained at each node in this layer is directly counted as the estimated solution ŷi.

Since the procedure is initiated with guessed values of the weights and biases, a significant discrepancy between ŷi and yi is expected. The resulting loss function needs to be minimized, which is impossible unless the values of the weights and biases are tuned. Finding the weights and biases that drive the loss function close to zero is called the optimization process. The essence of the optimization process is to determine the gradient of the loss function with respect to every weight and bias of the network, which is accomplished through the chain rule and a backpropagation algorithm. We used the Adaptive Moment (Adam) optimizer to tune the weights and biases such that the loss function decreases. The iterative process (forward-backward cycles, termed iterations) continues until the loss function meets the predetermined criteria of convergence. Once the training process ends, the model can predict unseen samples via a single forward pass.

Note that the expected output yi is not available here. Therefore, how can a loss function be defined? Raissi et al. [46-48] (code available at https://github.com/maziarraissi/PINNs) addressed this question by proposing a loss function based on the ODEs. The logic behind this definition is that the output data ŷi = [fi(η), θi(η)] are supposed to satisfy the ODEs. Therefore, the loss function is defined based on the concept of how well ŷi is likely to satisfy the ODEs. Unlike the method proposed by Lagaris et al. [27], the estimated network output here is responsible for both satisfying the ODEs and meeting the corresponding boundary conditions. Accordingly, the total loss function for solving an ODE is divided into: 1) the loss function attributed to the boundary conditions (Neumann and Dirichlet), and 2) the loss function related to satisfying the ODE itself. In Fig. 2B, four loss functions are shown. The first two represent the loss functions related to the two coupled ODEs (the flow and thermal equations), Rf and Rθ. The loss functions attributed to the Dirichlet and Neumann boundaries are expressed by RD and RN, respectively. The subscripts f and θ indicate whether the corresponding boundary belongs to the thermal or the flow equation. Below, we derive these loss functions and the overall loss function for the case studies that subsequently undergo an optimization process.

I. Blasius-Pohlhausen Flow:

$$ R_f = \frac{1}{m-1}\sum_{i=1}^{m-1}\left(f'''_i + \frac{1}{2}\, f_i f''_i\right)^2 \tag{50} $$
$$ R_\theta = \frac{1}{m-1}\sum_{i=1}^{m-1}\left(\theta''_i + \frac{1}{2}\,\mathrm{Pr}\, f_i\, \theta'_i\right)^2 \tag{51} $$

$$ R_{D_f} = (f_0)^2; \qquad R_{N_f} = (f'_0)^2 + (f'_m - 1)^2; \qquad R_{D_\theta} = (\theta_0 - 1)^2 + (\theta_m)^2 \tag{52} $$

The total loss for each case is the summation of all these losses:

$$ Loss = R_f + R_\theta + R_{D_f} + R_{N_f} + R_{D_\theta} \tag{53} $$

Therefore, each training point must meet a specific condition depending on whether it is located on the boundaries or inside the domain. Recall that minimizing the total loss function to find the proper weight coefficients requires a forward pass followed by a backpropagation step, together termed an iteration (epoch). The loss function (error) is calculated at each epoch, indicating how well the necessary conditions are met. Also, all the derivatives appearing in the loss function are calculated using a method known as automatic differentiation, available in the TensorFlow package.
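Concretely, the residual and boundary losses of Eqs. (50)-(53) and one optimizer update can be sketched in TensorFlow as follows (a minimal illustration building on the model sketch above, not the authors' implementation; the grid size, Pr, and learning rate are example settings, and the derivatives come from automatic differentiation with tf.GradientTape):

```python
# Sketch of the PINN loss, Eqs. (50)-(53), for the Blasius-Pohlhausen case,
# and a single Adam update over the uniform grid of Section 3.
import tensorflow as tf

Pr, m, eta_inf = 1.0, 160, 8.0
eta = tf.reshape(tf.linspace(0.0, eta_inf, m + 1), (-1, 1))
opt = tf.keras.optimizers.Adam(learning_rate=1e-4)        # lambda <= 1e-4, cf. Fig. 3C

def pinn_loss(model, eta):
    # A persistent tape records the gradient computations themselves,
    # so repeated tape.gradient calls yield the higher derivatives.
    with tf.GradientTape(persistent=True) as tape:
        tape.watch(eta)
        out = model(eta)
        f, theta = out[:, 0:1], out[:, 1:2]
        fp = tape.gradient(f, eta)            # f'
        fpp = tape.gradient(fp, eta)          # f''
        fppp = tape.gradient(fpp, eta)        # f'''
        tp = tape.gradient(theta, eta)        # theta'
        tpp = tape.gradient(tp, eta)          # theta''
    del tape
    r_f = tf.reduce_mean(tf.square(fppp + 0.5 * f * fpp)[1:-1])      # Eq. (50)
    r_t = tf.reduce_mean(tf.square(tpp + 0.5 * Pr * f * tp)[1:-1])   # Eq. (51)
    r_df = tf.square(f[0, 0])                                        # f(0) = 0
    r_nf = tf.square(fp[0, 0]) + tf.square(fp[-1, 0] - 1.0)          # f'(0) = 0, f'(inf) = 1
    r_dt = tf.square(theta[0, 0] - 1.0) + tf.square(theta[-1, 0])    # theta(0) = 1, theta(inf) = 0
    return r_f + r_t + r_df + r_nf + r_dt                            # Eq. (53)

def train_step(model):
    with tf.GradientTape() as tape:
        loss = pinn_loss(model, eta)
    grads = tape.gradient(loss, model.trainable_variables)
    opt.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```

Training then reduces to repeating train_step until the change in loss between successive epochs falls below the convergence threshold.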
The optimizer updates the weights and biases during the training to decrease the overall loss function at each epoch. As a result, a reliable network shows a decreasing loss value per epoch. The training continues until the convergence criterion is achieved (the difference in the loss value between two successive epochs falls below a threshold, e.g., 10⁻⁵).

4. Results and discussion

Table 1. L2-error between the estimated and the numerical solution of f(η) and f′(η) for different numbers of hidden layers and different numbers of neurons per layer (20, 50, 100, 150). The value of infinity and the mesh size are η∞ = 5.0 and Δ = 0.05, respectively.

With the (20, 2) structure and η∞ = 5.0, the estimated solutions follow the correct trends (converged solution) but with low accuracy, indicating that the issue is not related to the network structure. Instead, we observe that placing the infinity boundary at η∞ = 5.0 is not a proper guess, since the numerical solution only approaches one and zero for f′ and θ, respectively, when the independent variable η exceeds 5.0. This signifies that the arbitrary value of η∞ for boundary layer problems should be selected carefully to ensure that it lies far beyond the boundary layer thickness, specifically when dealing with equations with different boundary layer thicknesses (discussed later).

As a second attempt toward higher accuracy, the position of the guessed value is extended to η∞ = 8.0. However, it is found that the solution does not converge and the loss function fluctuates around 10⁻² (Fig. 3D), declaring that the ODE is not satisfied by this network structure (20, 2) and resulting in an incorrect result (Fig. 3A, dashed lines). The predominant inference is that the network structure requires a modification with regard to depth and width when the domain's boundary is extended and the learning rate parameter is sufficiently low. In this regard, one of the crucial questions, for which no conclusive deduction has been proposed yet, is which of these approaches is more beneficial: increasing the depth or widening the layers [40]. To better understand these effects, the network structure is limited to a fixed number of layers, D = 2, while the number of neurons is gradually increased from w = 20 to 120. It is found that increasing the width to 30 neurons and above gives rise to a converged solution when the depth is fixed at D = 2. Hence, increasing the width indeed resolves the problem mentioned above. However, it might be thought that using more than 30 units would, in this case, always result in a converged solution even if the number of layers is increased. But this is not always the case. In other words, the question that can be posed here is whether
increasing the number of layers should accordingly be followed by increasing the width of the network, or whether one can increase the number of layers while the width is fixed.

Table 2. L2-error between the estimated and the numerical solutions of f(η) and f′(η) for different numbers of hidden layers and different ranges of neurons per layer. The value of infinity and the mesh size are η∞ = 8.0 and Δ = 0.05, respectively. Note that (−) indicates a range of examined neurons in each column, and the tabulated data correspond to the number of neurons written in parentheses. There are two exceptional sets, (30, 4) and (130, 6), such that the latter does not show a converged solution while the former yields acceptable results.

Table 2 provides insight into this question, dictating a minimum width below which the network cannot precisely estimate the results, e.g., w ≈ 80 at D = 4. It is clearly shown that although 80 neurons are sufficient for the two- and four-layer structures, increasing the number of layers to six requires adding more neurons (w > 100, e.g., 110 neurons) to satisfy the ODE (Eq. (10)). Finding a relation between the number of neurons and layers to obtain a reliable model is beyond the focus of this study. However, we should mention that the provided conclusion is not general and is limited to this problem (Blasius flow). In Table 2, the highlighted yellow cells show the wrong solutions, and (−) specifies a range of neurons examined at each layer. The data in each cell are attributed to the number of neurons written in the parentheses (bold number). It is found that w = 150 is sufficient for the networks even with eight layers (D = 8), and the results show higher accuracy for D = 6 [L2-errors: 2.13E-4, 1.62E-4, and 1.25E-3 for f(η), f′(η), and f″(η), respectively]. Therefore, (150, 6) is selected, and hereafter we continue the analysis with this structure. Two conclusions might be drawn from Table 2: 1) at a fixed number of layers, adding neurons is likely to resolve the convergence issue, and 2) there seems to be a correlation between the number of layers and neurons such that adding more layers requires considering extra neurons at each layer. Fig. 3D highlights this finding by showing how the loss values are a function of the number of neurons for D = 6. The dashed lines in Fig. 3A correspond to the particular sets of neurons and layers (w, D) that resulted in wrong estimations, whose loss functions do not fall with increasing epochs and which are not trained correctly (nearly straight lines in Fig. 3D or the yellow cells in Table 2). We also examined the role of the learning rate hyperparameter. Fig. 3C demonstrates that the acceptable limit to obtain accurate results is achieved when λ ≤ 10⁻⁴. As expected, a higher value of λ leads to overshooting, such that the network is unable to find the minimum of the loss function.

It should be emphasized that the results tabulated in Table 2 correspond to the solution of Eq. (10). Given that Eqs. (10)-(11) are coupled ODEs and only one network is responsible for solving both ODEs simultaneously, the structures required to obtain acceptable estimated solutions are likely to be different from those mentioned in Tables 1 and 2. Besides, the Prandtl (Pr) number in Eq. (11) enhances the nonlinearity, and its effect needs to be examined separately to solve the coupled Eqs. (10)-(11) with their corresponding boundary conditions.

Table 3. L2-error between the PINN and the numerical solution of θ(η) for different numbers of neurons versus different Prandtl numbers when the depth is fixed at D = 6. The value of infinity and the mesh size are η∞ = 8.0 and Δ = 0.05, respectively.

At first, the structure is set to (150, 6) because of its demonstrated capability to solve Eq. (10). Table 3 exhibits that this set can accurately solve the thermal equation at Pr = 1. However, the network is not trained well at higher Prandtl numbers (10 and 100), and the equations are not satisfied, since the Pr number enhances the ODE's complexity. Interestingly, adding more neurons does not help the network get trained well in this range of Prandtl numbers unless the width is significantly increased, to w = 300. (Note that we did not examine the values between 250 and 300; the increment value was chosen as 50.) Since w = 300 is sufficient to address the problem arising from the highly nonlinear terms induced by increasing the Pr number, (300, 6) is selected as a suitable structure to solve all the coupled ODEs related to cases (I) and (II) investigated in this study. The results in Fig. 3B depict the temperature and temperature gradient versus η for different Pr numbers. The results agree well with those obtained by the numerical method and clearly show that an increase in the Pr number leads to a decrease in the thermal boundary layer thickness and a sharp temperature gradient at the wall, following the definition of Pr, which is the ratio of the hydrodynamic to the thermal boundary layer thickness. Although the loss function is around ~10⁻⁵, the L2-errors are comparatively large at Pr = 10 and 100.

So far, we have solved the first case study and examined the network structures leading to acceptable results. The second problem in the category of the boundary layer concept is the Falkner-Skan wedge flow (Eqs. (28)-(29)). The ODEs are similar to the Blasius flow; an additional term in Eq. (28) represents the effect of the wedge angle on the flow field and its subsequent consequence on the thermal distribution (Eq. (29)). Unlike the previous case, we set the infinity at η∞ = 6.0, since it is indeed above the boundary layer thickness, as can be seen in Fig. 4.
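The architecture study behind Tables 1-3 amounts to an outer loop over network structures; the following is a minimal sketch of such a sweep (the epoch budget and the reference profile f_ref, e.g., taken from the shooting solver sketched in Section 2, are illustrative assumptions):

```python
# Sketch of the (width, depth) sweep behind Tables 1-2: train one PINN per
# structure and record the L2-error of f against a reference numerical solution.
import numpy as np
import tensorflow as tf

results = {}
for depth in (2, 4, 6, 8):
    for width in (20, 50, 100, 150):
        model = build_pinn(width, depth)              # model builder from the Section 3 sketch
        opt = tf.keras.optimizers.Adam(learning_rate=1e-4)
        for _ in range(20000):                        # illustrative epoch budget
            with tf.GradientTape() as tape:
                loss = pinn_loss(model, eta)          # loss sketch from Section 3
            opt.apply_gradients(zip(tape.gradient(loss, model.trainable_variables),
                                    model.trainable_variables))
        f_hat = model(eta).numpy()[:, 0]
        results[(width, depth)] = np.sqrt(np.mean((f_hat - f_ref) ** 2))  # L2-error vs. f_ref
```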
Fig. 4. Results corresponding to Case (II): Falkner-Skan wedge flow. (A) and (B) are the estimated results obtained by PINN (300, 6) for the flow field. (C) and (D) are the temperature and temperature gradient distributions versus different Pr numbers and wedge angles.
(One can also take η∞ = 8.0.) Interestingly, a (200, 6) network could also satisfy the ODEs with the L2-errors in the acceptable range, e.g., at Pr = 100 and β = 0.25, the corresponding errors are 3.03E-02, 3.48E-02, 7.15E-04, 8.11E-04, and 1.25E-03 for θ, θ′, f, f′, and f″, respectively. The considerable discrepancy stems from the effects of the shorter boundary value (η∞ = 6.0) and the wedge angle (β): when β approaches zero (horizontal flat plate), the ODEs (28)-(29) become similar to the ODEs (10)-(11), and therefore more neurons are required to decrease the loss function and close the solution. Fig. 4 demonstrates the flow and temperature distributions at different wedge angles and Pr numbers. The results show a good match with the numerical data. It is worth pointing out that the above results could also be achieved by employing only two layers instead of six (in both Cases I and II), but the minimum requirement in terms of the number of neurons needs to be met. The minimum requirements for obtaining a reliable solution for the Blasius flow (Pr = 100, β = 0.0) and for the Falkner-Skan flow at (Pr = 100, β = 0.25), with only two layers, are (200, 10), respectively. At this high magnitude of the Prandtl number, the effect of the angle is noticeably significant, and it is not limited to D = 2; the relative reduction is also expected for D = 6. A comparison with Table 3 shows that the required numbers of neurons with six layers for these two cases are (300, 200).

Table 4. L2-error between the estimated and the numerical solution. The first row shows the Falkner-Skan wedge flow related errors, and the other rows show the errors attributed to the natural convection problem. All the values are deduced from a (10, 2) network except the last row (Pr = 5), where (10, 4) was used.

|                    | f        | f′       | f″       | θ        | θ′       |
| β = 0.25, Pr = 100 | 9.76E-04 | 2.52E-04 | 2.07E-03 | 6.58E-02 | 1.61E-01 |
| Pr = 0.5           | 4.53E-03 | 5.13E-03 | 1.75E-03 | 9.68E-04 | 4.86E-04 |
| Pr = 1.0           | 1.70E-03 | 1.93E-03 | 1.61E-03 | 5.50E-04 | 6.81E-04 |
| Pr = 5.0           | 1.99E-03 | 2.85E-03 | 1.63E-03 | 1.00E-03 | 2.00E-03 |

The last problem is related to the natural convection phenomenon (Eqs. (41)-(42)). Unlike the previous cases, where the flow changes the temperature distribution, here the flow is induced by a temperature gradient, which results in a fully coupled ODE system. Decreasing the Pr number leads to an increase in the thermal boundary layer thickness, such that at Pr = 0.2 an acceptable guess for the infinity boundary should be approximately beyond η∞ > 10.0. Therefore, at a low Pr number, one must carefully adjust the infinity value. Here, we limit the solution to Pr numbers ranging from 0.5 to 5. Similar to the previous cases, a relatively small network closes the solution: (10, 2), or (10, 4) at Pr = 5 (see Table 4 and Fig. 5).
Fig. 5. Results corresponding to Case (III): natural convection. (A) and (B) are the estimated results obtained by PINN (10, 4) for the flow field. (C) and (D) are the temperature and temperature gradient distributions versus different Pr numbers.
Fig. 6. Cross-validation tests of the PINN model for the Blasius problem. (A) K-fold cross-validation results at Pr = 1 and (B) predicted results obtained by the hold-out technique at Pr = 10.
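The two validation strategies referenced in Fig. 6 and Table 5 can be sketched as follows (a minimal illustration; the five folds follow the CV-1 through CV-5 columns of Table 5, while the split logic and the reference profile f_ref are assumptions): the interior nodes are partitioned, the PINN is trained with the loss evaluated only at the retained nodes, and the L2-error of the predictions at the held-out nodes is reported.

```python
# Sketch of K-fold cross-validation over the collocation nodes (cf. Fig. 6A):
# hold out one fold of interior nodes, train on the rest, then evaluate the
# trained model at the held-out nodes against a reference numerical solution.
import numpy as np
import tensorflow as tf

interior = np.arange(1, m)                                    # collocation node indices
folds = np.array_split(np.random.default_rng(0).permutation(interior), 5)

for k, test_idx in enumerate(folds):
    train_idx = np.sort(np.concatenate(([0], np.setdiff1d(interior, test_idx), [m])))
    eta_train = tf.gather(eta, train_idx)                     # boundary nodes always retained
    model = build_pinn(width=300, depth=6)                    # structure used for cases I-II
    # ... training loop as before, with pinn_loss(model, eta_train) ...
    f_hat = model(tf.gather(eta, test_idx)).numpy()[:, 0]
    err_f = np.sqrt(np.mean((f_hat - f_ref[test_idx]) ** 2))  # held-out L2-error, cf. Table 5
    print(f"CV-{k + 1}: f error = {err_f:.2e}")
```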
Table 5. L2-errors corresponding to the cross-validation (K-fold) and hold-out methods. The K-fold method is applied to the Blasius problem at Pr = 1 and the hold-out method at Pr = 10.

|   | CV-1     | CV-2     | CV-3     | CV-4     | CV-5     | Hold-out method |
| f | 1.37E-03 | 3.76E-04 | 6.84E-04 | 2.67E-03 | 7.24E-04 | 1.57E-03        |
| θ | 8.90E-03 | 4.03E-03 | 6.83E-03 | 2.93E-02 | 1.62E-03 | 1.52E-01        |

5. Conclusions

In this study, physics-informed neural networks were applied to solve three benchmark viscous and thermal boundary layer problems, and the predictions were compared with numerical solutions. The results revealed that, according to the number of hidden layers that construct a network, there was a minimum requirement in terms of the number of neurons at each layer, below which the model was unable to predict the solutions. Also, increasing the Prandtl number magnified the complexity, such that more neurons were required for the nonlinear data mapping. The possibility of an overfitting issue was also discussed through two cross-validation techniques, and it was found that the test data predicted by PINN matched reasonably well with the numerical results.
References

[14] Z. Du, M.S. Selig, The effect of rotation on the boundary layer of a wind turbine blade, Renew. Energy 20 (2) (2000) 167–181, https://doi.org/10.1016/S0960-1481(99)00109-3.
[15] R. Eldan, O. Shamir, The power of depth for feedforward neural networks, in: 29th Annual Conference on Learning Theory, PMLR, 2016, pp. 907–940.
[16] M. Esmaeilpour, D.D. Ganji, Application of He's homotopy perturbation method to boundary layer flow and convection heat transfer over a flat plate, Phys. Lett. A 372 (1) (2007) 33–38, https://doi.org/10.1016/j.physleta.2007.07.002.
[17] V.M. Falkner, S.W. Skan, Some approximate solutions of the boundary layer equations, Philos. Mag. 12 (1931) 865–896, https://doi.org/10.1080/14786443109461870.
[18] B.G. Goh, N.O. Hodas, A. Vishnu, Deep learning for computational chemistry, J. Comput. Chem. 38 (16) (2017) 1291–1307, https://doi.org/10.1002/jcc.24764.
[19] I. Goodfellow, Y. Bengio, A. Courville, Deep Learning, MIT Press, 2016.
[20] I. Grant, X. Pan, The use of neural techniques in PIV and PTV, Meas. Sci. Technol. 8 (12) (1997) 1399–1405, https://doi.org/10.1088/0957-0233/8/12/004.
[21] K. Hornik, M. Stinchcombe, H. White, Multilayer feedforward networks are universal approximators, Neural Netw. 2 (5) (1989) 359–366, https://doi.org/10.1016/0893-6080(89)90020-8.
[22] L. Howarth, On the solution of the laminar boundary layer equations, Proc. R. Soc. Lond. Ser. A 164 (919) (1938) 547–579, https://doi.org/10.1098/rspa.1938.0037.
[23] D. Izzo, M. Märtens, B. Pan, A survey on artificial intelligence trends in spacecraft guidance dynamics and control, Astrodynamics 3 (4) (2019) 287–299, https://doi.org/10.1007/s42064-018-0053-6.
[24] W.M. Kays, M.E. Crawford, Convective Heat and Mass Transfer, 3rd edition, McGraw-Hill, New York, 1993.
[25] M.F. Kelly, P.A. Parker, R.N. Scott, The application of neural networks to myoelectric signal analysis: a preliminary study, IEEE Trans. Biomed. Eng. 37 (3) (1990) 221–230, https://doi.org/10.1109/10.52324.
[26] H.K. Kuiken, Free convection at low Prandtl number, J. Fluid Mech. 37 (4) (1969) 785–798, https://doi.org/10.1017/S0022112069000887.
[27] I.E. Lagaris, A. Likas, D.I. Fotiadis, Artificial neural networks for solving ordinary and partial differential equations, IEEE Trans. Neural Netw. 9 (5) (1998) 987–1000, https://doi.org/10.1109/72.712178.
[28] Y. Lian, W. Shyy, Laminar-turbulent transition of a low Reynolds number rigid or flexible airfoil, AIAA J. 45 (7) (2007) 1501–1513, https://doi.org/10.2514/1.25812.
[30] S.J. Liao, An approximate solution technique not depending on small parameters: a special example, Int. J. Non-Lin. Mech. 30 (3) (1995) 371–380, https://doi.org/10.1016/0020-7462(94)00054-E.
[31] S. Liao, A. Campo, Analytic solutions of the temperature distribution in Blasius viscous flow problems, J. Fluid Mech. 453 (2002) 411–425, https://doi.org/10.1017/S0022112001007169.
[32] J. Ling, A. Kurzawski, J. Templeton, Reynolds averaged turbulence modelling using deep neural networks with embedded invariance, J. Fluid Mech. 807 (2016) 155–166, https://doi.org/10.1017/jfm.2016.615.
[33] S.X. Lv, L. Peng, L. Wang, Stacked autoencoder with echo-state regression for tourism demand forecasting using search query data, Appl. Soft Comput. 73 (2018) 119–133, https://doi.org/10.1016/j.asoc.2018.08.024.
[34] M. Milano, P. Koumoutsakos, Neural network modeling for near wall turbulent flow, J. Comput. Phys. 182 (1) (2002) 1–26, https://doi.org/10.1006/jcph.2002.7146.
[35] W. Munters, J. Meyers, An optimal control framework for dynamic induction control of wind farms and their interaction with the atmospheric boundary layer, Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 375 (2091) (2017) 20160100, https://doi.org/10.1098/rsta.2016.0100.
[36] H. Mutuk, A neural network study of Blasius equation, Neural Process. Lett. 51 (2020) 2179–2194, https://doi.org/10.1007/s11063-019-10184-9.
[37] A. Oishi, G. Yagawa, Computational mechanics enhanced by deep learning, Comput. Methods Appl. Mech. Eng. 327 (2017) 327–351, https://doi.org/10.1016/j.cma.2017.08.040.
[38] F. Olaiya, A.B. Adeyemo, Application of data mining techniques in weather prediction and climate change studies, Int. J. Inform. Eng. Electr. Bus. 4 (1) (2012) 51–59, https://doi.org/10.5815/ijieeb.2012.01.07.
[39] S. Ostrach, An analysis of laminar free-convection flow and heat transfer about a flat plate parallel to the direction of the generating body force, NACA Rep. 1111 (1953).
[40] G. Pandey, A. Dukkipati, To go deep or wide in learning?, in: Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, PMLR, vol. 33, 2014, pp. 724–732.
[41] L. Peng, S. Liu, R. Liu, L. Wang, Effective long short-term memory with differential evolution algorithm for electricity price prediction, Energy 162 (2018) 1301–1314, https://doi.org/10.1016/j.energy.2018.05.052.
[42] H.A. Pierson, M.S. Gashler, Deep learning in robotics: a review of recent research, Adv. Robot. 31 (16) (2017) 821–835, https://doi.org/10.1080/01691864.2017.1365009.
[43] L. Prandtl, Über Flüssigkeitsbewegung bei sehr kleiner Reibung, Verhandl. III. Internat. Math.-Kong., Heidelberg, Teubner, Leipzig, vol. 2, 1904, pp. 484–491.
[44] S. Raayai-Ardakani, G.H. McKinley, Drag reduction using wrinkled surfaces in high Reynolds number laminar boundary layer flows, Phys. Fluids 29 (9) (2017) 093605. http://hdl.handle.net/1721.1/119866.
[45] M.T. Rad, A. Viardin, G.J. Schmitz, M. Apel, Theory-training deep neural networks for an alloy solidification benchmark problem, Comput. Mater. Sci. 180 (2020) 109687, https://doi.org/10.1016/j.commatsci.2020.109687.
[46] M. Raissi, P. Perdikaris, G.E. Karniadakis, Physics informed deep learning (part I): data-driven solutions of nonlinear partial differential equations, arXiv preprint (2017) arXiv:1711.10561.
[47] M. Raissi, P. Perdikaris, G.E. Karniadakis, Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, J. Comput. Phys. 378 (2019) 686–707, https://doi.org/10.1016/j.jcp.2018.10.045.
[48] M. Raissi, A. Yazdani, G.E. Karniadakis, Hidden fluid mechanics: learning velocity and pressure fields from flow visualizations, Science 367 (6481) (2020) 1026–1030, https://doi.org/10.1126/science.aaw4741.
[49] F. Sarghini, G. De Felice, S. Santini, Neural networks based subgrid scale modeling in large eddy simulations, Comput. Fluids 32 (1) (2003) 97–108, https://doi.org/10.1016/S0045-7930(01)00098-6.
[50] E. Schmidt, W. Beckmann, Das Temperatur- und Geschwindigkeitsfeld vor einer Wärme abgebenden senkrechten Platte bei natürlicher Konvektion, Tech. Mech. Thermodyn. 1 (11) (1930) 391–406, https://doi.org/10.1007/BF02660553.
[51] P.A. Srinivasan, L. Guastoni, H. Azizpour, P. Schlatter, R. Vinuesa, Predictions of turbulent shear flows using deep neural networks, Phys. Rev. Fluids 4 (5) (2019) 054603, https://doi.org/10.1103/PhysRevFluids.4.054603.
[52] C.L. Teo, K.B. Lim, G.S. Hong, M.H.T. Yeo, A neural net approach in analyzing photograph in PIV, in: Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, Charlottesville, VA, USA, vol. 3, 1991, pp. 1535–1538, https://doi.org/10.1109/ICSMC.1991.169906.
[53] T. Wang, Z. Huang, Z. Sun, G. Xi, Reconstruction of natural convection within an enclosure using deep neural network, Int. J. Heat Mass Transf. 164 (2021) 120626, https://doi.org/10.1016/j.ijheatmasstransfer.2020.120626.
[54] T. Xie, S.M. Ghiaasiaan, S. Karrila, Artificial neural network approach for flow regime classification in gas-liquid-fiber flows based on frequency domain analysis of pressure signals, Chem. Eng. Sci. 59 (11) (2004) 2241–2251, https://doi.org/10.1016/j.ces.2004.02.017.
[56] N. Zobeiry, K.D. Humfeld, A physics-informed machine learning approach for solving heat transfer equation in advanced manufacturing and engineering applications, Eng. Appl. Artif. Intell. 101 (2021) 104232, https://doi.org/10.1016/j.engappai.2021.104232.