Abstract
Modeling self-gravitating gas flows is essential to answering many fundamental questions in astrophysics. This spans many topics including planet-forming disks, star-forming clouds, galaxy formation, and the development of large-scale structures in the Universe. However, the nonlinear interaction between gravity and fluid dynamics offers a formidable challenge to solving the resulting time-dependent partial differential equations (PDEs) in three dimensions (3D). By leveraging the universal approximation capabilities of a neural network within a mesh-free framework, physics informed neural networks (PINNs) offer a new way of addressing this challenge. We introduce the gravity-informed neural network (GRINN), a PINN-based code, to simulate 3D self-gravitating hydrodynamic systems. Here, we specifically study gravitational instability and wave propagation in an isothermal gas. Our results match a linear analytic solution to within 1% in the linear regime and a conventional grid code solution to within 5% as the disturbance grows into the nonlinear regime. We find that the computation time of the GRINN does not scale with the number of dimensions. This is in contrast to the scaling of the grid-based code for the hydrodynamic and self-gravity calculations as the number of dimensions is increased. Our results show that the GRINN computation time is longer than the grid code in one- and two- dimensional calculations but is an order of magnitude lesser than the grid code in 3D with similar accuracy. Physics-informed neural networks like GRINN thus show promise for advancing our ability to model 3D astrophysical flows.

Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 license. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
1. Introduction
The field of scientific machine learning (SciML), lying at the junction of data-based modeling and physics, has shown substantial potential for solving problems across science and engineering domains [2, 3, 5, 6, 11, 18]. Many phenomena involving space and/or time variations are modeled using partial differential equations (PDEs). Efforts to solve such PDEs have led to the development of scientific computing techniques and numerical approaches like finite element (FE) [9, 23], finite difference (FD) [24], and finite volume (FV) [16] methods. While these traditional solvers are powerful, they often require substantial computational resources or some restrictive assumptions, particularly for three-dimensional, multiscale problems. In particular, their inability to resolve a wide enough range of length or time scales has thwarted the current progress in many areas of astrophysics, including simulations of star and galaxy formation. In recent years, physics-informed neural networks (PINNs) have emerged as an exciting prospect in applying neural networks to solve both ordinary and PDEs [12, 42]. This has provided an alternative cohesive framework for solving forward/inverse problems [8, 42, 52] and surrogate modeling [54] driven by PDEs. PINNs explicitly incorporate the dynamical equations during the training process, thus ensuring that all constraints and conservation laws governing the system are satisfied. This allows the use of PINNs for accurate modeling of various phenomena in fluid dynamics [7, 43], structural mechanics [19, 20, 53], and electromagnetism [17, 39].
The self-gravity of (dark or visible) matter is the essential force that drives the dynamics and development of the cosmic web of structures [46], the formation of galaxies [50] and stars [44], as well as the instabilities within protoplanetary disks [27, 51]. Our goal is to take the first steps toward using PINNs to study the evolution of self-gravitating hydrodynamic systems. Here we apply this framework to the study of waves and instabilities within interstellar molecular gas clouds. These are the sites of current-day star and planet formation. Interstellar molecular gas is typically modeled using a set of gravito-hydrodynamic (GHD) equations. Disturbances that are large enough can become unstable due to self-gravity, resulting in local collapse (Jeans instability) [25]. This leads to the creation of pockets of localized high density within the fluctuating background density. The nonlinear influence of self-gravity leads to a wide dynamic range in the density of the gas. Solving such a system analytically requires imposing various limiting and often drastic assumptions. Alternatively, one can apply numerical methods to more complex equations using approximations like the FD method [24]. However, with these methods, the large dynamic range of variation makes it challenging to retain numerical resolution, particularly in high-density regions. Even if such high-density regions are resolved, the required time step can be drastically reduced due to stability or accuracy criteria. This makes the modeling of long-term evolution very challenging as well as computationally expensive. For example, simulations of galaxy formation are typically limited by the need to model star formation and stellar feedback as sub-grid processes [38]. Furthermore, three-dimensional simulations of star-forming collapse that resolve the inner protostar and disk region are severely constrained. Even the most advanced codes can reach only up to yr past protostar formation [31], whereas observed protostars are in the age range
–106 yr. While still in the early stages of development, PINNs provide an alternate pathway that may eventually address some of these computational challenges.
In this paper, we develop a PINN-based PDE solver for a three-dimensional (3D) GHD system consisting of a set of coupled time-dependent PDEs. PINNs provide a mesh-free framework where the resolution of the GHD simulations is not limited by the finite spacing of a grid [12]. In the FD approach, increasing dimensionality increases memory requirements geometrically, and increased resolution requires smaller time stepping in order to maintain accuracy or stability. Together these dramatically enhance the computation cost for 3D calculations with a high enough resolution to achieve sufficient fidelity. However, PINNs can adapt to varying resolutions in a more flexible way and will not necessarily follow these constraints. Owing to the universal approximation capabilities of neural networks [22], PINNs have the potential to capture the nonlinear interaction between fluid dynamics and gravity with high accuracy. Moreover, the traditional FD-based approach can suffer from accuracy and stability issues when applied to nonlinear systems that contain small-scale structures and discontinuities (e.g. a shock front). PINN-based models have particularly demonstrated their ability in dealing with PDEs whose solutions can develop shocks. [10, 32].
The paper is organized as follows, we discuss the hydrodynamic system of interest in section 2 and introduce the PINN algorithm in section 3. In section 4 we give three case studies demonstrating the application of GRINN and compare the results with linear theory and FD methods. We discuss the limitations and potential extensions in section 5 and summarize our findings in section 6.
2. Hydrodynamic system
We solve the system of isothermal self-gravitating hydrodynamics, which is of interest in the study of interstellar molecular clouds, the sites of current-day star formation. These are dark cold regions of molecular (mostly H2) gas in which the self-gravity makes density perturbations unstable. This can trigger a local collapse resulting in the growth of dense cores within the gas clouds that collapse further to form stars. The initial conditions that trigger the collapse of molecular clouds to form stars have been studied for decades [33, 34, 37, 44, 49]. The simplest case of interest is to investigate the growth of gravitational instability in an isothermal gas while accounting for thermal pressure and gravitational effects. Since the radiative cooling time is shorter than the collapse time in the early stages, the gas is close to isothermal until well into the final collapse that produces a star.
The behavior of the interstellar gas is governed by a set of fundamental hydrodynamic equations. We solve the three-dimensional (3D) hydrodynamic equations including self-gravity to study the gravitational (or Jeans [25]) instability in star-forming molecular clouds. If ρ and v are the gas density and velocity respectively we can express the governing equations as



Equations (1)–(3) are the equations of mass continuity, momentum, and self-gravity (the Poisson equation), respectively. The gravitational field , where φ is the gravitational potential given in equation (3) and G is the gravitational constant. We close the above set of equations with the generalized polytropic relation
. We consider isothermal evolution (γ = 1) of an ideal gas in which
, where T is the temperature, m is the mean mass of molecules, and hereafter we identify
as the isothermal sound speed. This coupled set of equations has a diverse set of solutions depending on the initial and the boundary conditions of the system of interest. In this paper, we apply sinusoidal perturbations to the background steady-state fluid and study the behavior under the influence of self-gravity.
Solving equations (1)–(3) under a linear approximation (appendix , where ρ0 is the uniform background density. For any perturbation with wavelength
the amplitude grows exponentially in time. The overdense region is unstable and goes into a runaway collapse. For
, the system is stable and exhibits local oscillatory behavior as well as wave propagation.
Here we consider both linear and nonlinear initial sinusoidal perturbations. Linear perturbations are small amplitude () waves while nonlinear perturbations are large amplitude (
) disturbances to the uniform background density ρ0. This fluid system has known analytic solutions, letting us probe the effectiveness of the PINN architecture introduced in the next section for solving the GHD equations.
3. PINNs architecture
A PINN is a neural network configured to simulate a dynamical system governed by PDEs. This is done by approximating the PDE's solutions by training the neural network to minimize a loss function (or the residual). Here
has parameters θ (weights and biases of the neurons) and is defined over the space-time domain (d spatial and one temporal dimension) denoted by collocation points
. These collocation points are a set of points sampled from the domain of integration and are given as input to the model. Instead of directly solving the equations, the PINN's architecture solves the PDEs as a loss function optimization problem. The novel aspect of PINNs is the incorporation of a residual network that encodes the PDEs along with the boundary and initial conditions. In particular, the loss function is reinforced with a residual term originating from the governing equations and this acts as a penalty term that quantifies the deviations from the ground truth solution of the PDEs. For the forward problem, we train PINNs by an unsupervised learning approach based solely on the physical equations as well as the boundary/initial conditions without the aid of any labeled data (i.e. no additional observational or simulation data are needed as input).
In this section, we introduce GRINN, a PINN-based model used for simulating the hydrodynamic system in the presence of self-gravity. Consider a parametrized differential equation in a general form

defined on the domain with the boundary
. Here
is the nonlinear differential operator, f(X) is a source term, Λ is a vector of additional parameters of the differential operator,
is a set of boundary/initial conditions related to the problem and g is the boundary function.
Our system of interest involves studying the dynamics of the self-gravitating gas. The governing equations (1)–(3) are coupled to each other where the gas density ρ, velocity
v
, and the gravitational potential φ are dependent variables. We begin with a single neural network and approximate the solutions ρ,
v
, and φ of the PDEs as the outputs of the network,

In this approach, the total loss is defined as the sum of the PDE residual, the boundary, and initial condition residuals at the collocation points X for each of the output variables. The PDE's residuals associated with equations (1)–(3) for a (d + 1)-dimensional system are given as



The MSE corresponding to the residuals computed in equations (6)–(8) for a set of randomly generated collocation points sampled from a uniform distribution in 3+1 dimensions is given as

where Nr
is the total number of collocation points in the space-time domain. The MSE associated with the boundary and initial conditions for density ρθ
, velocity and gravitational potential φθ
at randomly generated collocation points
and
are


Here, Nb
and N0 are the total number of collocation points on the boundary and at the initial time, respectively. The summation over j takes into account the velocity components () in the three spatial dimensions for a (
) dimensional system. The training objective is to minimize the total loss
originating from the residuals of the PDE and the boundary and initial conditions. This is done by optimizing the neural network parameters θ to yield

Once trained, the PINN model can accurately approximate the solutions of the PDEs. The training process (i.e. optimization of the network parameters) works iteratively, where the derivatives of the outputs are taken with respect to the network parameters in order to compute the MSEs (as given in equations (9)–(11)) and adjust θ in each step to minimize the MSE. To ensure the differentiability of the output, which is the solution approximated by the neural network, a smooth activation function is used (e.g. tanh, sigmoid, sine). PINNs exploit the auto-differentiation [4] technique for estimating the derivative of the outputs and eventually obtaining the PDE solutions. Furthermore, numerical methods such as FD show inefficiency when applied to complicated nonlinear functions. Automatic differentiation overcomes these limitations and shows no approximation (truncation) error as well. This enables the network to approximate any PDE along with the given boundary conditions without numerically discretizing the equations, thus not requiring a mesh.
We adopt DeepXDE [30] to design our PINN model GRINN. It is a fully connected neural network with 3 hidden layers having 32 neurons in each layer. Figure 1 illustrates the schematic of the GRINN architecture. Here, the sine activation function is implemented for each neuron. The smoothness and periodicity of the sine activation function enable the network to achieve more stability as well as better training efficiency compared to step functions like a sigmoid [40, 45]. We randomly sample collocation points over the given space-time domain. This acts as an input to the network which computes the residuals in equations (6)–(8) at each of these collocation points. Additionally,
points are sampled for the spatial boundary and at the initial time slice respectively. The parameters θ are initialized using a truncated normal distribution (He-Normal). For the first 2000 epochs, we implement an adaptive stochastic gradient descent-based optimizer (ADAM) with a learning rate of
to pre-train the network. Pre-training the network minimizes the chance of the optimization getting stuck at a local minimum. This is followed by training the network with the second-order quasi-Newton L-BFGS optimizer [29] to converge on the global minima. In this work, we choose the hyperparameters by a trial and error approach which minimizes computational cost without compromising accuracy.
Figure 1. Schematic of the PINN-based GRINN workflow. The output of the fully connected neural network is . These approximate solutions are used along with the PDEs governing the system to obtain the total loss function. During the training process, the parameters θ are optimized iteratively to obtain
. Using this the final solutions of the system are obtained.
Download figure:
Standard image High-resolution image4. Results
In this section, we demonstrate the GRINN approach's effectiveness in solving equations (1)–(3), which govern the gas dynamics in molecular clouds under the influence of gravity and thermal pressure. GRINN estimates the density, velocity, and gravitational fields as functions of space and time over a specified domain and time period. We examine three distinct test problems, in each case comparing the results with solutions obtained using the standard FD method as well as an analytic solution using the linear theory (LT) solution set out in appendix

In section 4.1 we present the evolution of a small-amplitude growing disturbance that is unstable. In section 4.2 we study the effect of a large-amplitude nonlinear disturbance that grows rapidly. Finally, in section 4.3 a small-amplitude stable wave is shown.
4.1. Case 1: growing linear perturbation with
We model the growth of the Jeans instability in a self-gravitating cloud of molecular hydrogen gas filling a cube defined in Cartesian coordinates. GRINN is trained on the governing equations (1)–(3) for initial conditions with a perturbation in density ρ and 1D velocity amplitude v that corresponds to an individual mode under the LT:

where ,
k
is the wavevector, and the mode's growth rate is
, as derived in appendix
Henceforth, we work in a set of units in which . This means that all densities and speeds are normalized to ρ0 and cs
, respectively. The units of time and length are
and
, respectively.
The case 1 initial condition is a linear wave of amplitude with fronts inclined 45∘ in the x − y plane using
. We set the wavelength
(where
in our units). The computational domain spans three of these wavelengths. Periodic boundary conditions (BCs) are enforced in all three directions for ρ and
v
. For example, along the x-direction,

where xm
is the domain length along the x-direction. Since equation (3) is a second-order differential equation, we enforce periodicity for both the gravitational potential φ and its derivative perpendicular to the boundary, so that

The periodic BC ensures that gas leaving the domain through any boundary re-enters from the opposite side.
The training process involves optimizing the model parameters θ by minimizing the total loss in equation (12) for the PDEs equations (1)–(3), initial conditions equation (14), and BCs equations (15)–(16). The minimization yields 3D solutions for the gas density, velocity, and gravitational field as functions of time over the selected interval. The density result for case 1 at time t = 2 is in figure 2 on the left. The lower panel shows the mismatch between the GRINN and FD solutions. Figure 3 is an x − y cross-section through the domain at z = 0.6 and t = 2. For the FD calculations in 3D, we choose the Courant number ν = 0.5 and the number of grid points
. At the final time t = 3, the Jeans length at the maximum density is resolved with 80 grid points in each direction. Arrows are velocity vectors depicting the direction of flow. For case 1 the gas flows into the overdense regions, further increasing the density with time as the unstable mode grows.
Figure 2. Top: GRINN density solutions for a 3D self-gravitating hydrodynamic system (equations (1)–(3)) for different initial conditions (three distinct cases) as discussed in section 4 at t = 2. Bottom: the relative mismatch (i.e. ε given in equation (13)) between the GRINN and standard FD solutions.
Download figure:
Standard image High-resolution imageFigure 3. Cross-sections through the GRINN density solutions for the three test cases in the x − y-plane at z = 0.6 and t = 2. Velocity vectors are overplotted. The maximum length of the velocity vector corresponds to a velocity of (Case 1),
(Case 2), and
(Case 3).
Download figure:
Standard image High-resolution imageFor further insight into case 1, we apply a simpler initial perturbation along the x-axis, setting and
in equation (14). Cuts along the x-direction at y = 0.6 and z = 0.6 are in figure 4. The top-row image shows the gas density found by GRINN versus x and t. The density grows over time due to the Jeans instability. In the panels below, we show cuts through the solution at the three times marked by red lines in the top-row image: t = 0.5, 1.5, and 2.5, comparing with the corresponding FD and LT solutions. For the FD calculations in 1D, we increase the number of grid points to N = 1000 with Courant number ν = 0.5 for better accuracy. Equation (32) indicates the instability should grow with an e-folding time τ = 2.3, which is approximately borne out as the denser regions become more dense with increasing time. The gas flows out of the less-dense regions at increasing speed and into the density peaks. The gas velocity variation shown in the second row from the bottom is 90∘ out of phase with the density so that zero velocity coincides with the maximum density. The GRINN solutions for both ρ and
v
are consistent with the LT and FD solutions with relative mismatch
at all times shown.
Figure 4. The GRINN solution for case 1. The top-row image shows the growth in the gas density over time due to Jeans instability from GRINN. The next row of panels shows the density profiles from the GRINN, FD, and LT methods at the three times indicated by the red lines in the top-row image. The third row shows the mismatch of the GRINN solution with respect to the FD and LT solutions. The fourth and fifth rows of panels give the corresponding velocity profiles and mismatches. The GRINN solution agrees closely with the FD and LT solutions, and the mismatches are smaller than 1%.
Download figure:
Standard image High-resolution image4.2. Case 2: growing nonlinear perturbation with
We next examine how GRINN performs on a large-amplitude, nonlinear initial perturbation, consisting of a wave with amplitude and wavelength
. We orient the wavevector along the x-axis, with
and
in equation (14).
The case 2 results are shown in figure 2 middle column. The upper plot has the density profile from GRINN at t = 2. The amplitude is greater than in case 1 because of the greater initial perturbation, as indicated by the differing color scales across the figure's columns. The corresponding 2D slice showing the flow directions appears in the middle panel of figure 3. As in case 1, the flow is into the density peaks. The GRINN and FD results agree to within for t < 2. However, the mismatch grows along with the nonlinearity at later times. Figure 5 shows 1D cuts through the growing density and velocity at
. For the FD calculations in 1D, we adjust the Courant number to ν = 0.6 and increase the number of grid points to N = 2000 for improved accuracy. Since the LT is invalid at these amplitudes, we compare our results only with the FD solutions. We note that nonlinear effects shift the maximum infall velocities toward the density peaks, leading to a slight asymmetry in the velocity profile that is evident in both the GRINN and FD solutions.
Figure 5. Same as figure 4 but for case 2, with larger initial density perturbation of amplitude .
Download figure:
Standard image High-resolution image4.3. Case 3: propagating linear perturbation with
To explore GRINN's performance on a propagating disturbance, we take as initial condition a linear solution for the density ρ and velocity
v
using equation (14) once again. We let and
, leading to
by equation (31) in appendix
, the LT indicates wave propagation with no growth, but at a propagation speed less than the isothermal sound speed owing to self-gravity (appendix
and
. The top panel of figure 6 depicts the time evolution of the gas density till one wave crossing time (i.e. t = 8). Additionally, we show snapshots of gas density and velocity profile at three epochs in the lower panels for a cross-section taken at
. The FD and LT solutions are overplotted for comparison. The mismatch ε is less than 0.5% and 1.0% in density and velocity, respectively, at all three times. The FD solution deviates from the LT and GRINN solutions from about time t = 7 onward, in the sense that the amplitude decreases slowly due to numerical diffusion. We adjust the Courant condition to ν = 0.2 and increase the number of grid points to N = 8000 to enhance numerical accuracy. The GRINN solution does not exhibit numerical diffusion and continues to match the linear solution throughout the time period covered.
Figure 6. Same as figure 4 for case 3. The linear perturbation amplitude but wavelength
, such that the solution consists of sound waves propagating stably but slower than usual due to the extra force of self-gravity.
Download figure:
Standard image High-resolution image5. Discussion
Physics-informed neural networks like GRINN open promising new directions for solving 3D astrophysical gas flow problems accurately in a time-efficient way. PINNs work by approximating functions that are global solutions to the target PDEs, an approach quite different from the local linear or quadratic approximations used in FD methods. PINN-based models can be built on top of publicly-available neural network modules like TensorFlow [1] and PyTorch [41], making them easy to develop and maintain. Since these public modules are designed to run on GPUs, PINN-based solvers can be implemented on GPU clusters with little effort, easing further enhancements in efficiency. With the aid of more sophisticated architectures such as XPINNs [13] one can run models like GRINN on multiple GPUs, thus making it highly scalable for problems having a large computational domain.
Gravity in many astrophysical systems concentrates mass into localized regions, with corresponding steep density gradients. For traditional FD solvers, resolving such gradients requires densely-spaced grid points. The high grid resolution furthermore means small time steps are needed to maintain numerical accuracy and stability. This consequently increases the computational expense. In contrast, PINNs are mesh-free and capable of resolving wide variations in the computational quantities (e.g. high-density regions), and capture nonlinearities (resulting in steepening gradients) with only gradual increases in computational cost. This gives the PINN-based PDE solvers an edge over traditional methods for modeling self-gravitating systems.
In the following subsections, we discuss some of the pros and cons of PINNs for astrophysical applications, comparing computational costs between the GRINN and FD codes in section 5.1, examining performance in the nonlinear regime where shocks develop in section 5.2, exploring extrapolation of the solution beyond the time domain over which the network is trained in section 5.3, and discussing some practical issues with the implementation of PINN solvers in section 5.4.
5.1. Computational cost comparison between GRINN and Finite Difference method
Here we compare the computational cost of the PINN-based hydrodynamic solver against the standard FD code. We apply both methods to the small and growing perturbations similar to case 1, tracking the wall clock runtime T, for an integration time t = 7. In the FD calculations, we use N = 300 grid points along each dimension. We run GRINN on an NVIDIA A100 GPU and the FD calculations on a single core of an INTEL Xeon E7 processor.
Figure 7 shows in the left panel the wall clock time T versus the number d of spatial dimensions over which the hydrodynamic system is solved. Under the FD scheme, T increases steeply with d due to the geometric rise in the number of grid points Nd . By contrast, GRINN, being mesh-free, has a computational cost almost independent of d. The slight increase in T with d in figure 7 is due to the extra operations needed to compute the residual since each added dimension brings an additional equation due to an increase in the components of the velocity (equation (2)). The near-independence of wall clock time on the number of dimensions means that GRINN solves the 3D problem in less than one-tenth the time the FD solver needs. Although the wall clock runtime of the FD code could be reduced by running it on a faster CPU, the key point is that the scaling of the computation time with increasing dimensions is quite different for a PINN versus FD.
Figure 7. Left: scaling of the run wall clock time with the number of spatial dimensions for the GRINN and FD codes. Right: normalized 3D run wall clock times for the two codes vs. the problem time interval t. The GRINN model is run twenty times in order to obtain the mean wall clock time and the error bar captures the standard deviation.
Download figure:
Standard image High-resolution imageGRINN also shows better scaling than the FD code with increasing problem integration time t. In the right panel of figure 7, we compare the normalized wall clock times for the same 3D system considered above. The normalized wall clock time is , where
is the computational time for obtaining the solution at t = 1. This eliminates the bias induced in the absolute computational time due to external factors such as the choice of hardware and code efficiency. For the FD case, the number of time steps
(where dt is the discretized time step) increases linearly with the integration time t, so the computational time scales as
. GRINN however being mesh-free solves for the whole space and time domain at once, making the wall clock time almost independent of the integration time.
5.2. Sound wave propagation and shock formation in the absence of gravity
PINN-based models can solve nonlinear PDEs and capture sharp transitions in the solution such as shocks [10, 32]. We explore the potential of the GRINN architecture to capture the growth of shocks, considering a nongravitating system. We initialize using the linear solutions of equations (28) and (29), but without the gravity terms. We examine both linear and nonlinear initial disturbances, with amplitudes and
, respectively.
The two left columns of figure 8 show the density and velocity profiles of the propagating sound wave with linear amplitude () at the two times t = 0.6 and 0.9. The wave propagates at the speed of sound from left to right. The GRINN solution remains close to LT throughout, with a mismatch
. The mismatches in ρ and
v
between the GRINN and FD solutions grow with time and are up to 0.5% at t = 0.9 due to numerical dissipation in the FD scheme.
Figure 8. Left: GRINN solutions for linear perturbation, . The top panel captures the variation in the gas density (at t = 0.6 and t = 0.9) due to the propagation of the sound wave for a non-self gravitating hydrodynamic system. The GRINN solution (cyan line) is compared with FD methods (black) and linear theory (LT) (dashed red) results. The last row gives the percentage relative mismatch ε (equation (13)) of the PINNs solution with respect to FD and LT solutions. Right: GRINN solutions for nonlinear perturbation,
. The steepening of the wave leads to the development of a shock front. The FD solutions are overplotted for comparison.
Download figure:
Standard image High-resolution imageNext, we initialize the system with a nonlinear perturbation . This leads to nonlinear steepening of the wave towards the formation of a shock front. The two right columns of figure 8 show the GRINN and FD density and velocity solutions along with the relative mismatch. We lower the mismatch by using a deeper network than for the three test cases in section 4, here consisting of 7 hidden layers each with 32 neurons. Since the LT is invalid in this regime, we compare the GRINN solution only with the FD result. Overall, GRINN captures the nonlinear solution with high precision except at the shock front where the mismatch approaches
. We expect an increasing difference between the solutions particularly because the FD scheme contains numerical dissipation that will smear out the shock front. The GRINN on the other hand is solving equations that contain no innate viscosity.
As a practical matter, FD codes employ various shock-capturing algorithms [48] and/or artificial viscosity terms to reliably model a shock front. Similarly, for PINN-based models, shock-capturing techniques such as clustered collocation points [32], adaptive artificial viscosity [10], and additional input data [35] have been shown to improve performance on shocks. This will be a fruitful topic to explore in follow-up work on the details of shock modeling, but is not our main topic of interest in this work.
5.3. Extrapolating solutions in time
In principle, PINNs can extrapolate solutions beyond the time range over which they are trained [26]. This makes it possible to adapt PINNs for various applications of PDE solutions [21, 47, 54]). Extrapolating in time is possible with GRINN as well. We demonstrate this by obtaining a solution over the interval for a set of BCs and ICs. Keeping the parameters θ of the trained network unchanged, we apply the network to make predictions up through a later end time t = 3. We use this procedure for case 1 and case 3 from section 4. The left top and bottom panels in figure 9 show the density solution versus x at
for case 1 and case 3, respectively. We compare the solution with the FD result. The volume-averaged mismatch
remains less than 1% up to t = 3. This shows GRINN can potentially be applied to surrogate modeling [54], uncertainty quantification, and inverse analysis [52].
Figure 9. Top left: extrapolated density solutions (solid colored lines) at four times using GRINN for case 1. Bottom left: the same but for case 3. Dashed colored lines are the corresponding solutions using the FD method. Right: growth of the volume-averaged mismatch with time in the extrapolated density solutions. Error bars indicate the domain-wide standard deviation in ε. The shaded region marks the time up to which GRINN was trained.
Download figure:
Standard image High-resolution image5.4. Limitations and future scope
One of the main limitations of neural network-based architectures is their dependence on the choice of hyperparameters, such as the number of hidden layers, the number of neurons in each layer, and the activation function. The lack of well-established scaling laws for PINNs makes it difficult to know how to choose these hyperparameters in any given situation. Most often one resorts to trial-and-error to select the network parameters for specific systems, monitoring the residual and/or the mismatch with known solutions. Gaining more insight into how each hyperparameter influences the model accuracy and speed will be essential to making PINNs convenient for routine use.
Being a basic deep neural network, GRINN sometimes struggles to converge on an accurate solution as the space and time extent of the domain are increased. Specifically, GRINN performance suffers when trained on the three test cases out to end times t > 7. This becomes evident as the mismatch ε tends to increase with the increase in integration time. Adapting advanced techniques like domain decomposition [13, 36] and evolutional deep neural networks [14] may help resolve this issue, but this is beyond the scope of our present work.
One can further extend the GRINN framework to build a surrogate model. The idea behind surrogate modeling is to mimic the behavior of a given set of numerical simulations. Surrogate models can learn the complex relationships between input and output variables for a wide range of parameters describing/defining a physical system. This is achieved by training the model with simulations/solutions that span over a finite range of parameters as well as different BCs or ICs. This enables fast and accurate prediction of the system behavior for any of the parameters within the trained domain without explicitly solving the equations or training the model each time. PINN-based surrogate models can approximate the behavior of complicated physical systems where the underlying dynamics or the physical laws are governed by PDEs. Leveraging the extrapolation capabilities of GRINN, one can build these surrogate models to emulate the solutions in an effective way [15, 54].
6. Summary and conclusions
We introduced a physics-informed neural network called GRINN for efficiently solving the coupled set of PDEs describing the evolution of self-gravitating flows in one, two, and three spatial dimensions. To our knowledge, this is the first demonstration of a PINN for tracing the growth of gravitational instability. Improved computational speed in modeling such flows could profoundly impact our understanding of star formation, which couples processes operating across a vast range of scales, from the diameter of our Galaxy down to the molecular mean free path that sets the thickness of shocks. Traditional finite difference and finite volume codes have struggled to span this range, even using adaptive meshes. PINNs are mesh-free and offer a fundamentally different approach to solving such PDEs.
We investigated three test cases for accuracy and speed relative to a finite difference code that implements the Lax method. GRINN solved for the evolution of self-gravitating, small-amplitude perturbations about as accurately as the FD code when comparing both against the linear analytic solution. This was true in the long-wavelength regime, where the perturbations are stationary and unstable with the overdensities collapsing under their own gravity. Further, this holds true in the short-wavelength regime as well, where the perturbations are stable and propagating but travel slower than the regular sound speed owing to their self-gravity. Finally, long-wavelength unstable perturbations growing into the nonlinear regime and steepening into shocks were evolved similarly in the GRINN and FD codes.
For the purpose of benchmarking the GRINN code, we ran the test cases in various dimensions. The wavevector was aligned with one of the grid axes in some calculations and inclined to the axes in others. GRINN was slower than the FD method in 1D and 2D, owing to the overhead cost of optimizing the network parameters during the training process. This yielded a solution compatible with the PDEs while satisfying the initial and BCs. However, the number of operations required for training was almost independent of the number of dimensions involved, in contrast to FD methods where the calculation time increases geometrically with the number of dimensions. For the 3D calculations, GRINN obtained the final solution more than an order of magnitude faster than the FD code (figure 7).
Our overall result is that the GRINN proved as accurate as the FD method on this set of self-gravitating flow test problems, and significantly faster than the FD method for the subset where the flow was three dimensional. These are largely due to the GRINNs superior scaling as the dimensionality increases and the lack of time step restrictions for stability. We conclude that the GRINN and other physics-informed neural networks hold the potential to substantially increase the scientific community's capability to model the most complex astrophysical flows.
The source code for GRINN is available on the GitHub software repository at https://github.com/sauddy/GRINN.
Acknowledgments
S A is supported by the NASA Postdoctoral Program (NPP). S B is supported by a Discovery Grant from the Natural Sciences and Engineering Research Council of Canada. This work utilized computing resources provided by the Digital Research Alliance of Canada. A portion of the research was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under Contract 80NM0018D0004 with the National Aeronautics and Space Administration.
Data availability statement
No new data were created or analysed in this study.
Appendix A: Linear theory
We linearize the self-gravitating isothermal hydrodynamic equations (1)–(3) along lines laid out over a century ago by James Jeans [25] over the spatial coordinate x, giving the background values subscript 0 and the small perturbations subscript 1:

where v is the velocity along the x-direction. We set v0 = 0, such that there is no background velocity. We furthermore employ the 'Jeans swindle' that the background values g0 and ρ0 satisfy equation (3), even though we choose uniform values for ρ0 and . This inconsistency amounts to assuming that the background state is held in static equilibrium by some unspecified forces. Dropping terms above the first order gives



These three combine into the modified wave equation

A form can also be derived where ρ1 is eliminated, leaving v1 as the dependent variable. Since equation (21) and its v1 counterpart have constant coefficients, we can decompose ρ1 and v1 into Fourier components of the form

where ω and k are the angular frequency and the wavenumber, respectively. We find the dependence of ω on k by substituting the Fourier components back into equation (21), yielding

This dispersion relation indicates instability and exponential growth if , or

Since the perturbation's wavelength , the above implies a special scale, the Jeans length,

such that for the disturbance grows exponentially, while for
the disturbance is a propagating wave. Each mode in the physical space is the real part of the complex function, so


If is a real number, then

Similarly for v1, we use equation (18) to get

where . If ω is real, i.e.
, we get propagating waves. Generally, we can write
, where

is the phase speed of the waves. Therefore we can also identify the velocity wave amplitude as

In the limit of no gravity, and
as in ordinary sound waves.
If , then the disturbance is unstable with
where

Thus

The solution with the minus sign is the exponentially-growing one, having

Thus the instability's growth time and in the limit
or
, τ approaches the free-fall time
.
Appendix B: Finite difference method
For an equation of the form

the finite-difference approximation using the Lax method in 3D is

We rewrite the momentum equation (2) along the x-direction in the flux conservative form as

Similar expressions are also derived along the y- and z-directions. These three equations along with the continuity equation

are solved using the Lax method. The forward time step is regulated using the Courant condition for stability,

After each time step, we update the gravitational field by solving the Poisson equation using a Fast Fourier transform. We solve the Poisson equation on a finite difference grid using second-order differencing. The modes φk
and ρk
in the wavenumber space are related by

where each component ,
and L is the domain length. This relation satisfies the discrete representation of the Poisson equation.