Physics-Informed Neural Networks for SPDEs
Spatio-temporal data are prevalent in many applications within modern ecology and climate science. Statistical modeling of such data presents a significant challenge and has historically focused on Gaussian random fields (GRFs) and kriging for prediction. In a seminal paper, Lindgren et al. (2011) proposed an inference methodology based on the fact that certain GRFs can be expressed as solutions to stochastic partial differential equations (SPDEs). The most prominent example is the class of spatial Matérn GRFs, which are solutions to the diffusion-type equation:
\begin{equation} (\kappa^2-\Delta)^{\alpha/2} u = \mathcal{W}, \end{equation}
where \(\mathcal{W}\) is a stochastic forcing term (e.g., white noise), and \(\theta = (\kappa, \alpha)\) are the model parameters, which determine the range and smoothness of the associated Matérn covariance function (in dimension \(d\), the Matérn smoothness is \(\nu = \alpha - d/2\)). This approach bridges the gap between physical and statistical modeling, and it has led to a wealth of research refining the equation above to model a broader class of random fields and developing statistical inference procedures to estimate the model parameters. Most of these methods rely on a mesh-based approach, using finite elements or finite volumes to discretize the equation over a finite set of basis functions. In contrast, physics-informed neural networks (PINNs, Raissi et al., 2019) have recently been introduced to solve partial differential equations \(\mathcal{N}[u] = 0\), where \(\mathcal{N}\) is an arbitrary differential operator. The goal is to find the best neural network \(u_\nu\) representing the solution by minimizing its PDE residuals computed at randomly sampled collocation points. This mesh-free approach has proven useful in various contexts and extends naturally to inverse problems, where the objective is to learn the parameters \(\theta\) of the differential operator from observations of the solution.
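To make the residual-minimization idea concrete, the following is a minimal sketch of a deterministic PINN for the operator above, assuming \(\alpha = 2\) so that \(\mathcal{N}[u] = (\kappa^2 - \Delta)u - f\) involves only the standard Laplacian (the fractional case requires additional machinery). It uses PyTorch automatic differentiation; the names `pinn` and `pde_residual` are illustrative, and boundary conditions are omitted.

```python
import torch
import torch.nn as nn

# Minimal fully connected network u_nu: R^2 -> R representing the PDE solution.
pinn = nn.Sequential(
    nn.Linear(2, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, 1),
)

kappa = 1.0  # model parameter (fixed here; learnable in the inverse problem)

def pde_residual(x, f):
    """Residual of (kappa^2 - Delta) u - f at collocation points x of shape (n, 2)."""
    x = x.requires_grad_(True)
    u = pinn(x)                                                          # (n, 1)
    grad_u = torch.autograd.grad(u.sum(), x, create_graph=True)[0]      # (n, 2)
    lap_u = sum(
        torch.autograd.grad(grad_u[:, i].sum(), x, create_graph=True)[0][:, i]
        for i in range(x.shape[1])
    )                                                                    # (n,)
    return kappa**2 * u.squeeze(-1) - lap_u - f

# One training step: sample collocation points and minimize the mean squared residual.
opt = torch.optim.Adam(pinn.parameters(), lr=1e-3)
x_col = torch.rand(256, 2)        # collocation points in [0, 1]^2
f_col = torch.zeros(256)          # deterministic forcing, here f = 0
loss = pde_residual(x_col, f_col).pow(2).mean()
opt.zero_grad()
loss.backward()
opt.step()
```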
This work aims to generalize the PINN approach to solve SPDEs. To do so, several modifications of the deterministic framework are necessary and will need to be explored. First, the neural network must represent a stochastic process, which can be achieved through generative modeling: the network \(u_\nu(Z)\) takes an additional latent variable \(Z \sim \mathbb{Q}\) as input. The quantity of interest then becomes the distribution \(\mathbb{P}_\nu\) of the network's SPDE residuals \(\mathcal{N}[u_\nu](Z)\). This can be interpreted as a generative model, where \(\mathbb{P}_\nu := \mathcal{N}[u_\nu](\cdot) \# \mathbb{Q}\) is the pushforward of the base distribution of \(Z\) through the PINN.
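As a hypothetical illustration of this generative construction (a sketch, not a prescribed implementation), the code below extends the previous network with a latent input \(Z \sim \mathbb{Q} = \mathcal{N}(0, I)\), again assuming \(\alpha = 2\). Each draw of \(Z\) yields one realization of the field, and the collection of residual fields at the collocation points provides samples from the pushforward \(\mathbb{P}_\nu\). The function name `residual_samples` and the latent dimension are arbitrary choices.

```python
import torch
import torch.nn as nn

latent_dim = 16

# Generative network u_nu(x, Z): each draw of Z ~ Q = N(0, I) yields one field realization.
gen_pinn = nn.Sequential(
    nn.Linear(2 + latent_dim, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, 1),
)

def residual_samples(x, n_samples, kappa=1.0):
    """Samples from P_nu = N[u_nu](.) # Q, the pushforward of Q through the SPDE residual.

    Returns a (n_samples, n_points) tensor: one residual field per latent draw,
    with alpha = 2 so that N[u] = (kappa^2 - Delta) u.
    """
    n_points = x.shape[0]
    out = []
    for _ in range(n_samples):
        z = torch.randn(latent_dim).expand(n_points, latent_dim)   # one Z shared by all points
        xz = torch.cat([x, z], dim=1).requires_grad_(True)
        u = gen_pinn(xz)
        grad_u = torch.autograd.grad(u.sum(), xz, create_graph=True)[0][:, :2]
        lap_u = sum(
            torch.autograd.grad(grad_u[:, i].sum(), xz, create_graph=True)[0][:, i]
            for i in range(2)
        )
        out.append(kappa**2 * u.squeeze(-1) - lap_u)
    return torch.stack(out)
```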
Second, it is necessary to define an appropriate loss function that accounts for the stochastic nature of the objects involved. Since SPDEs prescribe equality in distribution, a natural choice is to consider a measure of similarity between the probability distributions \(D(\mathbb{P}_\nu, \mathcal{W})\). Several options are possible, such as the Kullback-Leibler divergence, the \(p\)-Wasserstein distance as in Arjovsky et al. (2017), or the maximum mean discrepancy associated with a reproducing kernel (Gretton et al., 2012), each leading to different learning strategies for the network parameters \(\nu\).
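For instance, with the maximum mean discrepancy, training amounts to a two-sample comparison between residual samples and samples of the target noise. The sketch below uses a Gaussian (RBF) kernel and reuses `residual_samples` from the previous sketch; treating pointwise evaluations of \(\mathcal{W}\) as i.i.d. Gaussians is a simplifying assumption, since white noise is not defined pointwise and would in practice require a discretized or smoothed representation.

```python
import torch

def gaussian_kernel(a, b, bandwidth=1.0):
    """RBF kernel matrix k(a_i, b_j) between sample sets of shape (n, d) and (m, d)."""
    d2 = torch.cdist(a, b).pow(2)
    return torch.exp(-d2 / (2 * bandwidth**2))

def mmd2(samples_p, samples_q, bandwidth=1.0):
    """Biased estimator of the squared maximum mean discrepancy MMD^2(P, Q)."""
    k_pp = gaussian_kernel(samples_p, samples_p, bandwidth)
    k_qq = gaussian_kernel(samples_q, samples_q, bandwidth)
    k_pq = gaussian_kernel(samples_p, samples_q, bandwidth)
    return k_pp.mean() + k_qq.mean() - 2 * k_pq.mean()

# Training loss: discrepancy between the residual distribution P_nu and the target noise,
# both represented by finite samples evaluated at the same collocation points.
x_col = torch.rand(128, 2)                       # collocation points
res = residual_samples(x_col, n_samples=32)      # (32, 128) samples from P_nu
noise = torch.randn(32, 128)                     # i.i.d. Gaussian stand-in for W at x_col
loss = mmd2(res, noise)
loss.backward()                                  # gradients flow to the generator's weights
```

Replacing `mmd2` with an adversarially trained critic network would instead correspond to the Wasserstein approach of Arjovsky et al. (2017), while density-based divergences such as Kullback-Leibler require access to, or an estimate of, the residual density.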
Starting from the Matérn SPDE above, the goal of this project is to explore and implement these different methodologies.