IntroScientificMachineLearning
Short intro to scientific machine learning using physics-informed neural networks. I used PyTorch as a framework.
Some equations are not rendered properly in the .ipynb version within GitHub.
Introduction to Scientific Machine Learning
<br> Scientific machine learning is an approach to solving problems in the domain of scientific computing using neural networks and other machine learning techniques. One primary objective is to use the traits of neural networks to enhance the way scientific models are examined. <a href="https://mitmath.github.io/18337/lecture3/sciml.html">[Introduction to Scientific Machine Learning through Physics-Informed Neural Networks, Chris Rackauckas]</a> By learning nonlinear relationships directly from ground truth data, machine learning methods allow us to overcome the imprecision of an approximate mechanistic model. However, in order to produce precise predictions, conventional machine learning models rely on an extensive amount of training data. Scientific machine learning therefore merges physical models (e.g. differential equations) with classical machine learning techniques (e.g. neural networks) to generate predictions in a more data-efficient manner. For instance, Physics-Informed Neural Networks (PINNs) incorporate differential equations into the loss function in order to integrate prior scientific knowledge. One drawback of PINNs is that the resulting models do not have the interpretability of classical mechanistic models. <br> Mechanistic models are restricted to prior scientific knowledge from the literature, whereas data-driven machine learning methods are more adaptable and do not rely on simplifying assumptions to deduce the underlying mathematical models. As a result, the main goal of scientific machine learning is to combine the benefits of both approaches and alleviate their individual drawbacks. <a href="https://arxiv.org/abs/2001.04385">[Universal Differential Equations for Scientific Machine Learning, Chris Rackauckas]</a><br>Physics-Informed Neural Networks (PINNs)
The following example closely follows Chris Rackauckas's course notes in <a href="https://mitmath.github.io/18337/lecture3/sciml.html">Introduction to Scientific Machine Learning through Physics-Informed Neural Networks</a>. <br> <br>
As mentioned above, PINNs use differential equations in the cost function of a neural network somewhat like a regularizer, or solve a differential equation with a neural network. Consequently, the mathematical equations can steer the training of the neural network in settings where ground truth data might not be available. We want to solve an ordinary differential equation with a given initial condition <img src="https://latex.codecogs.com/svg.image?\inline&space;u(0)&space;=&space;u_0" title="\inline u(0) = u_0" /> and <img src="https://latex.codecogs.com/svg.image?\inline&space;t&space;\in&space;\left[0,1\right]&space;&space;" title="\inline t \in \left[0,1\right] " />: <br><br> <img src="https://latex.codecogs.com/svg.image?u^\prime&space;=&space;f(u,&space;t)&space;\quad&space;\quad&space;(1)&space;&space;" title="u^\prime = f(u, t) \quad \quad (1) " /> <br><br> In an initial step, we approximate the solution with a neural network: <br><br> <img src="https://latex.codecogs.com/svg.image?NN\left(t\right)&space;\approx&space;u\left(t\right)&space;\quad&space;\quad&space;(2)&space;&space;" title="NN\left(t\right) \approx u\left(t\right) \quad \quad (2) " /> <br><br> It follows that <img src="https://latex.codecogs.com/svg.image?\inline&space;NN^\prime(t)&space;=&space;f(NN(t),t)&space;\;&space;\forall&space;\;t&space;" title="\inline NN^\prime(t) = f(NN(t),t) \; \forall \;t " /> if <img src="https://latex.codecogs.com/svg.image?\inline&space;NN(t)&space;" title="\inline NN(t) " /> is the actual solution.
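As a minimal sketch, the approximator NN(t) from Eq. (2) can be realized as a small fully connected network in PyTorch. The architecture below (one hidden tanh layer of width 32) is an illustrative assumption, not a prescribed choice:

```python
import torch
import torch.nn as nn

# A small fully connected network NN(t): it maps a scalar t to an
# approximation of u(t). Layer sizes and the tanh activation are
# illustrative choices.
model = nn.Sequential(
    nn.Linear(1, 32),   # input: scalar t
    nn.Tanh(),
    nn.Linear(32, 1),   # output: scalar NN(t) ~ u(t)
)

t = torch.rand(5, 1)    # five random points in [0, 1]
u_hat = model(t)        # NN(t), shape (5, 1)
```

The network takes inputs of shape (N, 1) so that a whole batch of time points can be evaluated at once.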
Hence, we can express our loss function in the following form: <br><br> <img src="https://latex.codecogs.com/svg.image?L(p)&space;=&space;\sum_i&space;\left(\frac{dNN(t_i)}{dt}&space;-&space;f\left(NN(t_i),&space;t_i\right)\right)^2&space;\quad&space;\quad&space;(3)&space;&space;" title="L(p) = \sum_i \left(\frac{dNN(t_i)}{dt} - f\left(NN(t_i), t_i\right)\right)^2 \quad \quad (3) " /> <br><br> When the loss function is minimized, one obtains <img src="https://latex.codecogs.com/svg.image?\inline&space;\frac{dNN(t_i)}{dt}&space;\approx&space;f\left(NN(t_i),t_i\right)&space;" title="\inline \frac{dNN(t_i)}{dt} \approx f\left(NN(t_i),t_i\right) " />, so that <img src="https://latex.codecogs.com/svg.image?\inline&space;NN(t)" title="\inline NN(t)" /> approximately solves the differential equation. In this study, the <img src="https://latex.codecogs.com/svg.image?\inline&space;t_i" title="\inline t_i" /> values are sampled randomly. Other sampling techniques are available as well, for instance sampling on a regular grid. For advanced problems with a large number of input dimensions, one should use sampling techniques that cover the input space more efficiently (e.g. Latin hypercube sampling). <a href="https://mitmath.github.io/18337/lecture17/global_sensitivity">[Global Sensitivity Analysis, Chris Rackauckas]</a> Up to now, the initial condition of our ordinary differential equation has not been incorporated. A first simple approach would be to include the initial condition in the loss function.
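The loss of Eq. (3) can be sketched in PyTorch using automatic differentiation to obtain dNN(t_i)/dt at randomly sampled collocation points. The names `residual_loss`, `model`, and `f` are placeholders introduced here for illustration:

```python
import torch

# Residual loss of Eq. (3), evaluated at random collocation points
# t_i in [0, 1]. `model` is any module mapping (N, 1) -> (N, 1);
# `f` is the right-hand side of u' = f(u, t).
def residual_loss(model, f, n_points=100):
    t = torch.rand(n_points, 1, requires_grad=True)  # random t_i
    u = model(t)                                     # NN(t_i)
    # dNN/dt via automatic differentiation
    du_dt = torch.autograd.grad(
        u, t, grad_outputs=torch.ones_like(u), create_graph=True)[0]
    return ((du_dt - f(u, t)) ** 2).sum()

# Sanity check: u(t) = t exactly solves u' = 1, so a linear
# "network" with slope 1 and no bias yields zero residual.
lin = torch.nn.Linear(1, 1, bias=False)
with torch.no_grad():
    lin.weight.fill_(1.0)
loss = residual_loss(lin, lambda u, t: torch.ones_like(t))
```

`create_graph=True` is needed so that the derivative itself remains differentiable with respect to the network parameters during training.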
<br><br> <img src="https://latex.codecogs.com/svg.image?L(p)&space;=&space;\left(NN(0)&space;-&space;u_0\right)^2&space;+&space;\sum_i&space;\left(\frac{dNN(t_i)}{dt}-f\left(NN(t_i),t_i\right)\right)^2&space;\quad&space;\quad&space;(4)&space;&space;" title="L(p) = \left(NN(0) - u_0\right)^2 + \sum_i \left(\frac{dNN(t_i)}{dt}-f\left(NN(t_i),t_i\right)\right)^2 \quad \quad (4) " /> <br><br> The downside of this method is that, written in this form, the loss function still encodes a constrained optimization problem. An unconstrained optimization problem is more efficient to encode and easier to handle. Hence, we choose a trial function <img src="https://latex.codecogs.com/svg.image?\inline&space;g(t)&space;" title="\inline g(t) " /> such that it fulfills the initial condition by construction. <a href="https://arxiv.org/abs/physics/9705023">[Artificial Neural Networks for Solving Ordinary and Partial Differential Equations, Isaac E. Lagaris]</a> <br><br> <img src="https://latex.codecogs.com/svg.image?g(t)&space;=&space;u_0&space;+&space;t&space;\cdot&space;NN(t)&space;\quad&space;\quad&space;(5)&space;&space;" title="g(t) = u_0 + t \cdot NN(t) \quad \quad (5) " /> <br><br> Since <img src="https://latex.codecogs.com/svg.image?\inline&space;g(t)&space;" title="\inline g(t) " /> always satisfies the initial condition, one can train the trial function <img src="https://latex.codecogs.com/svg.image?\inline&space;g(t)&space;" title="\inline g(t) " /> so that its derivative matches <img src="https://latex.codecogs.com/svg.image?\inline&space;f\left(g(t_i),t_i\right)" title="\inline f\left(g(t_i),t_i\right)" />.
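A quick sketch of the trial function of Eq. (5): g(0) = u0 holds for any network weights, so the initial condition never has to appear in the loss. `trial` and `model` are illustrative names:

```python
import torch

# Trial function g(t) = u0 + t * NN(t) from Eq. (5). At t = 0 the
# network term vanishes, so g(0) = u0 by construction.
def trial(model, t, u0):
    return u0 + t * model(t)

model = torch.nn.Linear(1, 1)              # stand-in for NN(t)
g0 = trial(model, torch.zeros(1, 1), u0=1.0)
# g(0) equals u0 = 1 no matter how the weights are initialized
```

Because the constraint is built into the function itself, the remaining optimization over the weights is unconstrained.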
<br><br> <img src="https://latex.codecogs.com/svg.image?L(p)&space;=&space;\sum_i&space;\left(\frac{dg(t_i)}{dt}&space;-&space;f\left(g(t_i),&space;t_i\right)\right)^2&space;\quad&space;\quad&space;(6)&space;&space;" title="L(p) = \sum_i \left(\frac{dg(t_i)}{dt} - f\left(g(t_i), t_i\right)\right)^2 \quad \quad (6) " /> <br><br> Accordingly, we have <img src="https://latex.codecogs.com/svg.image?\inline&space;\frac{dg(t_i)}{dt}&space;\approx&space;f\left(g(t_i),&space;t_i\right)&space;&space;" title="\inline \frac{dg(t_i)}{dt} \approx f\left(g(t_i), t_i\right) " /> when the loss is minimized, while our neural network <img src="https://latex.codecogs.com/svg.image?\inline&space;NN(t)&space;&space;" title="\inline NN(t) " /> is embedded in the trial function <img src="https://latex.codecogs.com/svg.image?\inline&space;g(t)&space;" title="\inline g(t) " />. Note that the loss function <img src="https://latex.codecogs.com/svg.image?\inline&space;L(p)&space;&space;" title="\inline L(p) " /> depends on the parameters <img src="https://latex.codecogs.com/svg.image?\inline&space;p&space;&space;" title="\inline p " />, which correspond to the weights and biases of the neural network <img src="https://latex.codecogs.com/svg.image?\inline&space;NN(t,&space;p)&space;&space;" title="\inline NN(t, p) " />. The problem can then be solved with conventional gradient-based optimization methods, which find the weights that minimize the loss function.<br> <br> In the next step, we will look at a specific ordinary differential equation (ODE) and code up the procedure in the machine learning framework PyTorch.
The given ODE is: <br><br> <img src="https://latex.codecogs.com/svg.image?u^\prime&space;=&space;\cos(2\pi&space;t)&space;\quad&space;\quad&space;(7)&space;" title="u^\prime = \cos(2\pi t) \quad \quad (7) " /> <br><br> with <img src="https://latex.codecogs.com/svg.image?\inline&space;t&space;\,&space;\in&space;\left[0,1\right]&space;&space;" title="\inline t \, \in \left[0,1\right] " /> and the known initial condition <img src="https://latex.codecogs.com/svg.image?\inline&space;u(0)&space;=&space;1&space;&space;" title="\inline u(0) = 1 " />. Thus, we will use <img src="https://latex.codecogs.com/svg.image?\inline&space;g(t)&space;" title="\inline g(t) " /> as our universal approximator (UA): <br><br> <img src="https://latex.codecogs.com/svg.image?g(t)&space;=&space;1&space;+&space;t&space;\cdot&space;NN(t)&space;\quad&space;\quad&space;(8)" title="g(t) = 1 + t \cdot NN(t) \quad \quad (8)" />
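The whole procedure for Eq. (7) can be sketched end to end in PyTorch. The exact solution u(t) = 1 + sin(2πt)/(2π) follows from integrating Eq. (7) with u(0) = 1 and is used here only to check the result; network size, optimizer, learning rate, and step count are illustrative assumptions:

```python
import math
import torch

# PINN sketch for u' = cos(2*pi*t), u(0) = 1, using the trial
# function g(t) = 1 + t * NN(t). All hyperparameters are
# illustrative choices.
torch.manual_seed(0)
model = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

def g(t):
    return 1.0 + t * model(t)        # trial function with u0 = 1

for step in range(2000):
    t = torch.rand(128, 1, requires_grad=True)   # random t_i in [0, 1]
    dg_dt = torch.autograd.grad(
        g(t), t, grad_outputs=torch.ones(128, 1), create_graph=True)[0]
    loss = ((dg_dt - torch.cos(2 * math.pi * t)) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# Compare against the exact solution u(t) = 1 + sin(2*pi*t) / (2*pi)
t_test = torch.linspace(0, 1, 11).reshape(-1, 1)
u_exact = 1.0 + torch.sin(2 * math.pi * t_test) / (2 * math.pi)
max_err = (g(t_test) - u_exact).abs().max().item()
```

Resampling fresh random collocation points at every step acts like stochastic minibatching over the domain, which helps the network fit the residual everywhere in [0, 1] rather than at a fixed grid.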
