
Mathematical Prerequisites for Scientific Machine Learning-1


Let’s start off by discussing some mathematical prerequisite concepts that are the bread and butter of understanding research papers in Scientific Machine Learning.

Vectors

The primary independent vector in all of scientific machine learning is the shape vector, or the position vector, depending on the context of the problem.

Time, though a scalar, can be thought of as a vector of timestamps \(t_1, t_2, \ldots, t_n\).

Then there exist other quantities such as fields! For example, temperature is a scalar field while velocity is a vector field. Fields are always defined with respect to the fundamental vectors. So when someone says the temperature vector is \(T\), what do they mean? It implies there is a shape matrix \(X\), a collection of all the points on the physical domain. For instance, say you are heating a 1-D rod at one of its ends. Then the shape matrix \(X\) is \((x_1, x_2, \ldots, x_n)\), where each \(x_i\) is a vector pointing to a specific location on the rod. In this context, the temperature vector holds the temperature at all those locations at time \(t\).
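
As a minimal sketch (using NumPy, with made-up temperature values), the shape matrix and temperature vector for the 1-D rod might look like this:

```python
import numpy as np

# Shape matrix X: n points along a 1-D rod; each row x_i is a
# position vector pointing to a location on the rod.
n = 5
X = np.linspace(0.0, 1.0, n).reshape(n, 1)

# Temperature vector T: the scalar field sampled at every point of X
# at some time t (values are made up for illustration).
T = np.array([100.0, 75.0, 50.0, 30.0, 25.0])

for x_i, T_i in zip(X, T):
    print(f"T(x = {x_i[0]:.2f}) = {T_i}")
```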

Matrix

In a 3D coordinate system, a 3x3 matrix is just a collection of 3 position vectors (its columns). More generally, consider a matrix \(A\) of dimension \(m \times n\), a vector \(x\) of dimension \(n\), and a vector \(y\) of dimension \(m\). When we say \(Ax = y\), it implies that \(A\) maps \(x\) to \(y\), or that \(A\) transforms \(x\) into \(y\).
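
A quick NumPy sketch of \(Ax = y\) with \(m = 2\) and \(n = 3\) (the values are arbitrary):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],    # A is m x n = 2 x 3
              [4.0, 5.0, 6.0]])
x = np.array([1.0, 0.0, -1.0])    # x has dimension n = 3

y = A @ x                          # A maps x to y, of dimension m = 2
print(y)                           # [-2. -2.]
```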

Transformation

Loosely speaking, a function that maps an input vector space to an output vector space is a transformation. Any matrix \(A\) is an example of a transformation.

Linear Transformation

Take two parallel vectors and apply a transformation; if the transformation preserves the ‘parallel’ lines in the output space, then the applied transformation is linear. The origin must also remain the origin after the transformation for it to be linear. Multiplying a vector by any matrix is a linear transformation. More formally, any transformation \(T\) that obeys \[ T(\alpha x + \beta y) = \alpha Tx + \beta Ty \] is a linear transformation, where \(\alpha\) and \(\beta\) are scalars.
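
We can verify this condition numerically for a matrix chosen at random; a minimal sketch in NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))          # the transformation T
x = rng.standard_normal(3)
y = rng.standard_normal(3)
alpha, beta = 2.0, -0.5

lhs = A @ (alpha * x + beta * y)         # T(alpha x + beta y)
rhs = alpha * (A @ x) + beta * (A @ y)   # alpha Tx + beta Ty
print(np.allclose(lhs, rhs))             # True: the map is linear
```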

Determinants

Every square matrix has a determinant associated with it. In a 2D coordinate system, the determinant of a matrix is the factor by which a unit area gets scaled under the corresponding transformation. For example, consider \(A\) defined by

\[A = \begin{bmatrix} 3 & 0 \\ 0 & 2 \end{bmatrix} \]

which has a determinant of 6. So any region in the 2D input space is mapped to a region with 6 times its original area in the output space.
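
A short NumPy check of this area-scaling interpretation for the matrix \(A\) above:

```python
import numpy as np

A = np.array([[3.0, 0.0],
              [0.0, 2.0]])
print(np.linalg.det(A))   # 6.0

# The unit square spanned by e1 and e2 maps to a 3 x 2 rectangle,
# so its area goes from 1 to 6 = det(A).
e1 = np.array([1.0, 0.0])
e2 = np.array([0.0, 1.0])
print(A @ e1, A @ e2)     # [3. 0.] [0. 2.]
```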

Eigenvector & Eigenvalues

Under a linear transformation, some special vectors only stretch or compress by a factor and do not rotate, hence preserving their directions. Such special vectors are called eigenvectors, and the scaling factor is called the eigenvalue. These vectors are solutions to the equation

\[ Ax = \lambda x\]
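
NumPy computes eigenpairs directly; a small sketch (with an arbitrary diagonal matrix) verifying that each eigenvector only gets scaled:

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [0.0, 3.0]])
eigvals, eigvecs = np.linalg.eig(A)

# Columns of eigvecs are the eigenvectors; each is only scaled by
# its eigenvalue, never rotated.
for lam, v in zip(eigvals, eigvecs.T):
    print(np.allclose(A @ v, lam * v))   # True, True
```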

Operator

An operator is a special function that takes a function as an input and gives another function as an output. Taking a derivative with respect to \(x\), i.e., \( \frac{d}{dx} \), is an example of an operator.
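
In code, an operator is naturally a higher-order function: it takes a function in and hands a function back. A minimal sketch, approximating \( \frac{d}{dx} \) with a forward difference:

```python
import math

def d_dx(f, h=1e-6):
    """The operator d/dx, approximated by a forward difference:
    takes a function f, returns another function (roughly f')."""
    return lambda x: (f(x + h) - f(x)) / h

df = d_dx(math.sin)   # input is a function, output is a function
print(df(0.0))        # ~1.0, since (sin)' = cos and cos(0) = 1
```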

Linear Operator

Any operator that satisfies the linearity condition above is a linear operator. Taking a derivative or an integral are both linear operators, since they readily satisfy \( T(\alpha x + \beta y) = \alpha Tx + \beta Ty \).
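
We can also check this linearity symbolically; a sketch using SymPy (the choice of test functions is arbitrary):

```python
import sympy as sp

x, alpha, beta = sp.symbols('x alpha beta')
f, g = sp.sin(x), sp.exp(x)                     # two test functions

lhs = sp.diff(alpha * f + beta * g, x)          # T(alpha f + beta g)
rhs = alpha * sp.diff(f, x) + beta * sp.diff(g, x)
print(sp.simplify(lhs - rhs) == 0)              # True: d/dx is linear
```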

Eigenfunctions

Eigenfunctions are analogues of eigenvectors, but in infinite-dimensional spaces. Eigenvectors arise as a consequence of applying a linear transformation through a matrix \(A\), and they satisfy \(Ax = \lambda x\). Similarly, eigenfunctions arise from applying linear operators: some special functions, under the transformation of a linear operator, get scaled, but their 'shape' is preserved! For instance, consider the equation

\[\frac{d^2 \sin(nx)}{dx^2} = -n^2 \sin(nx)\]

After applying the linear operator \(\frac{d^2}{dx^2}\), the frequency of the \(\sin(nx)\) function has not changed; only its amplitude has. Hence, \(\sin(nx)\) is an eigenfunction of the linear operator \(\frac{d^2}{dx^2}\). The scaling factor, \(-n^2\), is the eigenvalue.
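
A SymPy sketch confirming that \(\sin(nx)\) is an eigenfunction of \(\frac{d^2}{dx^2}\) with eigenvalue \(-n^2\):

```python
import sympy as sp

x, n = sp.symbols('x n')
f = sp.sin(n * x)

Lf = sp.diff(f, x, 2)          # apply the operator d^2/dx^2
print(Lf)                      # -n**2*sin(n*x): same 'shape', scaled
print(sp.simplify(Lf / f))     # -n**2, the eigenvalue
```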

Conclusion and Looking Forward

These are some go-to concepts that occur as recurring themes not only in SciML, but also in many other scientific fields. Hope this helps! Next week we'll cover some other relevant prerequisites such as projections, eigenbases, basis vectors, and a few more. And we'll also see how all these concepts tie together in the context of DiffusionNet!