6. Gaussian process class

class lsqfitgp.GP(covfun=None, *, solver='eigcut+', checkpos=True, checksym=True, checkfinite=True, checklin=True, posepsfac=1, **kw)

Object that represents a Gaussian process over arbitrary input.

Parameters:
covfunKernel or None

An instance of Kernel representing the covariance kernel of the default process of the GP object. It can be left unspecified.

solverstr

The algorithm used to decompose the prior covariance matrix. See decompose() for the available solvers. Default is eigcut+.

checkposbool

If True (default), raise a LinAlgError if the prior covariance matrix turns out non positive within numerical error.

checksymbool

If True (default), check that the prior covariance matrix is symmetric. If False, only half of the matrix is computed.

checkfinitebool

If True (default), check that the prior covariance matrix does not contain infs or nans.

checklinbool

If True (default), the method addlintransf will check that the given transformation is linear on a random input tensor.

posepsfacnumber

The threshold used to check if the prior covariance matrix is positive definite is multiplied by this factor (default 1).

**kw

Additional keyword arguments are passed to the solver, see decompose().

Methods

addx(x[, key, deriv, proc])

Add points where the Gaussian process is evaluated.

addlintransf(transf, keys, key, *[, checklin])

Define a finite linear transformation of the evaluated process.

addtransf(tensors, key, *[, axes])

Apply a linear transformation to already specified process points.

addcov(covblocks[, key, decomps])

Add user-defined prior covariance matrix blocks.

addproc([kernel, key, deriv])

Add an independent process.

addproctransf(ops[, key, deriv])

Define a new process as a linear combination of other processes.

addproclintransf(transf, keys[, key, deriv, ...])

Define a new process as a linear combination of other processes.

addkernelop(method, arg, key[, proc])

Define a new process as the transformation of an existing one.

addprocderiv(deriv, key[, proc])

Define a new process as the derivative of an existing one.

addprocxtransf(transf, key[, proc])

Define a new process by transforming the inputs of another one.

addprocrescale(scalefun, key[, proc])

Define a new process as a rescaling of an existing one.

prior([key, raw])

Return an array or a dictionary of arrays of gvars representing the prior for the Gaussian process.

pred(given[, key, givencov, fromdata, raw, ...])

Compute the posterior.

predfromfit(*args, **kw)

Like pred with fromdata=False.

predfromdata(*args, **kw)

Like pred with fromdata=True.

marginal_likelihood(given[, givencov, separate])

Compute the logarithm of the probability of the data.

decompose(posdefmatrix[, solver])

Decompose a nonnegative definite matrix.

class DefaultProcess

Key for the default process in GP objects

addcov(covblocks, key=None, *, decomps=None)

Add user-defined prior covariance matrix blocks.

Covariance matrices defined with addcov represent arbitrary finite-dimensional zero-mean Gaussian variables, assumed independent from all other variables in the GP object.

Parameters:
covblocksarray or dictionary of arrays

If an array: a covariance matrix (or tensor) to be added under key key. If a dictionary: a mapping from pairs of keys to the corresponding covariance matrix blocks. A missing off-diagonal block in the dictionary is interpreted as a matrix of zeros, unless the corresponding transposed block is specified.

keyhashable

If covblocks is an array, the dictionary key under which covblocks is added. Can not be specified if covblocks is a dictionary.

decompsDecomposition or dict of Decompositions

Pre-computed decompositions of (not necessarily all) diagonal blocks, as produced by decompose(). The keys are single GP keys and not pairs like in covblocks.

Raises:
KeyError

A key is already used in the GP.

ValueError

covblocks and/or key and decomps are malformed or inconsistent.

TypeError

Wrong type of covblocks or decomps.

addkernelop(method, arg, key, proc=DefaultProcess)

Define a new process as the transformation of an existing one.

Parameters:
methodstr

A method of Kernel taking two arguments which returns a transformed kernel.

argobject

A valid argument to the method.

keyhashable

Key for the new process.

prochashable

Key of the process to be transformed. If not specified, use the default process.

addlintransf(transf, keys, key, *, checklin=None)

Define a finite linear transformation of the evaluated process.

Parameters:
transfcallable

A function with signature f(array1, array2, …) -> array which computes the linear transformation. The function must be jax-traceable, i.e., use jax.numpy instead of numpy.

keyssequence

Keys of parts of the process to be passed as inputs to the transformation.

keyhashable

The key of the newly defined points.

checklinbool

If True (default), check that the given function is linear in its inputs. The default can be overridden at initialization of the GP object. Note that an affine function (x -> a + bx) is not linear.

Raises:
RuntimeError

The transformation seems not to be linear. To disable the linearity check, initialize the GP with checklin=False.

addproc(kernel=None, key=DefaultProcess, *, deriv=0)

Add an independent process.

Parameters:
kernelKernel

A kernel for the process. If None, use the default kernel. The difference between the default process and a process defined with the default kernel is that, although they have the same kernel, they are independent.

keyhashable

The name that identifies the process in the GP object. If not specified, sets the kernel of the default process.

derivDeriv-like

Derivatives to take on the process defined by the kernel.

addprocderiv(deriv, key, proc=DefaultProcess)

Define a new process as the derivative of an existing one.

\[g(x) = \frac{\partial^n}{\partial x^n} f(x)\]
Parameters:
derivDeriv-like

Derivation order.

keyhashable

The key of the new process.

prochashable

The key of the process to be derived. If not specified, use the default process.

addproclintransf(transf, keys, key=DefaultProcess, *, deriv=0, checklin=False)

Define a new process as a linear combination of other processes.

Let f_i(x), i = 1, 2, … be already defined processes, and T a linear map from processes to a single process. The new process is

h(x) = T(f_1, f_2, …)(x).

Parameters:
transfcallable

A function with signature transf(callable, callable, …) -> callable.

keyssequence

The keys of the processes to be passed to the transformation.

keyhashable

The name that identifies the new process in the GP object. If not specified, sets the default process.

derivDeriv-like

The linear combination is derived as specified by this parameter.

checklinbool

If True, check if the transformation is linear. Default False.

Notes

The linearity check may fail if the transformation does nontrivial operations with the inner function input.

addprocrescale(scalefun, key, proc=DefaultProcess)

Define a new process as a rescaling of an existing one.

\[g(x) = s(x)f(x)\]
Parameters:
scalefuncallable

A function from the domain of the process to a scalar.

keyhashable

The key of the new process.

prochashable

The key of the process to be transformed. If not specified, use the default process.

addproctransf(ops, key=DefaultProcess, *, deriv=0)

Define a new process as a linear combination of other processes.

Let f_i(x), i = 1, 2, … be already defined processes, and g_i(x) be deterministic functions. The new process is defined as

h(x) = g_1(x) f_1(x) + g_2(x) f_2(x) + …

Parameters:
opsdict

A dictionary mapping process keys to scalars or scalar functions. The functions must take an argument of the same kind of the domain of the process.

keyhashable

The name that identifies the new process in the GP object. If not specified, sets the default process.

derivDeriv-like

The linear combination is derived as specified by this parameter.

addprocxtransf(transf, key, proc=DefaultProcess)

Define a new process by transforming the inputs of another one.

\[g(x) = f(T(x))\]
Parameters:
transfcallable

A function mapping the new kind input to the input expected by the transformed process.

keyhashable

The key of the new process.

prochashable

The key of the process to be transformed. If not specified, use the default process.

addtransf(tensors, key, *, axes=1)

Apply a linear transformation to already specified process points. The result of the transformation is represented by a new key.

Parameters:
tensorsdict

Dictionary mapping keys of the GP to arrays/scalars. Each array is matrix-multiplied with the process array represented by its key, while scalars are just multiplied. Finally, the keys are summed over.

keyhashable

A new key under which the transformation is placed.

axesint

Number of axes to be summed over for matrix multiplication, referring to trailing axes for tensors in tensors, and to heading axes for process points. Default 1.

Notes

The multiplication between the tensors and the process is done with np.tensordot with, by default, 1-axis contraction. For >2d arrays this is different from numpy’s matrix multiplication, which would act on the second-to-last dimension of the second array.

addx(x, key=None, *, deriv=0, proc=DefaultProcess)

Add points where the Gaussian process is evaluated.

The GP object keeps the various x arrays in a dictionary. If x is an array, you have to specify its dictionary key with the key parameter. Otherwise, you can directly pass a dictionary for x.

To specify that on the given x a derivative of the process instead of the process itself should be evaluated, use the parameter deriv.

addx never copies the input arrays if they are numpy arrays, so if you change their contents before doing something with the GP, the change will be reflected on the result. However, after the GP has computed internally its covariance matrix, the x are ignored.

Parameters:
xarray or dictionary of arrays

The points to be added.

keyhashable

If x is an array, the dictionary key under which x is added. Can not be specified if x is a dictionary.

derivDeriv-like

Derivative specification. A Deriv object or something that can be converted to Deriv.

prochashable

The process to be evaluated on the points. If not specified, use the default process.

classmethod decompose(posdefmatrix, solver='eigcut+', **kw)

Decompose a nonnegative definite matrix.

The decomposition can be used to calculate linear algebra expressions where the (pseudo)inverse of the matrix appears.

Parameters:
posdefmatrixarray

A nonnegative definite nonempty symmetric square matrix. If the array is not square, it must have a shape of the kind (k, n, m, …, k, n, m, …) and is reshaped to (k * n * m * …, k * n * m * …).

solverstr

Algorithm used to decompose the matrix.

‘eigcut+’ (default)

Promote small (or negative) eigenvalues to a minimum value.

‘eigcut-’

Remove small (or negative) eigenvalues.

‘svdcut+’

Promote small eigenvalues to a minimum value, keeping their sign.

‘svdcut-’

Remove small eigenvalues.

‘lanczos’

Reduce the rank of the matrix. The complexity is O(n^2 r) where n is the matrix size and r the required rank, while the other algorithms are O(n^3). Slow for small sizes.

‘lobpcg’

Like ‘lanczos’ but using the LOBPCG algorithm, faster but less accurate.

‘chol’

Cholesky decomposition after regularizing the matrix with a Gershgorin estimate of the maximum eigenvalue. The fastest of the O(n^3) algorithms.

**kw

Additional options.

epsrel, epsabspositive float or ‘auto’

For solvers ‘eigcut+’, ‘eigcut-’, ‘svdcut+’, ‘svdcut-’, ‘chol’. Specify the threshold for considering small the eigenvalues:

eps = epsrel * maximum_eigenvalue + epsabs

epsrel=’auto’ sets epsrel = matrix_size * float_epsilon, while epsabs=’auto’ sets epsabs = float_epsilon. Default is epsrel=’auto’, epsabs=0.

rankpositive integer

For the ‘lanczos’ and ‘lobpcg’ solvers, the target rank. It should be much smaller than the matrix size for the method to be convenient.

direct_autodiffbool

If True, let JAX compute derivatives tracing through the code instead of using custom derivatives for the various operations. Default False.

stop_hessianbool

If True, when computing second derivatives, pretend that the matrix has zero second derivatives w.r.t. the inputs. Does not work in reverse mode. Default False.

Returns:
decompDecomposition

An object representing the decomposition of the matrix. The available methods and properties are (K being the matrix):

n

The size of the matrix.

matrix()

Return K.

inv()

Compute K⁺.

solve(b)

Compute K⁺b.

quad(b[, c])

Compute b’K⁺b or b’K⁺c.

diagquad(b)

Compute the diagonal of b’K⁺b.

logdet()

Compute log(det(K)).

tracesolve(b)

Compute tr(K⁺b).

correlate(b[, transpose=True])

Compute Ab or A’b such that K = AA’, with A n x m.

decorrelate(b[, transpose=True])

Compute A⁺b or A⁺’b with A as above.

m

The inner size of A.

eps

The threshold below which eigenvalues are not calculable.

Notes

The decomposition operations are JAX-traceable and differentiable (with some exceptions). Since intermediate values are stored in the decomposition object, it is necessary to trace a function that includes both the call to decompose and to one of the methods listed above. The decomposition object can be passed as input/output in jax-traceable functions.

marginal_likelihood(given, givencov=None, *, separate=False, **kw)

Compute the logarithm of the probability of the data.

The probability is computed under the Gaussian prior and Gaussian error model. It is also called marginal likelihood. If y is the data and g is the Gaussian process, this is

\[\log \int p(y|g) p(g) \mathrm{d} g.\]

Unlike pred(), you can’t compute this with a fit result instead of data. If you used the Gaussian process as latent variable in a fit, use the whole fit to compute the marginal likelihood. E.g. lsqfit always computes the logGBF (it’s the same thing).

The input is an array or dictionary of arrays, given. The contents of given represent the input data.

Parameters:
givendictionary of arrays

The data for some/all of the points in the GP. The arrays can contain either gvars or normal numbers, the latter being equivalent to zero-uncertainty gvars.

givencovdictionary of arrays, optional

Covariance matrix of given. If not specified, the covariance is extracted from given with gvar.evalcov(given).

separatebool

If True, return separately the logdet term and the residuals term. Default False.

**kw

Additional keyword arguments are passed to the matrix decomposition.

Returns:
If separate == False (default):
logpscalar

The logarithm of the marginal likelihood.

If separate == True:
logdetscalar

The term of the marginal likelihood containing the logarithm of the determinant of the prior covariance matrix, without the -1/2 factor.

residualsarray or dictionary of arrays

A vector whose squared 2-norm multiplied by -1/2 gives the other term of the marginal likelihood.

pred(given, key=None, givencov=None, *, fromdata=None, raw=False, keepcorr=None)

Compute the posterior.

The posterior can be computed either for all points or for a subset, and either directly from data or from a posterior obtained with a fit. The latter case is for when the Gaussian process was used in a fit with other parameters.

The output is a collection of gvars, either an array or a dictionary of arrays. They are properly correlated with gvars returned by prior and with the input data/fit.

The input is a dictionary of arrays, given, with keys corresponding to the keys in the GP as added by addx or addtransf.

Parameters:
givendictionary of arrays

The data or fit result for some/all of the points in the GP. The arrays can contain either gvars or normal numbers, the latter being equivalent to zero-uncertainty gvars.

keyNone, key or list of keys, optional

If None, compute the posterior for all points in the GP (also those used in given). Otherwise only those specified by key.

givencovdictionary of arrays, optional

Covariance matrix of given. If not specified, the covariance is extracted from given with gvar.evalcov(given).

fromdatabool

Mandatory. Specify if the contents of given are data or already a posterior.

rawbool, optional

If True, instead of returning a collection of gvars, return the mean and the covariance. When the mean is a dictionary, the covariance is a dictionary whose keys are pairs of keys of the mean (the same format used by gvar.evalcov). Default False.

keepcorrbool, optional

If True (default), the returned gvars are correlated with the prior and the data/fit. If False, they have the correct covariance between themselves, but are independent from all other preexisting gvars.

Returns:
If raw=False (default):
posteriorarray or dictionary of arrays

A collections of gvars representing the posterior.

If raw=True:
pmeanarray or dictionary of arrays

The mean of the posterior. Equivalent to gvar.mean(posterior).

pcov2D array or dictionary of 2D arrays

The covariance matrix of the posterior. If pmean is a dictionary, the keys of pcov are pairs of keys of pmean. Equivalent to gvar.evalcov(posterior).

predfromdata(*args, **kw)

Like pred with fromdata=True.

predfromfit(*args, **kw)

Like pred with fromdata=False.

prior(key=None, *, raw=False)

Return an array or a dictionary of arrays of gvars representing the prior for the Gaussian process. The returned object is not unique but the gvars stored inside are, so all the correlations are kept between objects returned by different calls to prior.

Calling without arguments returns the complete prior as a dictionary. If you specify key, only the array for the requested key is returned.

Parameters:
keyNone, key or list of keys

Key(s) corresponding to one passed to addx or addtransf. None for all keys.

rawbool

If True, instead of returning a collection of gvars return their covariance matrix as would be returned by gvar.evalcov. Default False.

Returns:
If raw=False (default):
priornp.ndarray or gvar.BufferDict

A collection of gvars representing the prior.

If raw=True:
covnp.ndarray or dict

The covariance matrix of the prior.