Python interface

The primary interface to computations using celerite2 are provided by the celerite2.GaussianProcess class that is documented below. These calculations will be performed using Term models as documented in the Model building section. See the TBD tutorial for examples of how to use this interface.

class celerite2.GaussianProcess(kernel, t=None, *, mean=0.0, **kwargs)

The main interface to the celerite2 Gaussian Process (GP) solver

Parameters
  • kernel – An instance of a subclass of terms.Term.

  • mean (optional) – The mean function of the process. This can either be a callable (it will be evaluated with a single argument, a vector of x values) or a scalar. (default: 0.0)

  • **kwargs – Other arguments will be passed directly to GaussianProcess.compute() if the argument t is specified.

apply_inverse(y, *, inplace=False)

Apply the inverse of the covariance matrix to a vector or matrix

Solve K.x = y for x where K is the covariance matrix of the GP.

Note

The mean function is not applied in this method.

Parameters
  • y (shape[N] or shape[N, M]) – The vector or matrix y described above.

  • inplace (bool, optional) – If True, y will be overwritten with the result x.

Raises
compute(t, *, yerr=None, diag=None, check_sorted=True, quiet=False)

Compute the Cholesky factorization of the GP covariance matrix

Parameters
  • t (shape[N]) – The independent coordinates of the observations. This must be sorted in increasing order.

  • yerr (shape[N], optional) – If provided, the diagonal standard deviation of the observation model.

  • diag (shape[N], optional) – If provided, the diagonal variance of the observation model.

  • check_sorted (bool, optional) – If True, a check is performed to make sure that t is correctly sorted. A ValueError will be thrown when this check fails.

  • quiet (bool, optional) – If True, when the matrix cannot be factorized (because of numerics or otherwise) the solver’s LinAlgError will be silenced and the determiniant will be set to zero. Otherwise, the exception will be propagated.

Raises
  • ValueError – When the inputs are not valid (shape, number, etc.).

  • LinAlgError – When the matrix is not numerically positive definite.

dot_tril(y, *, inplace=False)

Dot the Cholesky factor of the GP system into a vector or matrix

Compute x = L.y where K = L.L^T and K is the covariance matrix of the GP.

Note

The mean function is not applied in this method.

Parameters
  • y (shape[N] or shape[N, M]) – The vector or matrix y described above.

  • inplace (bool, optional) – If True, y will be overwritten with the result x.

Raises
log_likelihood(y, *, inplace=False)

Compute the marginalized likelihood of the GP model

The factorized matrix from the previous call to GaussianProcess.compute() is used so that method must be called first.

Parameters
  • y (shape[N]) – The observations at coordinates t as defined by GaussianProcess.compute().

  • inplace (bool, optional) – If True, y will be overwritten in the process of the calculation. This will reduce the memory footprint, but should be used with care since this will overwrite the data.

Raises
predict(y, t=None, *, return_cov=False, return_var=False, include_mean=True, kernel=None)

Compute the conditional distribution

The factorized matrix from the previous call to GaussianProcess.compute() is used so that method must be called first.

Parameters
  • y (shape[N]) – The observations at coordinates t as defined by GaussianProcess.compute().

  • t (shape[M], optional) – The independent coordinates where the prediction should be evaluated. If not provided, this will be evaluated at the observations t from GaussianProcess.compute().

  • return_var (bool, optional) – Return the variance of the conditional distribution.

  • return_cov (bool, optional) – Return the full covariance matrix of the conditional distribution.

  • include_mean (bool, optional) – Include the mean function in the prediction.

  • kernel (optional) – If provided, compute the conditional distribution using a different kernel. This is generally used to separate the contributions from different model components. Note that the computational cost and scaling will be worse when using this parameter.

Raises
recompute(*, quiet=False)

Re-compute the factorization given a previous call to compute

Parameters

quiet (bool, optional) – If True, when the matrix cannot be factorized (because of numerics or otherwise) the solver’s LinAlgError will be silenced and the determiniant will be set to zero. Otherwise, the exception will be propagated.

Raises
sample(*, size=None, include_mean=True)

Generate random samples from the prior implied by the GP system

The factorized matrix from the previous call to GaussianProcess.compute() is used so that method must be called first.

Parameters
  • size (int, optional) – The number of samples to generate. If not provided, only one sample will be produced.

  • include_mean (bool, optional) – Include the mean function in the prediction.

Raises
sample_conditional(y, t=None, *, size=None, regularize=None, include_mean=True, kernel=None)

Sample from the conditional (predictive) distribution

Note

this method scales as O(M^3) for large M, where M == len(t).

Parameters
  • y (shape[N]) – The observations at coordinates x from GausianProcess.compute().

  • t (shape[M], optional) – The independent coordinates where the prediction should be made. If this is omitted the coordinates will be assumed to be x from GaussianProcess.compute() and an efficient method will be used to compute the mean prediction.

  • size (int, optional) – The number of samples to generate. If not provided, only one sample will be produced.

  • regularize (float, optional) – For poorly conditioned systems, you can provide a small number here to regularize the predictive covariance. This number will be added to the diagonal.

  • include_mean (bool, optional) – Include the mean function in the prediction.

  • kernel (optional) – If provided, compute the conditional distribution using a different kernel. This is generally used to separate the contributions from different model components. Note that the computational cost and scaling will be worse when using this parameter.

Model building

Note

If you’re not sure which model to use, check out the Recommended models section and, in particular, the SHOTerm.

The following is the abstract base class for all the models provided by celerite2 and all of these methods will be implemented by subclasses. All of these models can be composed using addition and multiplication. For example,

from celerite2 import terms

term1 = terms.SHOTerm(sigma=1.0, w0=0.5, Q=2.5)
term2 = terms.SHOTerm(sigma=0.5, w0=0.5, Q=0.2)

term = term1 + term2

# or ...

term = term1 * term2
class celerite2.terms.Term

The abstract base “term” that is the superclass of all other terms

Subclasses should overload at least the Term.get_coefficients() method.

dot(x, diag, y)

Apply a matrix-vector or matrix-matrix product

Parameters
  • x (shape[N]) – The independent coordinates of the data.

  • diag (shape[N]) – The diagonal variance of the system.

  • y (shape[N] or shape[N, K]) – The target of vector or matrix for this operation.

get_celerite_matrices(x, diag, *, a=None, U=None, V=None, P=None)

Get the matrices needed to solve the celerite system

Pre-allocated arrays can be provided to the Python interface to be re-used for multiple evaluations.

Note

In-place operations are not supported by the modeling extensions.

Parameters
  • x (shape[N]) – The independent coordinates of the data.

  • diag (shape[N]) – The diagonal variance of the system.

  • a (shape[N], optional) – The diagonal of the A matrix.

  • U (shape[N, J], optional) – The first low-rank matrix.

  • V (shape[N, J], optional) – The second low-rank matrix.

  • P (shape[N-1, J], optional) – The regularization matrix used for numerical stability.

Raises

ValueError – When the inputs are not valid.

get_coefficients()

Compute and return the coefficients for the celerite model

This should return a 6 element tuple with the following entries:

(ar, cr, ac, bc, cc, dc)

Note

All of the returned objects must be arrays, even if they only have one element.

get_conditional_mean_matrices(x, t)

Get the matrices needed to compute the conditional mean function

Parameters
  • x (shape[N]) – The independent coordinates of the data.

  • t (shape[M]) – The independent coordinates where the predictions will be made.

get_psd(omega)

Compute the value of the power spectral density for this process

Parameters

omega (shape[..]) – The (angular) frequencies where the power should be evaluated.

get_value(tau)

Compute the value of the kernel as a function of lag

Parameters

tau (shape[..]) – The lags where the kernel should be evaluated.

to_dense(x, diag)

Evaluate the dense covariance matrix for this term

Parameters
  • x (shape[N]) – The independent coordinates of the data.

  • diag (shape[N]) – The diagonal variance of the system.

These are the best models to use for most datasets.

class celerite2.terms.SHOTerm(*args, **kwargs)

A term representing a stochastically-driven, damped harmonic oscillator

The PSD of this term is

\[S(\omega) = \sqrt{\frac{2}{\pi}} \frac{S_0\,\omega_0^4} {(\omega^2-{\omega_0}^2)^2 + {\omega_0}^2\,\omega^2/Q^2}\]

with the parameters S0, Q, and w0.

This implementation also supports the following reparameterizations that can be easier to use and reason about:

  1. rho, the undamped period of the oscillator, defined as \(\rho = 2\,\pi / \omega_0\),

  2. tau, the damping timescale of the process, defined as \(\tau = 2\,Q / \omega_0\), and

  3. sigma, the standard deviation of the process, defined as \(\sigma = \sqrt{S_0\,\omega_0\,Q}\).

Parameters
  • w0 – The undamped angular frequency, \(\omega_0\) above.

  • rho – Alternative parameterization for w0 as described above.

  • Q – The quality factor, \(Q\) above.

  • tau – Alternative parameterization for Q as described above.

  • S0 – The power at \(\omega = 0\), \(S_0\) above.

  • sigma – Alternative parameterization for S0 as described above.

  • eps (optional) – A regularization parameter used for numerical stability when computing \(\sqrt{1-4\,Q^2}\) or \(\sqrt{4\,Q^2-1}\).

class celerite2.terms.RotationTerm(*, sigma, period, Q0, dQ, f)

A mixture of two SHO terms that can be used to model stellar rotation

This term has two modes in Fourier space: one at period and one at 0.5 * period. This can be a good descriptive model for a wide range of stochastic variability in stellar time series from rotation to pulsations.

More precisely, the parameters of the two SHOTerm terms are defined as

\[\begin{split}Q_1 = 1/2 + Q_0 + \delta Q \\ \omega_1 = \frac{4\,\pi\,Q_1}{P\,\sqrt{4\,Q_1^2 - 1}} \\ S_1 = \frac{\sigma^2}{(1 + f)\,\omega_1\,Q_1}\end{split}\]

for the primary term, and

\[\begin{split}Q_2 = 1/2 + Q_0 \\ \omega_2 = \frac{8\,\pi\,Q_1}{P\,\sqrt{4\,Q_1^2 - 1}} \\ S_2 = \frac{f\,\sigma^2}{(1 + f)\,\omega_2\,Q_2}\end{split}\]

for the secondary term.

Parameters
  • sigma – The standard deviation of the process.

  • period – The primary period of variability.

  • Q0 – The quality factor (or really the quality factor minus one half; this keeps the system underdamped) for the secondary oscillation.

  • dQ – The difference between the quality factors of the first and the second modes. This parameterization (if dQ > 0) ensures that the primary mode alway has higher quality.

  • f – The fractional amplitude of the secondary mode compared to the primary. This should probably always be 0 < f < 1, but that is not enforced.

class celerite2.terms.Matern32Term(*, sigma, rho, eps=0.01)

A term that approximates a Matern-3/2 function

This term is defined as

\[k(\tau) = \sigma^2\,\left[ \left(1+1/\epsilon\right)\,e^{-(1-\epsilon)\sqrt{3}\,\tau/\rho} \left(1-1/\epsilon\right)\,e^{-(1+\epsilon)\sqrt{3}\,\tau/\rho} \right]\]

with the parameters sigma and rho. The parameter eps controls the quality of the approximation since, in the limit \(\epsilon \to 0\) this becomes the Matern-3/2 function

\[\lim_{\epsilon \to 0} k(\tau) = \sigma^2\,\left(1+ \frac{\sqrt{3}\,\tau}{\rho}\right)\, \exp\left(-\frac{\sqrt{3}\,\tau}{\rho}\right)\]

This term should be used with care since there could be numerical issues with this approximation.

Parameters
  • sigma – The parameter \(\sigma\).

  • rho – The parameter \(\rho\).

  • eps (optional) – The value of the parameter \(\epsilon\).

Operations

These classes encapsulate various operations on terms.

class celerite2.terms.TermSum(*terms)

A sum of multiple Term objects

The resulting kernel function is the sum of the functions and the width of the resulting low-rank representation will be the sum of widths for each of the terms.

Parameters

*terms – Any number of Term subclasses to add together.

class celerite2.terms.TermProduct(term1, term2)

A product of two Term objects

The resulting kernel function is the product of the two functions and the resulting low-rank representation will in general be wider than the sum of the two widths.

Parameters
  • term1 (Term) – The left term.

  • term2 (Term) – The right term.

class celerite2.terms.TermDiff(term)

Take the first derivative of a term with respect to the lag

Parameters

term (Term) – The term to differentiate.

class celerite2.terms.TermConvolution(term, delta)

A term corresponding to the integral of another term over a boxcar

The process produced by this term is equivalent to the process produced by convolving the base process with a boxcar of length delta. This can be useful, for example, when taking exposure time integration into account.

Parameters
  • term (Term) – The term describing the base process.

  • delta (float) – The width of the boxcar filter (for example, the exposure time).

Other models

These models are included for backwards compatibility, but their use is not recommended unless you’re confident that you know what you’re doing.

class celerite2.terms.RealTerm(*, a, c)

The simplest celerite term

Warning

You should only use this term if you really know what you’re doing because it will generally behave poorly. Check out SHOTerm instead!

This term has the form

\[k(\tau) = a_j\,e^{-c_j\,\tau}\]

with the parameters a and c.

Strictly speaking, for a sum of terms, the parameter a could be allowed to go negative but since it is somewhat subtle to ensure positive definiteness, we recommend keeping both parameters strictly positive. Advanced users can build a custom term that has negative coefficients but care should be taken to ensure positivity.

Parameters
  • a – The amplitude of the term.

  • c – The exponent of the term.

class celerite2.terms.ComplexTerm(*, a, b, c, d)

A general celerite term

Warning

You should only use this term if you really know what you’re doing because it will generally behave poorly. Check out SHOTerm instead!

This term has the form

\[k(\tau) = \frac{1}{2}\,\left[(a_j + b_j)\,e^{-(c_j+d_j)\,\tau} + (a_j - b_j)\,e^{-(c_j-d_j)\,\tau}\right]\]

with the parameters a, b, c, and d.

This term will only correspond to a positive definite kernel (on its own) if \(a_j\,c_j \ge b_j\,d_j\).

Parameters
  • a – The real part of amplitude.

  • b – The imaginary part of amplitude.

  • c – The real part of the exponent.

  • d – The imaginary part of exponent.

class celerite2.terms.OriginalCeleriteTerm(term)

A wrapper around terms defined using the original celerite package

Parameters

term (celerite.terms.Term) – The term defined using celerite.