Python interface¶
The primary interface to computations using celerite2 are provided by the
celerite2.GaussianProcess
class that is documented below. These
calculations will be performed using Term
models as documented in the
Model building section. See the TBD tutorial for examples of how to use
this interface.
- class celerite2.GaussianProcess(kernel, t=None, *, mean=0.0, **kwargs)[source]¶
- apply_inverse(y, *, inplace=False)¶
Apply the inverse of the covariance matrix to a vector or matrix
Solve
K.x = y
forx
whereK
is the covariance matrix of the GP.Note
The mean function is not applied in this method.
- Parameters:
y (shape[N] or shape[N, M]) – The vector or matrix
y
described above.inplace (bool, optional) – If
True
,y
will be overwritten with the resultx
.
- Raises:
RuntimeError – If
GaussianProcess.compute()
is not called first.ValueError – When the inputs are not valid (shape, number, etc.).
- compute(t, *, yerr=None, diag=None, check_sorted=True, quiet=False)¶
Compute the Cholesky factorization of the GP covariance matrix
- Parameters:
t (shape[N]) – The independent coordinates of the observations. This must be sorted in increasing order.
yerr (shape[N], optional) – If provided, the diagonal standard deviation of the observation model.
diag (shape[N], optional) – If provided, the diagonal variance of the observation model.
check_sorted (bool, optional) – If
True
, a check is performed to make sure thatt
is correctly sorted. AValueError
will be thrown when this check fails.quiet (bool, optional) – If
True
, when the matrix cannot be factorized (because of numerics or otherwise) the solver’sLinAlgError
will be silenced and the determiniant will be set to zero. Otherwise, the exception will be propagated.
- Raises:
ValueError – When the inputs are not valid (shape, number, etc.).
LinAlgError – When the matrix is not numerically positive definite.
- dot_tril(y, *, inplace=False)¶
Dot the Cholesky factor of the GP system into a vector or matrix
Compute
x = L.y
whereK = L.L^T
andK
is the covariance matrix of the GP.Note
The mean function is not applied in this method.
- Parameters:
y (shape[N] or shape[N, M]) – The vector or matrix
y
described above.inplace (bool, optional) – If
True
,y
will be overwritten with the resultx
.
- Raises:
RuntimeError – If
GaussianProcess.compute()
is not called first.ValueError – When the inputs are not valid (shape, number, etc.).
- log_likelihood(y, *, inplace=False)¶
Compute the marginalized likelihood of the GP model
The factorized matrix from the previous call to
GaussianProcess.compute()
is used so that method must be called first.- Parameters:
y (shape[N]) – The observations at coordinates
t
as defined byGaussianProcess.compute()
.inplace (bool, optional) – If
True
,y
will be overwritten in the process of the calculation. This will reduce the memory footprint, but should be used with care since this will overwrite the data.
- Raises:
RuntimeError – If
GaussianProcess.compute()
is not called first.ValueError – When the inputs are not valid (shape, number, etc.).
- predict(y, t=None, *, return_cov=False, return_var=False, include_mean=True, kernel=None)¶
Compute the conditional distribution
The factorized matrix from the previous call to
GaussianProcess.compute()
is used so that method must be called first.- Parameters:
y (shape[N]) – The observations at coordinates
t
as defined byGaussianProcess.compute()
.t (shape[M], optional) – The independent coordinates where the prediction should be evaluated. If not provided, this will be evaluated at the observations
t
fromGaussianProcess.compute()
.return_var (bool, optional) – Return the variance of the conditional distribution.
return_cov (bool, optional) – Return the full covariance matrix of the conditional distribution.
include_mean (bool, optional) – Include the mean function in the prediction.
kernel (optional) – If provided, compute the conditional distribution using a different kernel. This is generally used to separate the contributions from different model components. Note that the computational cost and scaling will be worse when using this parameter.
- Raises:
RuntimeError – If
GaussianProcess.compute()
is not called first.ValueError – When the inputs are not valid (shape, number, etc.).
- recompute(*, quiet=False)¶
Re-compute the factorization given a previous call to compute
- Parameters:
quiet (bool, optional) – If
True
, when the matrix cannot be factorized (because of numerics or otherwise) the solver’sLinAlgError
will be silenced and the determiniant will be set to zero. Otherwise, the exception will be propagated.- Raises:
RuntimeError – If
GaussianProcess.compute()
is not called first.ValueError – When the inputs are not valid (shape, number, etc.).
- sample(*, size=None, include_mean=True)[source]¶
Generate random samples from the prior implied by the GP system
The factorized matrix from the previous call to
GaussianProcess.compute()
is used so that method must be called first.- Parameters:
size (int, optional) – The number of samples to generate. If not provided, only one sample will be produced.
include_mean (bool, optional) – Include the mean function in the prediction.
- Raises:
RuntimeError – If
GaussianProcess.compute()
is not called first.ValueError – When the inputs are not valid (shape, number, etc.).
Model building¶
Note
If you’re not sure which model to use, check out the
Recommended models section and, in particular, the SHOTerm
.
The following is the abstract base class for all the models provided by celerite2 and all of these methods will be implemented by subclasses. All of these models can be composed using addition and multiplication. For example,
from celerite2 import terms
term1 = terms.SHOTerm(sigma=1.0, w0=0.5, Q=2.5)
term2 = terms.SHOTerm(sigma=0.5, w0=0.5, Q=0.2)
term = term1 + term2
# or ...
term = term1 * term2
- class celerite2.terms.Term[source]¶
The abstract base “term” that is the superclass of all other terms
Subclasses should overload at least the
Term.get_coefficients()
method.- dot(x, diag, y)[source]¶
Apply a matrix-vector or matrix-matrix product
- Parameters:
x (shape[N]) – The independent coordinates of the data.
diag (shape[N]) – The diagonal variance of the system.
y (shape[N] or shape[N, K]) – The target of vector or matrix for this operation.
- get_celerite_matrices(x, diag, *, c=None, a=None, U=None, V=None)[source]¶
Get the matrices needed to solve the celerite system
Pre-allocated arrays can be provided to the Python interface to be re-used for multiple evaluations.
Note
In-place operations are not supported by the modeling extensions.
- Parameters:
x (shape[N]) – The independent coordinates of the data.
diag (shape[N]) – The diagonal variance of the system.
a (shape[N], optional) – The diagonal of the A matrix.
U (shape[N, J], optional) – The first low-rank matrix.
V (shape[N, J], optional) – The second low-rank matrix.
P (shape[N-1, J], optional) – The regularization matrix used for numerical stability.
- Raises:
ValueError – When the inputs are not valid.
- get_coefficients()[source]¶
Compute and return the coefficients for the celerite model
This should return a 6 element tuple with the following entries:
(ar, cr, ac, bc, cc, dc)
Note
All of the returned objects must be arrays, even if they only have one element.
- get_psd(omega)[source]¶
Compute the value of the power spectral density for this process
- Parameters:
omega (shape[...]) – The (angular) frequencies where the power should be evaluated.
Recommended models¶
These are the best models to use for most datasets.
- class celerite2.terms.SHOTerm(*, eps=1e-05)[source]¶
A term representing a stochastically-driven, damped harmonic oscillator
The PSD of this term is
\[S(\omega) = \sqrt{\frac{2}{\pi}} \frac{S_0\,\omega_0^4} {(\omega^2-{\omega_0}^2)^2 + {\omega_0}^2\,\omega^2/Q^2}\]with the parameters
S0
,Q
, andw0
. Consult the Celerite paper or its preprint on arXiv for a more detailed mathematical description.This implementation also supports the following reparameterizations that can be easier to use and reason about:
rho
, the undamped period of the oscillator, defined as \(\rho = 2\,\pi / \omega_0\),tau
, the damping timescale of the process, defined as \(\tau = 2\,Q / \omega_0\), andsigma
, the standard deviation of the process, defined as \(\sigma = \sqrt{S_0\,\omega_0\,Q}\).
- Parameters:
w0 – The undamped angular frequency, \(\omega_0\) above.
rho – Alternative parameterization for
w0
as described above.Q – The quality factor, \(Q\) above.
tau – Alternative parameterization for
Q
as described above.S0 – \(S_0\) in the PSD above. The power at \(\omega = 0\) is \(S_0\sqrt{2/\pi}\).
sigma – Alternative parameterization for
S0
as described above.eps (optional) – A regularization parameter used for numerical stability when computing \(\sqrt{1-4\,Q^2}\) or \(\sqrt{4\,Q^2-1}\).
- class celerite2.terms.RotationTerm(*, sigma, period, Q0, dQ, f)[source]¶
A mixture of two SHO terms that can be used to model stellar rotation
This term has two modes in Fourier space: one at
period
and one at0.5 * period
. This can be a good descriptive model for a wide range of stochastic variability in stellar time series from rotation to pulsations.More precisely, the parameters of the two
SHOTerm
terms are defined as\[\begin{split}Q_1 = 1/2 + Q_0 + \delta Q \\ \omega_1 = \frac{4\,\pi\,Q_1}{P\,\sqrt{4\,Q_1^2 - 1}} \\ S_1 = \frac{\sigma^2}{(1 + f)\,\omega_1\,Q_1}\end{split}\]for the primary term, and
\[\begin{split}Q_2 = 1/2 + Q_0 \\ \omega_2 = \frac{8\,\pi\,Q_1}{P\,\sqrt{4\,Q_1^2 - 1}} \\ S_2 = \frac{f\,\sigma^2}{(1 + f)\,\omega_2\,Q_2}\end{split}\]for the secondary term.
- Parameters:
sigma – The standard deviation of the process.
period – The primary period of variability.
Q0 – The quality factor (or really the quality factor minus one half; this keeps the system underdamped) for the secondary oscillation.
dQ – The difference between the quality factors of the first and the second modes. This parameterization (if
dQ > 0
) ensures that the primary mode alway has higher quality.f – The fractional amplitude of the secondary mode compared to the primary. This should probably always be
0 < f < 1
, but that is not enforced.
- class celerite2.terms.Matern32Term(*, sigma, rho, eps=0.01)[source]¶
A term that approximates a Matern-3/2 function
This term is defined as
\[k(\tau) = \sigma^2\,\left[ \left(1+1/\epsilon\right)\,e^{-(1-\epsilon)\sqrt{3}\,\tau/\rho} \left(1-1/\epsilon\right)\,e^{-(1+\epsilon)\sqrt{3}\,\tau/\rho} \right]\]with the parameters
sigma
andrho
. The parametereps
controls the quality of the approximation since, in the limit \(\epsilon \to 0\) this becomes the Matern-3/2 function\[\lim_{\epsilon \to 0} k(\tau) = \sigma^2\,\left(1+ \frac{\sqrt{3}\,\tau}{\rho}\right)\, \exp\left(-\frac{\sqrt{3}\,\tau}{\rho}\right)\]This term should be used with care since there could be numerical issues with this approximation.
- Parameters:
sigma – The parameter \(\sigma\).
rho – The parameter \(\rho\).
eps (optional) – The value of the parameter \(\epsilon\).
Operations¶
These classes encapsulate various operations on terms.
- class celerite2.terms.TermSum(*terms)[source]¶
A sum of multiple
Term
objectsThe resulting kernel function is the sum of the functions and the width of the resulting low-rank representation will be the sum of widths for each of the terms.
- Parameters:
*terms – Any number of
Term
subclasses to add together.
- class celerite2.terms.TermProduct(term1, term2)[source]¶
A product of two
Term
objectsThe resulting kernel function is the product of the two functions and the resulting low-rank representation will in general be wider than the sum of the two widths.
- class celerite2.terms.TermDiff(term)[source]¶
Take the first derivative of a term with respect to the lag
- Parameters:
term (Term) – The term to differentiate.
- class celerite2.terms.TermConvolution(term, delta)[source]¶
A term corresponding to the integral of another term over a boxcar
The process produced by this term is equivalent to the process produced by convolving the base process with a boxcar of length
delta
. This can be useful, for example, when taking exposure time integration into account.- Parameters:
term (Term) – The term describing the base process.
delta (float) – The width of the boxcar filter (for example, the exposure time).
Other models¶
These models are included for backwards compatibility, but their use is not recommended unless you’re confident that you know what you’re doing.
- class celerite2.terms.RealTerm(*, a, c)[source]¶
The simplest celerite term
Warning
You should only use this term if you really know what you’re doing because it will generally behave poorly. Check out
SHOTerm
instead!This term has the form
\[k(\tau) = a_j\,e^{-c_j\,\tau}\]with the parameters
a
andc
.Strictly speaking, for a sum of terms, the parameter
a
could be allowed to go negative but since it is somewhat subtle to ensure positive definiteness, we recommend keeping both parameters strictly positive. Advanced users can build a custom term that has negative coefficients but care should be taken to ensure positivity.- Parameters:
a – The amplitude of the term.
c – The exponent of the term.
- class celerite2.terms.ComplexTerm(*, a, b, c, d)[source]¶
A general celerite term
Warning
You should only use this term if you really know what you’re doing because it will generally behave poorly. Check out
SHOTerm
instead!This term has the form
\[k(\tau) = \frac{1}{2}\,\left[(a_j + b_j)\,e^{-(c_j+d_j)\,\tau} + (a_j - b_j)\,e^{-(c_j-d_j)\,\tau}\right]\]with the parameters
a
,b
,c
, andd
.This term will only correspond to a positive definite kernel (on its own) if \(a_j\,c_j \ge b_j\,d_j\).
- Parameters:
a – The real part of amplitude.
b – The imaginary part of amplitude.
c – The real part of the exponent.
d – The imaginary part of exponent.