theme | title | author | institute | date | toc | slide-level |
---|---|---|---|---|---|---|
solarized | Numerical Reading Group: Eigendecompositions | Richard Shaw | UBC | Mar 31st 2021 | false | 2 |
$$ \newcommand{\ensuremath}[1]{#1} \newcommand\pdiff[1]{\frac{\partial}{\partial #1}} \newcommand\diff[2]{\frac{d #1}{d #2}} \newcommand{\curl}{\,\mbox{curl}\,} \newcommand\del{\nabla} \newcommand{\grad}[1][]{\ensuremath{\mathbf{\nabla} #1}}
\newcommand{\la}{\ensuremath{\left\langle}} \newcommand{\ra}{\ensuremath{\right\rangle}}
\newcommand{\lv}{\ensuremath{\left\lvert}} \newcommand{\rv}{\ensuremath{\right\rvert}}
\newcommand{\ls}{\ensuremath{\left[}} \newcommand{\rs}{\ensuremath{\right]}}
\newcommand{\lp}{\ensuremath{\left(}} \newcommand{\rp}{\ensuremath{\right)}}
\newcommand{\vecr}[1]{\mathbf{#1}} \newcommand{\mat}[1]{\mathbf{#1}}
\newcommand{\hconj}{\ensuremath{\dagger}}
\newcommand{\va}{\vecr{a}} \newcommand{\vc}{\vecr{c}} \newcommand{\vf}{\vecr{f}} \newcommand{\vk}{\vecr{k}} \newcommand{\vn}{\vecr{n}} \newcommand{\vr}{\vecr{r}} \newcommand{\vs}{\vecr{s}} \newcommand{\vt}{\vecr{t}} \newcommand{\vu}{\vecr{u}} \newcommand{\vv}{\vecr{v}} \newcommand{\vw}{\vecr{w}} \newcommand{\vx}{\vecr{x}} \newcommand{\vy}{\vecr{y}}
\newcommand{\vdhat}{\hat{\vecr{d}}} \newcommand{\vnhat}{\hat{\vecr{n}}} \newcommand{\vphat}{\hat{\vecr{p}}} \newcommand{\vuhat}{\hat{\vecr{u}}} \newcommand{\vvhat}{\hat{\vecr{v}}} \newcommand{\vxhat}{\hat{\vecr{x}}} \newcommand{\vyhat}{\hat{\vecr{y}}} \newcommand{\vzhat}{\hat{\vecr{z}}}
\newcommand{\vnt}{\tilde{\vecr{n}}} \newcommand{\vvt}{\tilde{\vecr{v}}} \newcommand{\vxt}{\tilde{\vecr{x}}}
\newcommand{\vnb}{\bar{\vecr{n}}} \newcommand{\vvb}{\bar{\vecr{v}}}
\newcommand{\vA}{\vecr{A}} \newcommand{\vE}{\vecr{E}} \newcommand{\vH}{\vecr{H}} \newcommand{\vS}{\vecr{S}}
\newcommand{\veps}{\vecr{\varepsilon}}
\newcommand{\mA}{\mat{A}} \newcommand{\mB}{\mat{B}} \newcommand{\mC}{\mat{C}} \newcommand{\mE}{\mat{E}} \newcommand{\mF}{\mat{F}} \newcommand{\mG}{\mat{G}} \newcommand{\mI}{\mat{I}} \newcommand{\mJ}{\mat{J}} \newcommand{\mK}{\mat{K}} \newcommand{\mL}{\mat{L}} \newcommand{\mM}{\mat{M}} \newcommand{\mN}{\mat{N}} \newcommand{\mP}{\mat{P}} \newcommand{\mQ}{\mat{Q}} \newcommand{\mR}{\mat{R}} \newcommand{\mS}{\mat{S}} \newcommand{\mT}{\mat{T}} \newcommand{\mU}{\mat{U}} \newcommand{\mV}{\mat{V}} \newcommand{\mW}{\mat{W}} \newcommand{\mX}{\mat{X}} \newcommand{\mY}{\mat{Y}}
\newcommand{\mNh}{\mat{N}^{\scriptscriptstyle -\frac{1}{2}}}
\newcommand{\mBt}{\tilde{\mat{B}}} \newcommand{\mCt}{\tilde{\mat{C}}} \newcommand{\mNt}{\tilde{\mat{N}}} \newcommand{\mQt}{\tilde{\mat{Q}}} \newcommand{\mSt}{\tilde{\mat{S}}}
\newcommand{\mGamma}{\mat{\Gamma}} \newcommand{\mLambda}{\mat{\Lambda}} \newcommand{\mLambdat}{\tilde{\mat{\Lambda}}} \newcommand{\mSigma}{\mat{\Sigma}}
\newcommand{\mBb}{\bar{\mat{B}}} \newcommand{\mCb}{\bar{\mat{C}}} \newcommand{\mFb}{\bar{\mat{F}}} \newcommand{\mNb}{\bar{\mat{N}}} \newcommand{\mSb}{\bar{\mat{S}}}
\newcommand{\mChat}{\hat{\mat{C}}}
\newcommand{\brsc}[1]{{\ensuremath{\scriptscriptstyle (#1)}}}
\newcommand{\calF}{\mathcal{F}} \newcommand{\calG}{\mathcal{G}} \newcommand{\calH}{\mathcal{H}} \newcommand{\calP}{\mathcal{P}} \newcommand{\calW}{\mathcal{W}}
\newcommand{\tmax}{\ensuremath{\text{max}}} \newcommand{\tmin}{\ensuremath{\text{min}}}
\newcommand{\kprod}{\otimes}
\newcommand{\dhn}{d^2\hat{n}}
\newcommand{\hMpc}{\;h^{-1}\,\mathrm{Mpc}} \newcommand{\ihMpc}{\;h\,\mathrm{Mpc}^{-1}}
\newcommand{\eps}{\varepsilon}
\newcommand{\prob}[1]{\mathrm{Pr}(#1)}
$$
- Eigenvalue methods
- Singular Value Decompositions
- Numerical precision

- Iterative methods
    - Operations required $\sim O(M N_\text{ev} N_\text{mv})$
    - Useful when you can evaluate a matrix-vector multiplication quickly, i.e. $N_\text{mv} \ll N^2$, e.g. a sparse matrix
    - ... or when you only want a few eigenvalues/vectors, i.e. $N_\text{ev} \ll N$
- Dense methods
    - Faster if you want all the eigenvalues/vectors and the matrix is dense
    - What is done by `la.eigh` (see the comparison sketch below)
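A rough comparison of the two routes (a minimal sketch of my own, not from the talk; the matrix `A` and the choice of `k` are illustrative):

```python
import numpy as np
import scipy.linalg as la
import scipy.sparse as sparse
import scipy.sparse.linalg as sla

# Illustrative sparse symmetric matrix
n = 2000
A = sparse.random(n, n, density=1e-3, format="csr")
A = A + A.T  # symmetrise so eigsh/eigh apply

# Iterative (Lanczos): needs only matrix-vector products, so it costs
# ~ O(M N_ev N_mv) and wins when N_ev << N or the matrix is sparse
evals_few, evecs_few = sla.eigsh(A, k=10, which="LM")

# Dense: O(N^3), but returns the full spectrum and is the better choice
# when you want all eigenpairs of a dense matrix
evals_all, evecs_all = la.eigh(A.toarray())
```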
- Use eigendecomposition to write a matrix as
  $$\mA = \mX \mLambda \mX^\hconj$$
  with $\mX$ an orthogonal/unitary matrix of eigenvectors and $\mLambda$ the diagonal matrix of eigenvalues
- As a transformation matrix we can think of this as
  $$\mA \vv = \mX \cdot (\mLambda \cdot (\mX^\hconj \vv))$$
  i.e.
    - Extract the components of $\vv$ along each eigenvector
    - Scale each component by the corresponding eigenvalue
    - Add the scaled components up along the eigenvectors to get the final vector
- Can also view it as a sum of rank-1 outer products
  $$\mA = \sum_{i=0}^{N-1} \lambda_i \, \vx_i \otimes \vx_i^\hconj$$
- We can view a matrix as a sum of components ranked by their eigenvalue: components with larger eigenvalues matter more to the matrix
- If we truncate the sum at $M \ll N$ to exclude the small eigenvalues we are making a low-rank approximation (see the sketch below)
  $$\mA \approx \sum_{i=0}^{M-1} \lambda_i \, \vx_i \otimes \vx_i^\hconj$$
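A minimal numerical sketch of this truncation (illustrative only; the matrix and the cut `M` are made up):

```python
import numpy as np

rng = np.random.default_rng(42)
B = rng.standard_normal((100, 100))
A = B @ B.T  # symmetric positive semi-definite test matrix

evals, X = np.linalg.eigh(A)  # eigenvalues returned in ascending order

# Keep the M largest eigenvalues/eigenvectors
M = 10
idx = np.argsort(evals)[::-1][:M]
A_lowrank = (X[:, idx] * evals[idx]) @ X[:, idx].T

# Relative error is set by the size of the discarded eigenvalues
print(np.linalg.norm(A - A_lowrank) / np.linalg.norm(A))
```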
- Generalisation of the eigendecomposition to non-square, non-symmetric/hermitian matrices
  $$\mA = \mU \mSigma \mV^\hconj$$
- Nomenclature:
    - Diagonal entries of $\mSigma$ are the singular values
    - Columns of $\mU$ are the left singular vectors
    - Columns of $\mV$ are the right singular vectors
- This is used very frequently in CHIME
- Both interpretations still apply: either as a transformation matrix, or as a general way of breaking a matrix up into rank-1 terms (see the sketch below)
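A small sketch of the decomposition and its rank-1 expansion (my illustration, not CHIME code):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))  # non-square example

U, s, Vh = np.linalg.svd(A, full_matrices=False)
# U:  (6, 4), columns are the left singular vectors
# s:  (4,),  singular values sorted in descending order
# Vh: (4, 4), rows are the (conjugated) right singular vectors

# Rebuild A as a sum of rank-1 outer products: sum_i sigma_i u_i v_i^H
A_rebuilt = sum(s[i] * np.outer(U[:, i], Vh[i]) for i in range(len(s)))
assert np.allclose(A, A_rebuilt)
```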
- Happens in CHIME frequently:
    - Calibration is looking for a rank-1 (or rank-2) approximation to the visibility matrix (see Kotekan `EigenVisIter`)
    - SVD foreground removal searches for a low-rank approximation to the sky, as it is dominated by foregrounds (draco `svdfilter.py`)
- In both cases we suffer from missing data:
    - Short baselines in the visibility matrix are contaminated by cross-talk
    - The data has missing time/frequency samples for foregrounds
- The solution is an iterative scheme (~expectation-maximisation; sketched below):
    - Initialise by zeroing out the bad data
    - Find a low-rank approximation
    - Fill the missing entries with the low-rank solution
    - Iterate
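A minimal sketch of this iteration using a truncated SVD for the low-rank step (purely illustrative; the real versions live in Kotekan's `EigenVisIter` and draco's `svdfilter.py`):

```python
import numpy as np

def lowrank_fill(data, mask, rank, niter=20):
    """Fit a rank-`rank` approximation to `data`, ignoring masked entries.

    `mask` is True where the data is good and False where it is missing.
    """
    X = np.where(mask, data, 0.0)  # initialise by zeroing the bad data
    for _ in range(niter):
        # Low-rank approximation of the current estimate
        U, s, Vh = np.linalg.svd(X, full_matrices=False)
        lowrank = (U[:, :rank] * s[:rank]) @ Vh[:rank]
        # Keep the good entries, fill the bad ones from the model
        X = np.where(mask, data, lowrank)
    return lowrank
```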
- A very common class of matrix used in CHIME/Cosmology/Physics
- For some vector of random variables $\vx$ the covariance matrix $\mC$ is
  $$[\mC]_{ij} = \text{Cov}(x_i, x_j)$$
  and describes the correlations between all the different components of $\vx$
- This can be written in a more matrix-y notation as
  $$\mC = \la (\vx - \bar{\vx})(\vx - \bar{\vx})^\hconj \ra$$
- Covariance matrices are positive semi-definite, i.e. all eigenvalues are $\geq 0$ (a quick numerical check follows below)
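A quick check of that property on a sample covariance (illustrative numbers only):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal((5, 10000))  # 5 variables, 10000 samples each

C = np.cov(x)                  # 5x5 sample covariance matrix
evals = np.linalg.eigvalsh(C)  # eigenvalues of a symmetric matrix
print(evals)                   # all >= 0, up to floating-point error
```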
- For simulations we frequently need to generate samples from a multivariate Gaussian
    - described by mean $\bar{\vx}$ and covariance $\mC$
- Used for generating sky maps (`cora.core.skysim`), gain fluctuations (`draco.synthesis.gain`), ...
- How can we do this?
- Generating standard Gaussian samples (i.e. zero-mean, unit variance, uncorrelated) is easy
- Linear combinations of Gaussian variables are still Gaussian
- Let us define $\vs$ as a vector of standard Gaussian random samples. The most general linear transform we can apply is
  $$\hat{\vx} = \va + \mA \vs$$
- Requiring $\la \hat{\vx} \ra = \bar{\vx}$ and noting that $\la \vs \ra = 0$, we see that $\va = \bar{\vx}$
- Let us look at the covariance of $\hat{\vx}$:
  $$\begin{aligned}
  \hat{\mC} &= \la (\hat{\vx} - \bar{\vx}) (\hat{\vx} - \bar{\vx})^\hconj \ra \\
  &= \mA \la \vs \vs^\hconj \ra \mA^\hconj \\
  &= \mA \mA^\hconj
  \end{aligned}$$
- To produce random samples we need to find $\mA$ such that $\mA \mA^\hconj = \mC$
- One option is the Cholesky decomposition, which factorizes a positive-definite matrix as
  $$\mC = \mL \mL^\hconj$$
  where $\mL$ is lower triangular. This is essentially a restricted version of the LU decomposition
- Another option is the eigendecomposition
  $$\mC = \mX \mLambda \mX^\hconj$$
  Define $\mA = \mX \mLambda^{1/2}$, with $\mLambda^{1/2}$ diagonal and containing the square roots of the eigenvalues (both options are sketched below)
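Both routes to an $\mA$ with $\mA \mA^\hconj = \mC$, as a minimal sketch (the covariance, mean, and sample counts are made up):

```python
import numpy as np

rng = np.random.default_rng(2)

# Small illustrative covariance and mean
B = rng.standard_normal((4, 4))
C = B @ B.T
xbar = np.ones(4)

s = rng.standard_normal((4, 100000))  # standard Gaussian samples

# Option 1: Cholesky, C = L L^H
L = np.linalg.cholesky(C)
x_chol = xbar[:, None] + L @ s

# Option 2: eigendecomposition, A = X Lambda^{1/2}
evals, X = np.linalg.eigh(C)
A = X * np.sqrt(np.maximum(evals, 0.0))  # clip round-off negatives
x_eig = xbar[:, None] + A @ s

# Both sample covariances should converge to C
print(np.abs(np.cov(x_chol) - C).max(), np.abs(np.cov(x_eig) - C).max())
```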
- Almost all matrix-matrix operations are $O(N^3)$, but the constant factors matter in practice:
    - Eigenvalues only: $\sim 4 N^3 / 3$ FLOPs
    - Eigenvalues + eigenvectors by QR: $\sim 9 N^3$ FLOPs
    - Eigenvalues + eigenvectors by divide and conquer: $\sim 4 N^3$ FLOPs
    - Cholesky: $\sim N^3 / 3$ FLOPs
- Cholesky is preferred.... or is it...
- See Notebook
- See `cora.util.nputil.matrix_root_manynull`
- Cholesky is so cheap that if you can do it, it pays off: always worth trying
- If not, fall back to the eigenvalue method (a sketch of this fallback follows below)
    - Set small eigenvalues ($\lesssim 10^{-13} \lambda_\text{max}$) to zero explicitly
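A hedged sketch of that try-Cholesky-then-fall-back strategy (similar in spirit to `cora.util.nputil.matrix_root_manynull`, but this is my illustration, not that implementation):

```python
import numpy as np

def matrix_root(C, threshold=1e-13):
    """Return A such that A @ A.T.conj() ~= C, for positive semi-definite C."""
    try:
        # Cheap (~N^3/3 FLOPs), but requires strictly positive-definite C
        return np.linalg.cholesky(C)
    except np.linalg.LinAlgError:
        # Fall back to the eigendecomposition, explicitly zeroing
        # eigenvalues below ~1e-13 of the largest (round-off noise)
        evals, X = np.linalg.eigh(C)
        evals[evals < threshold * evals.max()] = 0.0
        return X * np.sqrt(evals)
```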