Bug getting multiple degenerate eigenvalues #38

Closed
mtfishman opened this issue Aug 9, 2020 · 7 comments
@mtfishman

Hi Jutho,

Thanks for the nice package. I wanted to point out a small bug I came across getting multiple degenerate eigenvalues:

using KrylovKit
using LinearAlgebra

eigs = [1; 1; 0; 0]
U = qr(randn(4, 4)).Q
M = U*Diagonal(eigs)*U'
@show eigsolve(Hermitian(M), 2, :LM)
@show eigsolve(Hermitian(M), 2, :SR)

results in:

eigsolve(Hermitian(M), 2, :LM) = ([1.0000000000000004, -5.204170427930421e-18], [[-0.7528788248393644, -0.11875011251302399, 0.4496317483253351, 0.4657286514533373], [-0.45690159127657226, -0.5426827326209265, -0.21223585437236434, -0.6720805976390425]], ConvergenceInfo: 2 converged values after 1 iterations and 2 applications of the linear map;
norms of residuals are given by (2.5979283617878776e-16, 2.3812210978860583e-17).
)
eigsolve(Hermitian(M), 2, :SR) = ([2.7755575615628914e-17, 1.0000000000000004], [[-0.4451190535974025, -0.5638946680346222, -0.20606235129617, -0.6644020912557167], [0.5867198946724742, 0.19434763988529719, 0.3940429084822651, -0.6802344789419107]], ConvergenceInfo: 2 converged values after 1 iterations and 2 applications of the linear map;
norms of residuals are given by (1.5430643551413483e-16, 1.960324084219161e-16).
)

Asking for 1, 3 or 4 eigenvectors appears to work. This is on v0.5.2.

Nothing urgent, thought I'd bring it to your attention.

-Matt

@Jutho
Owner

Jutho commented Aug 9, 2020

Not sure if this is really incorrect from a mathematical point of view. If you have only two different eigenspaces, then a random starting vector will have support in both, i.e. it can be written as a linear combination of a vector in the one eigenspace and a vector in the other eigenspace. But those vectors are eigenvectors, and hence, another linearly independent eigenvector in those same eigenspaces can in theory never appear in the Krylov subspace of that one starting vector. In this particular case, after applying the operator once, an invariant subspace is detected and so the algorithm terminates.

This is of course a consequence of KrylovKit working with arbitrary objects instead of just Vectors. It does not rely on length or any kind of dimension information. So if it detects an invariant subspace, it has no way of distinguishing between the case where this is really the full dimension of the problem and the case where it is just a subspace and other linearly independent directions exist. Also, it wouldn't know how to create a new random vector that might have support in those linearly independent directions.

I am actually surprised that it works when asking for 3 or 4 eigenvalues. It probably amplifies the almost-zero residual after the first operator application (which should theoretically be zero, and so probably has components of order machine precision), because it then tries to grow the Krylov dimension even further.

So note that, in theory, a Krylov method starting from one random vector can never find several degenerate eigenvectors; it only finds one eigenvector in every distinct eigenspace, namely the one on which the starting vector had support. In larger problems, however, this should be less of an issue, as loss of precision during the orthogonalization typically generates new linearly independent directions in degenerate eigenspaces. But with small test cases like these, this problem shows up.
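This can be checked directly: for a matrix with only two distinct eigenvalues, the Krylov subspace generated from any single starting vector has dimension at most 2, so it can never contain two linearly independent vectors from the same degenerate eigenspace. A minimal sketch (variable names are illustrative):

```julia
using LinearAlgebra

# Projector-like matrix: eigenvalues are only 0 and 1 (each twofold degenerate),
# so there are just two distinct eigenspaces.
U = qr(randn(4, 4)).Q
M = U * Diagonal([1, 1, 0, 0]) * U'

# Krylov matrix built from a single random starting vector v:
# its columns are v, M*v, M^2*v, M^3*v.
v = randn(4)
K = hcat(v, M * v, M^2 * v, M^3 * v)

# Writing v = a + b with a in the eigenvalue-1 eigenspace and b in the
# eigenvalue-0 eigenspace, every column M^k*v (k >= 1) equals a, so the
# whole Krylov subspace is span{a, b}.
@show rank(K)   # 2
```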

@mtfishman
Author

I see. You say that KrylovKit wouldn't know how to create a new random vector, but at least for the interface I used in the first post, I don't provide an initial vector. Doesn't that mean KrylovKit is generating one internally, I guess from the size of the input matrix? At least for AbstractMatrix types, it seems like you could use the size to detect that not all vectors of the subspace were found, and maybe eventually there could be an optional size overload that people could provide for their own custom types to help detect situations like this.

At the very least, I found it a bit surprising, since naively I would expect that using :LM and searching for at least 2 eigenvectors would have found 2 eigenvectors with eigenvalue 1. If I had input that matrix without knowing the eigenvalues, I guess I would have had to run eigsolve twice to see that after two runs it output two linearly independent vectors (or requested more eigenvalues). It may be helpful to document this behavior.

Also, I compared a larger problem to Arpack.jl, which handles this case in the way I would have expected:

using Arpack
using KrylovKit
using LinearAlgebra

n = 500
d = [i < n ÷ 2 ? 1 : 0 for i in 1:n]
U = qr(randn(n, n)).Q
M = Hermitian(U*Diagonal(d)*U')

λ_kk,_ = eigsolve(M, 2, :LM, ishermitian=true)
@show λ_kk

λ_ar,_ = eigs(M, nev=2, which=:LM)
@show λ_ar

results in:

λ_kk = [1.0, -2.7755575615628914e-17]
λ_ar = [1.0000000000000018, 1.000000000000001]

And as you anticipated, for this case when more than 2 eigenvalues are requested, it stops after finding 2 (one of eigenvalue 1 and one of eigenvalue 0). If I add noise to the eigenvalues of the form eps * randn(n), for eps >= 1e-12 it breaks the degeneracy enough to find multiple eigenvalues near 1.
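The noise trick described above can be sketched as follows (the value of eps is illustrative; per the observation above, anything >= 1e-12 was enough to break the degeneracy):

```julia
using KrylovKit
using LinearAlgebra

n = 500
eps = 1e-10   # any eps >= 1e-12 was observed to split the degenerate eigenvalues
d = [i < n ÷ 2 ? 1.0 : 0.0 for i in 1:n] .+ eps .* randn(n)
U = qr(randn(n, n)).Q
M = Hermitian(U * Diagonal(d) * U')

# With the degeneracy slightly broken, :LM now finds two eigenvalues near 1,
# instead of one near 1 and one near 0.
λ, _ = eigsolve(M, 2, :LM, ishermitian=true)
```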

This example seems a bit contrived, but it came up when diagonalizing a free fermion correlation matrix, where at half filling it would be a Hermitian matrix with half the eigenvalues 0 and half the eigenvalues 1.

@Jutho
Owner

Jutho commented Aug 10, 2020

> I see. You say that KrylovKit wouldn't know how to create a new random vector, but at least for the interface I used in the first post, I don't provide an initial vector. Doesn't that mean KrylovKit is generating one internally, I guess from the size of the input matrix? At least for AbstractMatrix types, it seems like you could use the size to detect that not all vectors of the subspace were found, and maybe eventually there could be an optional size overload that people could provide for their own custom types to help detect situations like this.

Yes, in the eigsolve interface, the case of an AbstractMatrix is special cased and a random starting vector of the correct size is generated. But once inside the routines that construct the Krylov subspace etc, no distinction is made between AbstractMatrix and general linear operators implemented via functions. In fact, a matrix is wrapped in a function such that you can write A(v) instead of A*v.

Arpack(.jl) is very different, as it only works with Vector objects and so it does indeed know the size, and can construct a new random vector if the residual at some point goes to zero.

> At the very least, I found it a bit surprising, since naively I would expect that using :LM and searching for at least 2 eigenvectors would have found 2 eigenvectors with eigenvalue 1. If I had input that matrix without knowing the eigenvalues, I guess I would have had to run eigsolve twice to see that after two runs it output two linearly independent vectors (or requested more eigenvalues). It may be helpful to document this behavior.

Nonetheless, that is the theoretical behaviour if you know about Krylov methods. But again, typically you don't run into this. If you have many different eigenvalues, of which e.g. the largest is two-fold degenerate, then by orthogonalizing many times and restarting etc., you will at some point generate a contribution along a second linearly independent eigenvector corresponding to the largest eigenvalue.

> This example seems a bit contrived, but it came up when diagonalizing a free fermion correlation matrix, where at half filling it would be a Hermitian matrix with half the eigenvalues 0 and half the eigenvalues 1.

Yes, this case will likely always fail, though I am wondering what you were looking for here. Getting a few eigenvectors of the eigenvalue-1 subspace does not seem very meaningful: which ones you get will be completely random, and none of them is more important than the others. Or maybe you want the full eigenspace of eigenvalue 1, but then I don't think an iterative eigensolver is really what you want to use.

@mtfishman
Author

This came up in the process of implementing the algorithm in https://arxiv.org/abs/1504.07701. That algorithm is based on successively diagonalizing subblocks of the correlation matrix. As the subblocks get large enough, you find many eigenvalues very close to 0 or 1. The simple version of the algorithm isolates a single eigenvalue that is closest to 0 or 1, though modifications may involve isolating some subset of the eigenvalues closest to 0 or 1. It is nice to not have to diagonalize the entire subblock, though generally the blocks are not very large and that is fine to do. I was attempting to use eigsolve to get a small subset of the largest eigenvalues (closest to 1) and the smallest eigenvalues (closest to 0), and the behavior of eigsolve was just a bit confusing (depending on the block size, the first eigenvalue is consistently the largest or smallest, but the second one isn't always the second largest or second smallest, respectively). I was thinking of eigsolve(M, 2, :LM) as a "black box" to get the two largest eigenvalues, regardless of details like numerical degeneracies. With the current behavior, it just requires a bit more work on the user's end (say, by asking for some subspace, seeing if I get what I expect, and then running it again with a different starting vector to see if I get new linearly independent eigenvectors).
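That run-it-again idea can be sketched using eigsolve's optional starting-vector argument: deflate the eigenvector found in the first run out of a fresh starting vector, so a second run can only converge to something linearly independent. A sketch, not a robust implementation; it assumes the first run converged to high accuracy:

```julia
using KrylovKit
using LinearAlgebra

U = qr(randn(4, 4)).Q
M = Hermitian(U * Diagonal([1.0, 1.0, 0.0, 0.0]) * U')

# First run from a random starting vector: finds one vector per eigenspace.
λ1, V1, _ = eigsolve(M, randn(4), 1, :LM)
v1 = V1[1]

# Second run from a fresh starting vector orthogonalized against v1.
# Since M is Hermitian, the Krylov subspace built from x0 stays (numerically)
# orthogonal to v1, so any eigenvalue-1 eigenvector it finds is independent of v1.
x0 = randn(4)
x0 -= (v1' * x0) * v1
λ2, V2, _ = eigsolve(M, x0, 1, :LM)
v2 = V2[1]

@show abs(v1' * v2)   # close to 0: two independent vectors in the degenerate eigenspace
```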

Anyway, now that you explain the behavior is based on KrylovKit only using a single random starting vector, the behavior makes sense. It still seems like there could be a fairly generic way to optionally handle this situation, for example by having a mode where the user passes two starting vectors or taking the starting vector and checking for something like a randn!, setindex!, or map! overload, if it is not an AbstractVector. It could be helpful in the docs to point out that if the dominant subspace is degenerate, KrylovKit may output only a subset of the degenerate eigenvectors (I can add that if you think that is useful). Otherwise, feel free to close.

@Jutho
Owner

Jutho commented Aug 10, 2020

It's probably useful to add some warning/disclaimer, I agree. I think this question came up before.

Also, I am more than happy to accept PRs that add block algorithms or more flexibility to supply several starting vectors, etc. As you know, designing Krylov methods is not my day job, so I am by no means an expert, nor do I have the time to look into how this should be done. For linear problems, I would be willing to extend GMRES to include several vectors, which seems not too difficult. There is a method called LGMRES, available in Python (SciPy), which supports this and also has better convergence overall.

But I don't know if similar developments exist for eigenvalue problems. Another thing would be a Davidson algorithm, which I believe you guys have experience with. I don't know how it works exactly and what you need for it, i.e. whether it also fits in this generic paradigm of applying linear functions to abstract vectors.

@mtfishman
Author

No worries, I'm also not an expert on these methods and it seems like there are endless variations that could be implemented, so I generally like the strategy of keeping them simple. We have a few of these methods implemented in the C++ version of ITensor, but there are always corner cases cropping up that cause problems and I've found KrylovKit to be very fast and reliable. I think a warning in the docs is sufficient here.

@Jutho
Owner

Jutho commented Mar 16, 2022

I finally added the requested comment/documentation, so I will close this unless there are other concerns.

@Jutho Jutho closed this as completed Mar 16, 2022