Me? Writing a blog post? Well, you see, it all started with this Wikipedia page on the “Jacobi method for complex Hermitian matrices”. I wanted to apply the Jacobi eigenvalue algorithm to diagonalize complex matrices, since it had worked well for real symmetric matrices, but the page on it only covers real symmetric matrices, not the complex equivalent, Hermitian matrices. I was glad to see there was the additional page supposedly explaining how to do something similar for complex Hermitian matrices, but alas, it’s explained in such complicated, ambiguous terms that it’s at best unreadable, and at worst wrong. I implemented two different interpretations of the page’s content, and both failed on the simplest possible example…

…so I derived my own in a much simpler way, and figured out a more general algorithm. Other people have undoubtedly found the exact same method, but papers I’d found on the topic didn’t seem to state their methods much clearer than the Wikipedia page. The issue is that I derived it in a completely different way than what’s on Wikipedia. It’s not a matter of “fixing a mistake” in the page; by Wikipedia’s rules, this counts as original research (despite that others have probably done the same), so it’s not allowed on Wikipedia. The only other suggestion people had was a blog post to make sure that a correct version is online somewhere, so here it is.

In a nutshell, the Jacobi eigenvalue algorithm works by one-by-one choosing the largest off-diagonal element, and rotating the two vectors associated with that element such that the element becomes zero. Do this enough times and all off-diagonal elements will be quite close to zero (if the matrix is Hermitian), leaving the eigenvalues. The eigenvectors come from applying the rotations to the identity matrix. The only thing necessary to do this is to know the eigenvectors that diagonalize a 2×2 matrix, namely

where is the matrix we’re trying to diagonalize, and is the element we’re trying to eliminate. If is Hermitian, are real, and . However, I won’t assume that is Hermitian, since even if it’s not, this same approach can still be used to get what’s known as the Schur decomposition, namely , where is upper triangular and the eigenvalues of are on the diagonal of . If *is* Hermitian, the Schur decomposition is the same as the eigendecomposition, so . In the non-Hermitian case, it doesn’t give the eigenvectors, but those can be more easily computed afterward. I’ll split this into two cases.

## Case 1:

This case is where you don’t have two diagonal elements that are almost identical compared to the off-diagonal element. If you did, dividing by them could cause chaos, madness, and megalomania… or just invalid results. We start by subtracting from our 2×2 matrix, since we’re only interested in its (pseudo-)eigenvectors , and multiples of can be diagonalized using the same eigenvectors. That basically means that adding/subtracting multiples of won’t change . Then, we divide the 2×2 matrix by , which also doesn’t change the (pseudo-)eigenvectors. This gives us

For simplicity, we’ll just relabel this as

Now, if we want to make by multiplying by a unitary matrix on the right and its conjugate transpose on the right, a bit of trial and error can give you that

is a solution to that problem. Note that the square roots are square roots of complex numbers, not necessarily real numbers. You can verify that this works in the original case by multiplying out and confirming that the lower off-diagonal element is zero. If is Hermitian, both off-diagonal elements should be zero. Case 1 complete.

## Case 2:

This case is where you have two diagonal elements that are almost identical compared to the off-diagonal element. Simple solution: pretend that they are exactly zero. You now have

The zeros make things much simpler, as we can now find

This time, since we’ve assumed that the diagonal elements are exactly the same, it probably won’t make the off-diagonal element exactly zero, but it’ll probably be negligibly small, and if it’s ever the largest at some later point, it’ll just get picked again to be eliminated, only probably without the diagonal elements being so close the 2nd time around.

Again, you don’t have to take my word for it. You can easily verify this solution by just multiplying it out.

## How did you get these?

If you’re wondering how I figured these out (after a weekend of trial and error of different methods), I knew that if the off-diagonal element was non-zero (which it must be, else there’s nothing to do), there is some non-zero component of the second element in each vector of . I temporarily ignored the restriction of having unit vectors and set both of those components to 1, but didn’t ignore the restriction of having the two vectors being complex orthogonal (i.e. ), giving me

I know I want that off-diagonal element zero, so

Since the scales of the 2 vectors are arbitrary, including the relative scales, I scaled one by , and the other by . Then I just normalize them both (they happened to be the same length, ), and that’s the solution above. I did the same for case 2, only it was even simpler. Easy… it just took a weekend to figure out what the easy way was.

## Final Algorithm to Compute Schur Decomposition

V = identity matrix
while (there is an ij such that i>j and |M[i][j]| > 10^-8*|M[i][i]-M[j][j]|) {
select ij with maximum |M[i][j]|/|M[i][i]-M[j][j]|, possibly infinite
// Case 1
if (|M[i][j]| < 10^14*|M[i][i]-M[j][j]|) {
a = 2*M[i][j]/(M[i][i]-M[j][j])
b = 2*M[j][i]/(M[i][i]-M[j][j])
c = sqrt(1+a*b)
L = sqrt(|a|^2 + |1+c|^2)
Vsub = {{1+c, -conj(a)}, {a, 1+conj(c)}}/L
}
// Case 2
else {
a = M[i][j]
b = M[j][i]
L = sqrt(|a|+|b|)
Vsub = {{sqrt(b), -conj(sqrt(a))}, {sqrt(a), conj(sqrt(b))}}/L
}
// Rotate columns i and j of M
for (each row of M) {
newi = M[row][i]*Vsub[0][0]+M[row][j]*Vsub[1][0]
newj = M[row][i]*Vsub[0][1]+M[row][j]*Vsub[1][1]
M[row][i] = newi
M[row][j] = newj
}
// Rotate rows i and j of M
for (each column of M) {
newi = M[i][col]*conj(Vsub[0][0])+M[j][col]*conj(Vsub[1][0])
newj = M[i][col]*conj(Vsub[0][1])+M[j][col]*conj(Vsub[1][1])
M[i][col] = newi
M[j][col] = newj
}
// Rotate cols i and j of V
for (each row of V) {
newi = V[row][i]*Vsub[0][0]+V[row][j]*Vsub[1][0]
newj = V[row][i]*Vsub[0][1]+V[row][j]*Vsub[1][1]
V[row][i] = newi
V[row][j] = newj
}
}
// Diagonal of M now has the eigenvalues of the input M
// If M was Hermitian, V now has the eigenvectors of the input M as columns,
// in the order that the eigenvalues appear in M

Posted in Math, Programming