We consider the problem of efficiently computing the infinity norm of the inverse of a lower bidiagonal matrix L in linear time.
The infinity norm of a matrix $A \in \mathbb{R}^{n \times n}$ is just the maximum absolute row sum of that matrix,

$$\|A\|_\infty = \max_{1 \le i \le n} \sum_{j=1}^{n} |a_{ij}|,$$

or alternatively, the operator norm induced by the vector infinity norm,

$$\|A\|_\infty = \sup_{x \neq 0} \frac{\|Ax\|_\infty}{\|x\|_\infty},$$
so computing the norm shouldn't be all that difficult. For a full matrix, it takes approximately $O(n)$ time to compute the sum of a single row, and $n$ such computations to get the maximum row sum, for a total of $O(n^2)$ time.
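As a quick sanity check, the row-sum definition translates directly into code. Here is a minimal Python sketch (the function name is mine, not from the post):

```python
def inf_norm(A):
    """Infinity norm of a dense matrix given as a list of rows:
    the maximum absolute row sum. Each row costs O(n) to sum and
    there are n rows, so this is O(n^2) overall."""
    return max(sum(abs(v) for v in row) for row in A)

A = [[1.0, -2.0],
     [3.0,  4.0]]
print(inf_norm(A))  # absolute row sums are 3 and 7 -> 7.0
```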
If $L$ is also lower bidiagonal, then it has the shape

$$L = \begin{pmatrix} \alpha_1 & & & \\ \beta_2 & \alpha_2 & & \\ & \ddots & \ddots & \\ & & \beta_n & \alpha_n \end{pmatrix},$$
that is, the diagonal and the first lower subdiagonal are filled, but everything else is zero. Being bidiagonal grants the matrix a lot of nice properties; for one thing, it only takes $O(1)$ operations to get a row sum now, so it only takes $O(n)$ time to find the infinity norm of a bidiagonal matrix. Similarly, as we shall see later, it also only takes $O(n)$ time to solve a system of linear equations of the form

$$Lx = b$$
if $L$ is bidiagonal.
2. The Problem
Suppose you’re given a lower bidiagonal matrix $L$ (defined above). How do you find the infinity norm of its inverse, $\|L^{-1}\|_\infty$?
One naive solution is to first compute the inverse explicitly, and then take the infinity norm of it. As we’ve hinted above, it takes $O(n)$ time to solve a single bidiagonal system, and the inverse is the solution $X$ of the matrix system

$$LX = I,$$

which can be solved column by column as

$$L x_j = e_j, \qquad j = 1, \dots, n,$$

where $e_j$ is the $j$-th column of the identity; letting $x_j$ denote the solution of the $j$-th system,

$$L^{-1} = \begin{pmatrix} x_1 & x_2 & \cdots & x_n \end{pmatrix},$$
and then we would look at the maximum absolute row sum. The bottleneck here is the $n$ solves against the columns of the identity at cost $O(n)$ each, so the entire process takes $O(n^2)$ time. This is actually pretty good, because the inverse takes $O(n^2)$ time just to print out, so finding its maximum row sum in this amount of time is pretty reasonable. But the world is a cruel place, and somewhere out there, a billion-variable dynamic program awaits us that needs to shave off every second possible.
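Concretely, the naive baseline can be sketched in Python (my own naming; the bidiagonal matrix is passed as its diagonal `a` and subdiagonal `b`):

```python
def inf_norm_of_inverse_naive(a, b):
    """||L^{-1}||_inf for lower bidiagonal L with diagonal a[0..n-1]
    and subdiagonal b[0..n-2] (b[i] sits at L[i+1][i]).

    Solves L x_j = e_j for each column of the identity via forward
    substitution, then takes the maximum absolute row sum:
    n solves at O(n) each, so O(n^2) time and space.
    """
    n = len(a)
    cols = []
    for j in range(n):
        e = [0.0] * n
        e[j] = 1.0
        x = [0.0] * n
        x[0] = e[0] / a[0]
        for i in range(1, n):
            x[i] = (e[i] - b[i - 1] * x[i - 1]) / a[i]
        cols.append(x)
    # row i of the inverse is (cols[0][i], ..., cols[n-1][i])
    return max(sum(abs(cols[j][i]) for j in range(n)) for i in range(n))
```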
Can we do better than $O(n^2)$ time and space?
3. Solution Attempt
Exploring the Inverse
Let’s begin with a reasonable direction: let’s figure out what the inverse of a bidiagonal matrix looks like. We’ll mainly be looking at a small concrete case (say $4 \times 4$), but everything generalizes to any $n$. Suppose that $L$ looks like

$$L = \begin{pmatrix} \alpha_1 & & & \\ \beta_2 & \alpha_2 & & \\ & \beta_3 & \alpha_3 & \\ & & \beta_4 & \alpha_4 \end{pmatrix}.$$
Now, if we take the inverse of $L$, we’re not guaranteed that its bidiagonal structure will be preserved, but we know that it will at least be lower triangular. Let’s, for the sake of being punny, denote $\Gamma = L^{-1}$ (an upside-down $L$); then we know that

$$L\Gamma = I;$$
let’s start writing out the individual equations generated by this system:

$$\begin{aligned} \alpha_1 \gamma_{11} &= 1, \\ \beta_2 \gamma_{11} + \alpha_2 \gamma_{21} &= 0, \qquad & \alpha_2 \gamma_{22} &= 1, \\ \beta_3 \gamma_{21} + \alpha_3 \gamma_{31} &= 0, \qquad & \beta_3 \gamma_{22} + \alpha_3 \gamma_{32} &= 0, \qquad \alpha_3 \gamma_{33} = 1, \\ &\vdots \end{aligned}$$
Inductively, we can show that the diagonal equations give

$$\gamma_{ii} = \frac{1}{\alpha_i};$$
furthermore, the zero equations can be inductively generalized as

$$\beta_i \gamma_{i-1,j} + \alpha_i \gamma_{ij} = 0 \qquad \text{for all } i > j.$$
Suppose that $j = 1$; then we know that $\gamma_{11} = 1/\alpha_1$. We also have an equation:

$$\beta_2 \gamma_{11} + \alpha_2 \gamma_{21} = 0 \implies \gamma_{21} = -\frac{\beta_2}{\alpha_2}\,\gamma_{11},$$
and so on; we can compute every gamma with the recurrence

$$\gamma_{jj} = \frac{1}{\alpha_j}, \qquad \gamma_{ij} = -\frac{\beta_i}{\alpha_i}\,\gamma_{i-1,j} \quad (i > j).$$
Great, this gives us a nice algorithm to fill out the entries of the inverse in $O(n^2)$ time, but that doesn’t really help us yet.
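The recurrence is short enough to state as code; here is a Python sketch (my representation: diagonal `a`, subdiagonal `b`, with `b[i]` at position $(i+2, i+1)$ in 1-based terms):

```python
def bidiagonal_inverse(a, b):
    """Fill the lower-triangular inverse of a lower bidiagonal matrix
    using gamma[j][j] = 1/a[j] and gamma[i][j] = -(b_i/a_i) * gamma[i-1][j].

    O(n^2) entries, O(1) work per entry.
    """
    n = len(a)
    g = [[0.0] * n for _ in range(n)]
    for j in range(n):
        g[j][j] = 1.0 / a[j]
        for i in range(j + 1, n):
            g[i][j] = -(b[i - 1] / a[i]) * g[i - 1][j]
    return g
```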
Unrolling the recurrence shows that each entry is a product,

$$\gamma_{ij} = \frac{1}{\alpha_j} \prod_{k=j+1}^{i} \left(-\frac{\beta_k}{\alpha_k}\right), \qquad |\gamma_{ij}| = \frac{1}{|\alpha_j|} \prod_{k=j+1}^{i} \frac{|\beta_k|}{|\alpha_k|},$$

so it doesn’t matter what the sign of any individual entry of $L$ is. The $\gamma_{ij}$s are entirely multiplicative, and hence their magnitudes do not change if the signs of any of the $\alpha_k$ or $\beta_k$ are changed. As such, we get the following lemma:
**Lemma.** Suppose that $\tilde L$ is lower bidiagonal such that $|\tilde l_{ij}| = |l_{ij}|$ for all $i, j$; then $\|\tilde L^{-1}\|_\infty = \|L^{-1}\|_\infty$.
this basically says that we are allowed to arbitrarily flip the signs of the entries of $L$ without changing the value of $\|L^{-1}\|_\infty$. This follows from the fact that the entries of the inverse, $\gamma_{ij}$, depend only multiplicatively on the entries of $L$, so flipping the signs of elements of $L$ will only flip the signs of the $\gamma_{ij}$; the magnitude of each element of $L^{-1}$ remains the same.
In other words, $\|L^{-1}\|_\infty$ is invariant under transformations that flip the signs of entries of $L$. This is going to come in very handy, and the above recurrence for $\gamma_{ij}$ is constructive, in that it tells us which elements of $L$ to flip to reduce the problem to an easier one.
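To see the invariance concretely, here is a quick numerical check in Python (my naming; the $O(n^2)$ norm computation is inlined to keep the sketch self-contained):

```python
import itertools

def inv_norm(a, b):
    """||L^{-1}||_inf for lower bidiagonal L with diagonal a and
    subdiagonal b, via the gamma recurrence and max absolute row sum."""
    n = len(a)
    g = [[0.0] * n for _ in range(n)]
    for j in range(n):
        g[j][j] = 1.0 / a[j]
        for i in range(j + 1, n):
            g[i][j] = -(b[i - 1] / a[i]) * g[i - 1][j]
    return max(sum(abs(v) for v in row) for row in g)

a, b = [2.0, 3.0, 5.0], [1.0, 4.0]
base = inv_norm(a, b)
# every way of flipping signs of the entries of L gives the same norm
for sa in itertools.product([1, -1], repeat=3):
    for sb in itertools.product([1, -1], repeat=2):
        flipped = inv_norm([s * v for s, v in zip(sa, a)],
                           [s * v for s, v in zip(sb, b)])
        assert abs(flipped - base) < 1e-12
```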
Row Summation as a Linear Operation
Now, consider the operation

$$A \mathbf{1} = \begin{pmatrix} \sum_j a_{1j} \\ \vdots \\ \sum_j a_{nj} \end{pmatrix};$$
it seems that multiplying any matrix by the column of ones ($\mathbf{1}$ from here on) returns the column of its row sums. Great! Shouldn’t it then be the case that we just need to find the maximum entry of $L^{-1}\mathbf{1}$? Unfortunately, what we really want is the sum of the absolute values of each row, so if $L^{-1}$ ever contains a negative number, this will break :(
Some of you may already see how to resolve this; but for the moment, let’s just check whether it’s actually possible to do this row-sum operation in linear time! The problem is computing $L^{-1}\mathbf{1}$, or equivalently, finding an $x$ such that

$$Lx = \mathbf{1};$$
forward substitution resolves each entry from the previous one,

$$x_1 = \frac{1}{\alpha_1}, \qquad x_k = \frac{1 - \beta_k x_{k-1}}{\alpha_k},$$

so it takes $O(n)$ time to solve this system. If only we could just do this…
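A linear-time forward substitution looks like this in Python (a sketch with my own naming; the right-hand side is passed explicitly so it works for any $b$, not just $\mathbf{1}$):

```python
def solve_lower_bidiagonal(a, b, rhs):
    """Solve L x = rhs where L has diagonal a[0..n-1] and
    subdiagonal b[0..n-2] (b[i] sits at L[i+1][i]).

    Each unknown is resolved from the previous one: O(n) time.
    """
    n = len(a)
    x = [0.0] * n
    x[0] = rhs[0] / a[0]
    for i in range(1, n):
        x[i] = (rhs[i] - b[i - 1] * x[i - 1]) / a[i]
    return x
```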
A Final Observation
By now, you might be a little perplexed by the lemma above, which we took a lot of pain and agony to develop but never used. The problem with maximizing the entries of $L^{-1}\mathbf{1}$ is that the elements of $L^{-1}$ aren’t guaranteed to be positive. However, the lemma and its construction seem to say that if we use the correct signs in the elements of $L$, then we can manipulate the signs of the elements of $L^{-1}$ without disturbing its final infinity norm. At this point, I had an idea:
can I somehow force all of the entries of the inverse to be positive by flipping the signs of the right elements? If we can, then from $L$ we can generate an $\tilde L$ whose inverse contains only positive numbers, so

$$\|L^{-1}\|_\infty = \|\tilde L^{-1}\|_\infty = \max_i \left(\tilde L^{-1}\mathbf{1}\right)_i,$$
which can be computed in $O(n)$ time! This is quite exciting, but how do we do it?
Recall that in the proof of the lemma, we used the recurrence for the elements of the inverse,

$$\gamma_{jj} = \frac{1}{\alpha_j}, \qquad \gamma_{ij} = -\frac{\beta_i}{\alpha_i}\,\gamma_{i-1,j} \quad (i > j).$$
Now, suppose that $\alpha_i > 0$ for every $i$; then we know immediately that $\gamma_{ii} = 1/\alpha_i > 0$, so the diagonal of $\Gamma$ must be positive. By the strong induction hypothesis, let’s assume that $\gamma_{i-1,j} > 0$ for every $j < i$; then for $\gamma_{ij} = -(\beta_i/\alpha_i)\,\gamma_{i-1,j}$ to also be positive, it had better be the case that
$$-\frac{\beta_i}{\alpha_i} > 0,$$

so $\beta_i$ must be negative, which gives a construction of $\tilde L$: take $\tilde\alpha_i = |\alpha_i|$ and $\tilde\beta_i = -|\beta_i|$.
Taking the above into consideration, the following algorithm can be used to compute the infinity norm of the inverse.
```matlab
% Flip signs so that the inverse of the flipped matrix Ltilde is
% entrywise positive: positive diagonal, negative subdiagonal.
n = size(L, 1);
d = abs(diag(L));         % diagonal of Ltilde
s = -abs(diag(L, -1));    % subdiagonal of Ltilde

% Solve Ltilde * x = ones(n, 1) by forward substitution.
x = zeros(n, 1);
x(1) = 1 / d(1);
for k = 2:n
    x(k) = (1 - s(k-1) * x(k-1)) / d(k);
end

% Every entry of inv(Ltilde) is positive, so x holds the absolute
% row sums of inv(L); the largest one is the infinity norm.
nrm = max(x);
```
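For readers who want to try the algorithm, here is a direct Python port (my naming), cross-checked against the $O(n^2)$ explicit-inverse approach:

```python
def inv_norm_fast(a, b):
    """O(n) infinity norm of L^{-1}: flip signs so the inverse of the
    flipped matrix is entrywise positive, solve Ltilde x = ones, take max."""
    n = len(a)
    d = [abs(v) for v in a]      # positive diagonal of Ltilde
    s = [-abs(v) for v in b]     # negative subdiagonal of Ltilde
    x = [0.0] * n
    x[0] = 1.0 / d[0]
    for k in range(1, n):
        x[k] = (1.0 - s[k - 1] * x[k - 1]) / d[k]
    return max(x)

def inv_norm_slow(a, b):
    """Reference: build L^{-1} entrywise, then take max absolute row sum."""
    n = len(a)
    g = [[0.0] * n for _ in range(n)]
    for j in range(n):
        g[j][j] = 1.0 / a[j]
        for i in range(j + 1, n):
            g[i][j] = -(b[i - 1] / a[i]) * g[i - 1][j]
    return max(sum(abs(v) for v in row) for row in g)

a, b = [2.0, -3.0, 5.0, 4.0], [1.0, -4.0, 2.0]
assert abs(inv_norm_fast(a, b) - inv_norm_slow(a, b)) < 1e-12
```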