
Infinity Norm of the Inverse of Lower Bidiagonal Matrices

We consider the problem of computing the infinity norm of the inverse of a lower bidiagonal matrix L in linear time.

1. Introduction

The infinity norm \|\cdot \|_\infty of a matrix is just the maximum absolute row sum of that matrix, or alternatively

    \[\|A\|_\infty = \max_i \sum_j |a_{ij}|\]

so computing the norm shouldn’t be all that difficult. For a full matrix A\in \mathbb{R}^{n\times n}, it takes O(n) time to compute the absolute sum of a single row, and n such computations to get the maximum row sum, for a total of O(n^2) time.
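In MATLAB-like code (a minimal sketch; A is any dense matrix), this is a one-liner:

infnorm = max(sum(abs(A), 2));   % absolute row sums, then the max over rows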

If A is also lower bidiagonal, then it would have the shape

    \[\mat{x \\ x & x \\ & x & x \\ && x & x}\]

that is, the diagonal and the first subdiagonal are filled, but everything else is zero. Being bidiagonal grants the matrix a lot of nice properties; for one thing, each row now has at most two nonzero entries, so it only takes O(n) time to find the infinity norm of a bidiagonal matrix. Similarly, as we shall see later, it also only takes O(n) time to solve a system of linear equations of the form

    \[Ax = b\]

if A is bidiagonal.
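As a sketch of the first claim, if we store the diagonal and subdiagonal of a bidiagonal A as vectors (a convention we’ll reuse below), the norm computation only touches the 2n - 1 stored entries:

l = diag(A);                               % main diagonal, length n
r = diag(A, -1);                           % first subdiagonal, length n-1
rowsums = abs(l);
rowsums(2:end) = rowsums(2:end) + abs(r);  % row k has absolute sum |r_k| + |l_k|
infnorm = max(rowsums);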

2. The Problem

Suppose you’re given a lower-bidiagonal matrix L (defined above). How do you find the infinity norm of its inverse

    \[\|L^{-1}\|_\infty\]

One naive solution is to first compute the inverse explicitly, and then take its infinity norm. As we’ve hinted above, it takes O(n) time to solve a single bidiagonal system, so the system

    \[LL^{-1} = \mat{1 \\ & 1 \\ && 1 \\ &&& 1}\]

can be solved as

    \[L^{-1} = \mat{x_1 & x_2 & x_3 & x_4}\]

where x_i is the solution to the system

    \[Lx_i = e_i ~~~~ e_i \mbox{ is the ith column of the identity matrix}\]

or, letting L\backslash b denote the solution of the system Lx = b

    \[L^{-1} = \mat{L\backslash \mat{1\\0\\0\\0} & L \backslash \mat{0\\1\\0\\0} & L \backslash \mat{0\\0\\1\\0} & L \backslash \mat{0\\0\\0\\1}}\]

and then we would look at the absolute row sums. The bottleneck here is the n solves against the columns of the identity at O(n) cost each, so the entire process takes O(n^2) time. This is actually pretty good, because the inverse takes O(n^2) time just to print out, so finding its maximum row sum in this amount of time is pretty reasonable. But the world is a cruel place, and somewhere out there, a billion-variable dynamic program awaits us that needs to shave off every second possible.
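Concretely, the naive method might look like the following sketch (MATLAB’s backslash performs the triangular solves; each solve costs O(n) when L is stored as a sparse matrix):

n = size(L, 1);
Linv = zeros(n);
for i = 1:n
    ei = zeros(n, 1);
    ei(i) = 1;            % the ith column of the identity
    Linv(:, i) = L \ ei;  % one triangular solve per column
end
infnorm = max(sum(abs(Linv), 2));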

Can we do better than O(n^2) time and space?

3. Solution Attempt

Exploring the Inverse

Let’s begin with a reasonable direction: figuring out what the inverse of a bidiagonal matrix looks like. We’ll mainly be looking at the n = 4 case, but everything generalizes to arbitrary n. Suppose that L looks like

    \[\mat{l_1 \\ r_2 & l_2 \\ & r_3 & l_3 \\ && r_4 & l_4}\]

Now, if we take the inverse of L, we’re not guaranteed that its bidiagonal structure will be preserved, but we know that it will at least be lower triangular. Let’s, for the sake of being punny, denote \Gamma = L^{-1}, then we know that

    \[\mat{l_1 \\ r_2 & l_2 \\ & r_3 & l_3 \\ && r_4 & l_4} \mat{\gamma_{11} \\\gamma_{21} & \gamma_{22} \\ \gamma_{31} &\gamma_{32} &\gamma_{33} \\ \gamma_{41}&\gamma_{42}&\gamma_{43}&\gamma_{44}} = \mat{1 \\ &1 \\ &&1 \\ &&& 1}\]

Let’s start writing out the individual equations generated by this system:

    \begin{align*} l_1\gamma_{11} &= 1 \\ r_2\gamma_{11} + l_2\gamma_{21} &= 0 \\ l_2\gamma_{22} &= 1 \\ r_3\gamma_{21} + l_3\gamma_{31} &= 0 \\ r_3\gamma_{22} + l_3\gamma_{32} &= 0 \\ l_3\gamma_{33} &= 1 \\ r_4\gamma_{31} + l_4\gamma_{41} &= 0 \\ r_4\gamma_{32} + l_4\gamma_{42} &= 0 \\ r_4\gamma_{33} + l_4\gamma_{43} &= 0 \\ l_4\gamma_{44} &= 1 \end{align*}

Inductively, we can show that

    \[\gamma_{kk} = \frac{1}{l_k}\]

furthermore, the zero equations can be inductively generalized as

    \[r_k\gamma_{k-1,j} + l_k\gamma_{k,j} = 0 ~~~~ \forall j < k.\]

Suppose that k = 4. We know that \gamma_{33} = \frac{1}{l_3}, and we also have the equation

    \[r_4\gamma_{33} + l_4 \gamma_{43} = 0\]

so that

    \[\gamma_{43} = -\frac{r_4}{l_4} \gamma_{33}\]

and so on; we can compute every gamma with the recurrence

    \[\gamma_{kj} = \begin{cases} 0 & j > k \\ l_k^{-1} & j = k \\ -\frac{r_k}{l_k} \gamma_{k-1,j} & j < k \end{cases}\]
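As a sketch, this is how we might fill in \Gamma column by column, walking down from the diagonal (l and r again hold the diagonal and subdiagonal, so r(k-1) is the r_k of the text):

n = length(l);
G = zeros(n);
for j = 1:n
    G(j, j) = 1 / l(j);                          % base case: gamma_jj = 1/l_j
    for k = j+1:n
        G(k, j) = -(r(k-1) / l(k)) * G(k-1, j);  % gamma_kj = -(r_k/l_k) gamma_{k-1,j}
    end
end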

Great, this gives us a nice algorithm to fill out the entries of the inverse in O(n^2) time, but that doesn’t really help us.

Now,

    \[\|L^{-1}\|_\infty = \|\Gamma\|_\infty = \max_k \sum_{j=1}^k |\gamma_{kj}|\]

so it doesn’t matter what the sign of \gamma_{kj} is. Now, the \gammas depend on the l_k and r_k entirely multiplicatively, and hence their magnitudes do not change if the signs of any of the l_k or r_k are flipped. As such, we get the following lemma:

(3.1.1): Invariance

Suppose that \hat L \in \mathbb{R}^{n\times n} is lower bidiagonal with |\hat L| = |L| entrywise; then

    \[\|\hat L^{-1}\|_\infty = \|L^{-1}\|_\infty\]

This basically says that we are allowed to arbitrarily flip the signs of the entries of L without changing the value of \|L^{-1}\|_\infty. It follows from the fact that the entries of the inverse, \Gamma, depend only multiplicatively on the entries of L, so flipping the signs of elements of L will only flip the signs of elements of \Gamma; the magnitude of each element of \Gamma remains the same.

In other words, \|L^{-1}\|_\infty is invariant under sign flips of the entries of L. This is going to come in very handy, and the above recurrence for \gamma_{kj} is constructive in that it tells us which elements of L to flip to reduce the problem to an easier one.
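A quick numerical sanity check of (3.1.1), with a randomly generated L and random sign flips:

n = 5;
L = diag(1 + rand(n, 1)) + diag(randn(n-1, 1), -1);  % random lower bidiagonal, diagonal away from zero
flips = 2 * (rand(n) > 0.5) - 1;                     % random +/-1 sign pattern
Lhat = abs(L) .* flips;                              % |Lhat| = |L| entrywise
norm(inv(L), inf) - norm(inv(Lhat), inf)             % should be (numerically) zero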

Row Summation as a Linear Operation

Now, consider the operation

    \[A\mat{1\\1\\1\\1} = \mat{\sum_j a_{1j} \\ \sum_j a_{2j} \\ \sum_j a_{3j} \\ \sum_j a_{4j}}\]

it seems that multiplying any matrix by the column of ones (e from here on) returns the column of row sums. Great! Shouldn’t it then be the case that we just need to find the maximum entry of L^{-1} e? Unfortunately, what we really want is the sum of the absolute values of each row, so if L^{-1} ever contains a negative number, this will break :(

Some of you may already see how to resolve this; but for the moment, let’s just check whether it’s actually possible to at least do this row-sum operation in linear time! The problem is computing L^{-1}e, or equivalently, finding an x such that

    \[\mat{l_1 \\ r_2 & l_2 \\ & r_3 & l_3 \\ && r_4 & l_4} \mat{x_1\\x_2\\x_3\\x_4} = \mat{1\\1\\1\\1}\]

here

    \begin{align*} x_1 &= l_1^{-1} \\ x_2 &= \frac{1 - r_2x_1}{l_2} \\ x_3 &= \frac{1 - r_3x_2}{l_3} \\ \vdots \\ x_k &= \frac{1 - r_kx_{k-1}}{l_k} \end{align*}

so it takes O(n) time to solve this system. If only we could just do this…
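In code, this forward substitution is just a short loop (a sketch; l and r are the diagonal and subdiagonal vectors as before):

n = length(l);
x = zeros(n, 1);
x(1) = 1 / l(1);
for k = 2:n
    x(k) = (1 - r(k-1) * x(k-1)) / l(k);   % r(k-1) is the r_k of the text
end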

A Final Observation

By now, you might be a little perplexed by lemma (3.1.1), which we took a lot of pain and agony to develop but never used. Now, the problem with maximizing over the entries of L^{-1}e is that the elements of L^{-1} aren’t guaranteed to be nonnegative. However, (3.1.1) and its construction seem to say that if we choose the signs of the elements of L correctly, then we can manipulate the signs of the elements of L^{-1} without disturbing its infinity norm. At this point, I had an idea: can I somehow force all of the entries of the inverse to be nonnegative by flipping the signs of the right elements? If we can, then from L we can generate an \hat L whose inverse \hat L^{-1} contains only nonnegative numbers, so

    \[\|L^{-1}\|_\infty = \|\hat{L}^{-1}\|_\infty = \max_k \left(\hat{L}^{-1} e\right)_k\]

which can be done in O(n) time! This is quite exciting, but how do we do it?

Recall that in the proof of (3.1.1), we used the recurrence for the elements of the inverse \gamma_{kj}:

    \[\gamma_{kj} = \begin{cases} 0 & j > k \\ l_k^{-1} & j = k \\ -\frac{r_k}{l_k} \gamma_{k-1,j} & j < k \end{cases}\]

Now, the diagonal case tells us immediately that \gamma_{kk} = l_k^{-1} \ge 0 requires l_k > 0, so the diagonal of \hat L must be positive. By the induction hypothesis, let’s assume that \gamma_{k-1,j} \ge 0 for every j < k; then for \gamma_{kj} to also be nonnegative, it had better be the case that

    \[\gamma_{kj} = -\frac{r_k}{\underbrace{l_k}_{> 0}} \overbrace{\gamma_{k-1,j}}^{\ge 0} \ge 0\]

so

    \[-\frac{?}{+}\,(+) = + \implies ~?~ \mbox{ must be } -\]

so r_k must be negative (more precisely, nonpositive) in \hat L, which gives us a construction of \hat L.

4. Solution!

Taking the above into consideration, the following algorithm computes the infinity norm of the inverse in O(n) time and O(n) space.

function nrm = invnorm_bidiag(L)
% Computes norm(inv(L), inf) for a lower bidiagonal L in O(n) time.
n = size(L, 1);
l = abs(diag(L));        % diagonal of L-hat: forced positive
r = -abs(diag(L, -1));   % subdiagonal of L-hat: forced negative
x = zeros(n, 1);
x(1) = 1 / l(1);         % forward substitution on (L-hat) x = e
for k = 2:n
    x(k) = (1 - r(k-1) * x(k-1)) / l(k);
end
nrm = max(x);            % inv(L-hat) is entrywise nonnegative, so x holds the absolute row sums
end
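A quick sanity check of this routine against the naive computation:

n = 6;
L = diag(1 + rand(n, 1)) + diag(randn(n-1, 1), -1);  % random lower bidiagonal
invnorm_bidiag(L) - norm(inv(L), inf)                % should be (numerically) zero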
Posted By Lee
