Notebook 8 – Math 2121, Fall 2020

An $n \times n$ matrix for $n = 1$ is just a number $A = [\begin{matrix} a \end{matrix}]$ .

Such a matrix is invertible if and only if $a \neq 0$ .

It is clear that we can "move around" in the subspace of invertible $1 \times 1$ matrices within $R^{1}$ , in the sense that we can perturb the entries of $A$ slightly and still have an invertible matrix: specifically, $[\begin{matrix} a + δ \end{matrix}]$ is still invertible as long as $| δ |$ is small.

In particular, this is true whenever $| δ | < | a |$ .

However, it's also clear that there are the set of invertible $1 \times 1$ matrices has two connected components: the matrices $A = [\begin{matrix} a \end{matrix}]$ with $a < 0$ and those with $a > 0$ .

If we start with $a < 0$ , then there is no way to gradually perturb $a$ to get a positive value without passing through $a = 0$ , at which point the matrix would be non-invertible. So there is no path that stays completely within the set of invertible matrices connecting, say $A = [\begin{matrix} - 1 \end{matrix}]$ and $A = [\begin{matrix} 1 \end{matrix}]$ .

18.2 μs

xxxxxxxxxx
 
using LinearAlgebra

420 μs

Let's investigate how this generalizes to $2 \times 2$ matrices.

A $2 \times 2$ matrix has four entries. Geometrically, there is no difference between the set of all $2 \times 2$ matrices and the set of vectors $R^{4}$ . We have just split our four numbers into two columns rather than kept them in one.

Suppose $A = [\begin{array}{cc} a & b \\ c & d \end{array}]$ .

This matrix is invertible if and only if the vectors $[\begin{matrix} a \\ c \end{matrix}]$ and $[\begin{matrix} b \\ d \end{matrix}]$ are linearly independent, that is, not scalar multiples of each other. Assume this is the case.

I claim that it is still true that we can "move around" within the subset of invertible $2 \times 2$ matrices, in the sense that the perturbed matrix

$[\begin{array}{cc} a + δ_{1} & b + δ_{2} \\ c + δ_{3} & d + δ_{4} \end{array}]$

is still invertible for any choice of numbers $δ_{i}$ that have sufficiently small absolute value.

Let's justify this with a random example.

9.4 μs

2×2 Array{Float64,2}:
 0.98  -1.89
 1.99  -1.64

xxxxxxxxxx
 
A = rand(-2:0.01:2, 2, 2)

156 ms

ε = 0.5

12.4 s

We need a way to visualize the matrix $A$ .

A simple method is to draw the vectors $[\begin{matrix} a \\ c \end{matrix}]$ and $[\begin{matrix} b \\ d \end{matrix}]$ that are the columns of $A$ :

4.6 μs

121 ms

In we increase the parameter $ε$ above, then two colored regions appear in this graph.

The blue circle surrounds the set of vectors $[\begin{array}{cc} a + δ_{1} \\ c + δ_{3} \end{array}]$ with $\sqrt{δ_{1}^{2} + δ_{3}^{2}} < ε$ .

The red circle surrounds the set of vectors $[\begin{array}{cc} b + δ_{2} \\ d + δ_{4} \end{array}]$ with $\sqrt{δ_{2}^{2} + δ_{4}^{2}} < ε$ .

Choose $ϵ$ to be small enough that the blue and red regions do not overlap. Suppose we have another $2 \times 2$ matrix $B$ that we draw as a pair of vectors in $R^{2}$ . If the endpoints of these vectors are contained in the blue and red circles, then $B$ is guaranteed to invertible. (Why?)

8.3 μs

Given $A$ and $ε$ , how would you figure out if the blue and red regions overlap? It's possible to give an exact mathematical formula for this, using only basic algebra and planar geometry. But this is a little complicated, as you can see if you examine the code below, which implements an overlaps method to compute the answer programatically.

8.2 μs

overlaps (generic function with 1 method)

xxxxxxxxxx
 
begin
    function tangent_points(u1, u2, eps)
        if u1 != 0
            k = -eps^2 + u1^2 + u2^2
            @assert k >= 0
            
            a, b, c = u1^2 + u2^2, -2 * u2 * k, k^2 - k * u1^2
            
            discr = b^2 - 4 * a * c
            if abs(discr) < 10e-8
                discr = 0.0
            end
            
            x2 = (-b + sqrt(discr)) / 2 / a
            x1 = -u2 / u1 * x2 + k / u1
            
            y2 = (-b - sqrt(discr)) / 2 / a
            y1 = -u2 / u1 * y2 + k / u1
            
            return x1, x2, y1, y2
        else
            x1, x2, y1, y2= tangent_points(u2, u1, eps)
            return x2, x1, y2, y1
        end
    end
    
    function angle(u1, u2)
        if u1 == 0
            return u2 > 0 ? pi/2 : 3 * pi/2
        end
        if u2 == 0
            return u1 > 0 ? 0 : pi
        end
        
        a = atan(abs(u2) / abs(u1))
        if u1 > 0 && u2 > 0
            return a
        elseif u1 > 0 && u1 > 0
            return 2 * pi - a
        elseif u1 < 0 && u2 > 0
            return pi - a
        else
            return pi + a
        end
    end
    
    function in_same_relative_position(x, y, test, target)
        a = angle(x[1], x[2])
        b = angle(y[1], y[2])
        if a > b
            a, b = b, a
        end
        
        s = angle(test[1], test[2])
        if b < s
            a, b = b, a + 2 * pi
        elseif s < a 
            a, b, s = b, a + 2 * pi, s + 2 * pi
        end
        @assert a <= s <= b
        
        t = angle(target[1], target[2])
        return a < t < b || a < t + 2 * pi < b
    end
        
    function overlaps(eps, A)
        for i = [1, 2]
        for s = [-1,1]
            u1, u2 = A[1, i], A[2, i]
            v1, v2 = s * A[1, 3 - i], s * A[2, 3- i]
​
            if -eps^2 + u1^2 + u2^2 < 0 || -eps^2 + v1^2 + v2^2 < 0
                return true
            end
                
            x1, x2, y1, y2 = tangent_points(u1, u2, eps)
            p1, p2, q1, q2 = tangent_points(v1, v2, eps)
​
            if in_same_relative_position([x1; x2], [y1; y2], [u1; u2], [p1; p2])
                return true
            end
​
            if in_same_relative_position([x1; x2], [y1; y2], [u1; u2], [q1; q2])
                return true
            end
        end
        end
        return false
    end
end

333 μs

true

xxxxxxxxxx
 
overlaps(epsilon, A)

13 μs

As long as the red and blue regions do not overlap, then $[\begin{array}{cc} a + δ_{1} & b + δ_{2} \\ c + δ_{3} & d + δ_{4} \end{array}]$ is invertible for every choice $δ_{1}, δ_{2}, δ_{3}, δ_{4}$ with $\sqrt{δ_{1}^{2} + δ_{3}^{2}} < ε$ and $\sqrt{δ_{2}^{2} + δ_{4}^{2}} < ε$ .

These conditions involving square roots are satisfied, for example, as long as we have $- \frac{1}{\sqrt{2}} ε < δ_{i} < \frac{1}{\sqrt{2}} ε$ for each $i = 1, 2, 3, 4$ .

This shows that we can 'move' from $A$ to any matrix $B$ , whose columns correspond to points in the red and blue squares, while always staying inside the set of $2 \times 2$ matrices.

The following method computes (approximately) the largest value of $ε$ that ensures that the red and blue regions above do not overlap. I'll call this the radius of $A$ .

13.2 μs

radius (generic function with 2 methods)

xxxxxxxxxx
 
function radius(A, reduce_scale=1.0)
    ans = 1.0 # start with a guess of 1.1
    factor = 1.1 # factor to adjust by at each iteration
    
    iter, limit = 0, 100 # parameters to bound number of iterations
    while true
        over = overlaps(ans, A)
        overup = overlaps(factor * ans, A)
        
        # if current answer works and can't do better, break
        if ! over && (overup || iter == limit)
            break
            
        # if current answer works but can do better, increase value
        elseif ! over && ! overup
            ans *= factor
            
        # if current answer doesn't work, decrease value
        else
            ans /= factor
        end
        
        # quite if already spent too many iterations searching
        iter += 1
        if iter > limit
            return 0.0
        end
    end
    
    # perfect extra size reduction, just in case.
    # these formulas will make more sense after we learn about projections.
    u, v = A[1:2,1], A[1:2,2]
    n1 = norm(v - dot(u, v) / dot(u, u) * u)
    n2 = norm(u - dot(u, v) / dot(v, v) * v)
    ans = minimum([n1 n2 ans])
    
    # optional size reduction
    ans = reduce_scale * ans
    
    # reduce answer to 0.0 if it's very small, to avoid roundoff errors
    ans = ans < 0.001 ? 0.0 : ans 
    
    return ans
end

67 μs

0.4240976183724846

xxxxxxxxxx
 
radius(A)

112 ms

Let's create another random invertible 2-by-2 matrix $B$ .

7.2 μs

Array{Float64,2}1

2×2 Array{Float64,2}:
 0.98  -1.89
 1.99  -1.64

2×2 Array{Float64,2}:
 -1.2    0.16
  1.53  -0.29

xxxxxxxxxx
 
begin
    B = rand(-2:0.01:2, 2, 2)
    [A, B]
end

16.1 μs

18 ms

Is it possible to traverse a path from $A$ to $B$ that stays completely inside the set of invertible 2-by-2 matrices? There is certainly a path from $A$ to $B$ inside the set of all 2-by-2 matrices.

The simplest such path is the line from $A$ to $B$ : this is the sequence of matrices

$A + t (B - A)$

as $t \in R$ varies over the numbers in the interval $0 \leq t \leq 1$ .

13.2 μs

For our matrices, give equally spaces points on this line are given by

7.3 μs

Array{Float64,2}1

2×2 Array{Float64,2}:
 0.98  -1.89
 1.99  -1.64

2×2 Array{Float64,2}:
 0.435  -1.3775
 1.875  -1.3025

2×2 Array{Float64,2}:
 -0.11  -0.865
  1.76  -0.965

2×2 Array{Float64,2}:
 -0.655  -0.3525
  1.645  -0.6275

2×2 Array{Float64,2}:
 -1.2    0.16
  1.53  -0.29

xxxxxxxxxx
 
[A + t * (B-A) for t=0:0.25:1]

48.6 ms

Some questions we could ask: what does this line look like in terms of our pictures of 2-by-2 matrices as pairs of vectors in $R^{2}$ ? And does this line stay in the set of invertible 2-by-2 matrices?

10.4 μs

step! (generic function with 1 method)

xxxxxxxxxx
 
begin
    # This struct represents the line from A to B described above.
    # The value of A is our current position on this line.
    Base.@kwdef mutable struct InvertibleMatrixPath
        A::Array{Float64,2} = A
        B::Array{Float64,2} = B
        
        prev_A::Array{Array{Float64,2},1} = []
        prev_r::Array{Float64,1} = []
    end
    
    # Every time the `step` method is called, we move forward on the line to B.
    # The catch is that we only move the distance that is guaranteed to keep
    # us inside the set of invertible matrices. This means that the distance
    # we travel could become arbitrary small; we could get stuck on the path to B.
    function step!(ch::InvertibleMatrixPath)
        r = radius(ch.A, 0.5)
        
        push!(ch.prev_r, r)
        push!(ch.prev_A, ch.A)
        
        diff = ch.B - ch.A
        r = norm(diff) < r ? 1.0 : r / norm(diff)
        ch.A += diff * r
    end
end

1.6 ms

Using the above struct, we can implement a method that tells us whether the line from A to B does stay inside the set of invertible 2-by-2 matrices.

8.1 μs

line_stays_invertible (generic function with 1 method)

xxxxxxxxxx
 
function line_stays_invertible(A, B)
    line = InvertibleMatrixPath(A, B, [], [])
    
    while true
        step!(line)
        if norm(line.A - B) < 10e-8 # we have reached B successfully
            return true
        end
        if line.prev_r[end] == 0 # we have gotten stuck on our path
            return false
        end
    end
end 

48.4 μs

The following method returns how many steps we take before we know if our line reaches $B$ .

5.8 μs

path_length (generic function with 1 method)

xxxxxxxxxx
 
function path_length(A, B; step_fn=step!)
    path = InvertibleMatrixPath(A, B, [], [])
    step_count = 1
    while true
        step_fn(path)
        if path.prev_r[end] == 0 || norm(path.A - B) < 10e-8
            break
        end
        step_count += 1
    end
    
    len = 0.0
    for i = 1:length(path.prev_A) - 1
        len += norm(path.prev_A[i+1] - path.prev_A[i])
    end
    return len, step_count
end

106 μs

For the matrices $A$ and $B$ :

9.1 μs

Array{Float64,2}1

2×2 Array{Float64,2}:
 0.98  -1.89
 1.99  -1.64

2×2 Array{Float64,2}:
 -1.2    0.16
  1.53  -0.29

xxxxxxxxxx
 
[A, B]

4.8 μs

true

xxxxxxxxxx
 
line_stays_invertible(A, B)

1.2 ms

3.31417

xxxxxxxxxx
 
path_length(A, B)

1.5 ms

The animation below shows the path that results from trying to follow the line from $A$ to $B$ while staying inside the set of invertible matrices.

7.1 μs

animation (generic function with 1 method)

xxxxxxxxxx
 
function animation(A, B; step_fn=step!)
    bound = 1.5 * maximum([maximum(abs.(A)) maximum(abs.(B))])
    path = InvertibleMatrixPath(A, B, [], [])   
    steps = 1.25 * path_length(A, B; step_fn=step_fn)[2]
    anim = @animate for j=1:steps
        e = radius(path.A, 0.5)
        plt = scatter(
            [0], [0], xlim=(-bound,bound), ylim=(-bound,bound),
            legend = false, label = "origin",
            aspect_ratio=:equal,
            title = "Traversing a path from A to B in space of 2-by-2 matrices",
            ylabel = line_stays_invertible(A, B) ? "outcome: line connects" : "outcome: line is obstructed",
            xlabel = string("radius = ", e)
        )
​
        for i=1:length(path.prev_r)
            plot!(circle(path.prev_A[i][1,1], path.prev_A[i][2,1], path.prev_r[i]), opacity=.15, color=[:blue])
            plot!(circle(path.prev_A[i][1,2], path.prev_A[i][2,2], path.prev_r[i]), opacity=.15, color=[:red])
        end
​
        draw(path.A[1,1], path.A[2,1], e, [:blue])
        draw(path.A[1,2], path.A[2,2], e, [:red]) 
​
        quiver!(quiver = ([B[1,1]],[B[2,1]]), [0], [0], color = [:grey],)
        quiver!(quiver = ([B[1,2]],[B[2,2]]), [0], [0], color = [:grey],)
​
        step_fn(path) 
    end every 1
    gif(anim, "anim_fps15.gif", fps = 5)
end

144 μs

xxxxxxxxxx
 
animation(A, B)

1.9 s

Sometimes the line from $A$ to $B$ stays in the set of $2 \times 2$ invertible matrices, sometimes it doesn't.

A region of space that contains the line segment connecting any two of its points is called convex. The solid sphere is convex. The solid torus (donut shape) is not convex. All regular polygons are convex regions of $R^{2}$ . Star shapes are not convex.

Anyways, our experiments show that the set of invertible 2-by-2 matrices is not convex. But we can say something more precise.

9.4 μs

367 ns

We saw in the lecture notes that $A = [\begin{array}{cc} a & b \\ c & d \end{array}]$ is invertible if and only if $a d - b c \neq 0$ .

In a week or so, we'll that the number $a d - b c$ is called the determinant of $A$ , and is written $det A$ .

When we perturb $A$ slightly, the value of $det A$ changes by a small amount. So if $det A$ starts out positive, and we perform a sequence of perturbations that remain in the set of invertible matrices, ending up at some invertible matrix $B$ , then $det B$ must also be positive.

In particular, traveling along our invertible matrix path from $A$ to $B$ cannot change the sign of the determinant. We can only change the sign by passing through a non-invertible matrix.

12.9 μs

It turns out there are exactly two connected components of the set of 2-by-2 invertible matrices: the subset of matrices with negative determinant and the subset of matrices with positive determinant.

There are exactly the same number of such matrices.

If $A = [\begin{array}{cc} a & b \\ c & d \end{array}]$ has positive determinant then $[\begin{array}{cc} b & a \\ d & c \end{array}]$ has negative determinant.

(Because $det A = a d - b c$ .)

Any two 2-by-2 matrices with positive determinant are connected by a path the stays completely inside the set of invertible matrices.

However, these paths cannot always be lines: the subset of 2-by-2 invertible matrices with positive (or negative) determinant is also not convex. We can check this:

11 μs

count_reachable_pairs (generic function with 1 method)

xxxxxxxxxx
 
function count_reachable_pairs(trials)
    total = 0
    i = 0
    while i < trials
        A = rand(-1:0.01:1, 2, 2)
        B = rand(-1:0.01:1, 2, 2)
        
        # restrict search to matrices A and B both with positive determinant
        if A[1,1] * A[2,2] - A[1,2] * A[2,1] < 0
            continue
        end
        if B[1,1] * B[2,2] - B[1,2] * B[2,1] < 0
            continue
        end
        
        if line_stays_invertible(A, B)
            total += 1
        end
        
        i += 1
    end
    return total
end

51.5 μs

xxxxxxxxxx
 
# choose 100 pairs random 2-by-2 matrices (A, B) with positive determint
# for how many pairs does the line from A to B stay in the set of invertible matrices?
count_reachable_pairs(100)

200 ms

find_unreachable (generic function with 1 method)

xxxxxxxxxx
 
begin
    function find_reachable(range)
        while true  
            A = rand(range, 2, 2)
            B = rand(range, 2, 2)
​
            if A[1,1] * A[2,2] - A[1,2] * A[2,1] < 0
                continue
            end
            if B[1,1] * B[2,2] - B[1,2] * B[2,1] < 0
                continue
            end
​
            if line_stays_invertible(A, B)
                return A, B
            end
        end
    end
    
    function find_unreachable(range)
        while true  
            A = rand(range, 2, 2)
            B = rand(range, 2, 2)
​
            if A[1,1] * A[2,2] - A[1,2] * A[2,1] < 0
                continue
            end
            if B[1,1] * B[2,2] - B[1,2] * B[2,1] < 0
                continue
            end
​
            if ! line_stays_invertible(A, B)
                return A, B
            end
        end
    end
end

65.3 μs

Here are two matrices with positive determinant:

7.4 μs

2×2 Array{Float64,2}:
 7.91  3.68
 6.55  9.44

2×2 Array{Float64,2}:
 -4.54   4.07
  6.86  -9.48

xxxxxxxxxx
 
source, target = find_unreachable(-10:0.01:10)

31.9 ms

1×2 Array{Float64,2}:
 50.566399999999994  15.118999999999998

xxxxxxxxxx
 
[det(source) det(target)]

285 ms

However, the line between them does not stay in the set of invertible matrices:

11.6 μs

false

xxxxxxxxxx
 
line_stays_invertible(source, target)

4 ms

5.30719

xxxxxxxxxx
 
path_length(source, target)

4.3 ms

xxxxxxxxxx
 
animation(source, target)

3.6 s

By using a different step function we can nevertheless construct a path from source to target that stays inside the set of invertible matrices.

7.8 μs

nonlinear_step! (generic function with 1 method)

xxxxxxxxxx
 
function nonlinear_step!(ch::InvertibleMatrixPath)
    r = radius(ch.A, 0.5)
    push!(ch.prev_r, r)
    push!(ch.prev_A, ch.A)
    
    a1 = angle(ch.A[1,1], ch.A[2,1])
    a2 = angle(ch.A[1,2], ch.A[2,2])
    b1 = angle(ch.B[1,1], ch.B[2,1])
    b2 = angle(ch.B[1,2], ch.B[2,2])
    
    b1 = b1 < a1 ? b1 + 2 * pi : b1
    b2 = b2 < a2 ? b2 + 2 * pi : b2
    
    d1 = minimum([b1 - a1 r]) / norm(ch.A[1:2,1])
    d2 = minimum([b2 - a2 r]) / norm(ch.A[1:2,2])
    
    if abs(d1) > 10e-2 && abs(d2) > 10e-2
        d1, d2 = minimum([d1 d2]), minimum([d1 d2])
    end
    
    if abs(d1) + abs(d2) > 10e-4
        col1 = [cos(d1) -sin(d1); sin(d1) cos(d1)] * ch.A[1:2,1]
        col2 = [cos(d2) -sin(d2); sin(d2) cos(d2)] * ch.A[1:2,2]
        ch.A = hcat(col1, col2)
    else
        diff = ch.B - ch.A
        r = norm(diff) < r ? 1.0 : r / norm(diff)
        ch.A += diff * r
    end
end

131 μs

45.1768

xxxxxxxxxx
 
p, q = path_length(source, target; step_fn=nonlinear_step!)

248 ms

xxxxxxxxxx
 
if q <= 100
    animation(source, target; step_fn=nonlinear_step!)
end

15.3 s

The shortest path from $A$ to $B$ that stays inside the set of invertible matrices is called a geodesic.

If the line from $A$ to $B$ always stays invertible, then it is the unique geodesic.

When this line does not stay invertible, how could we construct a geodesic from $A$ to $B$ ? How could we derive a formula for its length in terms of $A$ and $B$ ?

20 μs