Notebook 7 – Math 2121, Fall 2020
Today's notebook will investigate (linear) vector valued functions, the operations we can perform on them, and their standard matrices. Then we'll also talk about how to use linear algebra to analyze a random walk.
Running this notebook (optional)
If you have Pluto up and running, you can access the notebook we are currently viewing by entering [this link](http://www.math.ust.hk/~emarberg/teaching/2020/Math2121/julia/07_Math2121_Fall2020.jl) (right click -> Copy Link) in the *Open from file* menu in Pluto.
Vector valued functions
The following code creates a struct to represent vector valued functions.
struct VectorFunction
    formula
    domain::Int64
    codomain::Int64
end
We can evaluate a `VectorFunction` using this method:
function evaluate(f::VectorFunction, x)
    return f.formula(x)
end
# function R^1 -> R^1
some_function = VectorFunction(x -> x^2, 1, 1)
evaluate(some_function, 100)  # returns 10000
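Any m-by-n matrix A determines such a function via x -> A*x. Here is a minimal standalone sketch of that idea (the struct and `evaluate` are restated so the snippet runs on its own; the helper `matrix_function` is hypothetical and not part of the notebook):

```julia
struct VectorFunction
    formula
    domain::Int64
    codomain::Int64
end

evaluate(f::VectorFunction, x) = f.formula(x)

# hypothetical helper: wrap a matrix A as the linear function x -> A * x
matrix_function(A) = VectorFunction(x -> A * x, size(A, 2), size(A, 1))

A = [1 2; 3 4; 5 6]        # 3×2 matrix, so a function R^2 -> R^3
f = matrix_function(A)
evaluate(f, [1; 1])        # [3; 7; 11]
```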
Operations on vector valued functions
As usual, we can add, subtract, rescale, and compose vector valued functions.
function add_vector_functions(f::VectorFunction, g::VectorFunction)
    @assert f.domain == g.domain
    @assert f.codomain == g.codomain
    return VectorFunction(x -> evaluate(f, x) + evaluate(g, x), f.domain, f.codomain)
end
function subtract_vector_functions(f::VectorFunction, g::VectorFunction)
    @assert f.domain == g.domain
    @assert f.codomain == g.codomain
    return VectorFunction(x -> evaluate(f, x) - evaluate(g, x), f.domain, f.codomain)
end
function scale_vector_functions(c, f::VectorFunction)
    return VectorFunction(x -> c * evaluate(f, x), f.domain, f.codomain)
end
function compose_vector_functions(f::VectorFunction, g::VectorFunction)
    @assert f.domain == g.codomain
    return VectorFunction(x -> evaluate(f, evaluate(g, x)), g.domain, f.codomain)
end
begin
    # functions R^2 -> R^2
    fn_1 = VectorFunction(x -> -x, 2, 2)
    fn_2 = VectorFunction(x -> x + [1; 1], 2, 2)
    # fn_1 o fn_2
    fn_3 = compose_vector_functions(fn_1, fn_2)
end
evaluate(fn_1, [4; 4])  # [-4; -4]
evaluate(fn_2, [4; 4])  # [5; 5]
evaluate(fn_3, [0; 0])  # [-1; -1], so fn_3 is not linear
Below are some methods to test linearity and equality for our `VectorFunctions`.
using LinearAlgebra
function is_linear(f::VectorFunction, tolerance=10e-12, trials=1000)
    # this function does not prove that f is linear, but if it returns true
    # then f is linear with high probability
    for i=1:trials
        u = rand(-10:10, f.domain, 1)
        v = rand(-10:10, f.domain, 1)
        c = rand(Float64) - 0.5
        # check empirically that f(u + v) == f(u) + f(v)
        w = evaluate(f, u) + evaluate(f, v) - evaluate(f, u + v)
        if norm(w) > tolerance
            return false
        end
        # check empirically that f(cu) == c f(u)
        w = c * evaluate(f, u) - evaluate(f, c * u)
        if norm(w) > tolerance
            return false
        end
    end
    return true
end
function is_zero(f::VectorFunction, tolerance=10e-12, trials=1000)
    # this function does not prove that f is zero, but if it returns true
    # then f is the zero function with high probability
    for i=1:trials
        v = rand(-10:10, f.domain, 1)
        if norm(evaluate(f, v)) > tolerance
            return false
        end
    end
    return true
end
function are_equal(f::VectorFunction, g::VectorFunction, tolerance=10e-12, trials=1000)
    return is_zero(subtract_vector_functions(f, g), tolerance, trials)
end
Let's test these methods. The following creates the vector valued function that replaces the first entry of its input with the product of its first two entries, leaving the other entries unchanged. This is nonlinear.
function nonlinear_function(;domain::Int64, codomain::Int64)
    function formula(v)
        w = copy(v)
        w[1,:] = w[1,:] .* w[2,:]
        return w
    end
    return VectorFunction(formula, domain, codomain)
end
nonlinear = nonlinear_function(domain=3, codomain=3)
evaluate(nonlinear, [2; 10; 3])  # [20; 10; 3]
is_linear(nonlinear)  # false
Row operations are linear transformations
We can implement our three types of elementary row operations as linear transformations.
function swap_rows(i, j; domain)
    function func(v)
        w = copy(v)
        w[i,:], w[j,:] = w[j,:], w[i,:]
        return w
    end
    return VectorFunction(func, domain, domain)
end
function add_rows(src, dest; scalar=1, domain)
    function func(v)
        w = copy(v)
        w[dest,:] = scalar * w[src,:] + w[dest,:]
        return w
    end
    return VectorFunction(func, domain, domain)
end
function scale_rows(src, scalar; domain)
    @assert scalar != 0
    function func(v)
        w = float(v)
        w[src,:] = w[src,:] * scalar
        return w
    end
    return VectorFunction(func, domain, domain)
end
begin
    # functions R^4 -> R^4 (that's what domain=4 specifies)
    swap = swap_rows(3, 2, domain=4)              # swaps rows 3 and 2
    addop = add_rows(4, 2, scalar=4, domain=4)    # adds 4 times row 4 to row 2
    scaleop = scale_rows(3, 2.4, domain=4)        # multiplies row 3 by 2.4
end
vec = [1; 10; 100; 1000]
evaluate(swap, vec)  # [1; 100; 10; 1000]
evaluate(addop, vec)  # [1; 4010; 100; 1000]
evaluate(scaleop, vec)  # [1.0; 10.0; 240.0; 1000.0]
is_linear(swap)  # true
is_linear(addop)  # true
is_linear(scaleop)  # true
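Each of these row operations corresponds to left multiplication by a fixed matrix (its standard matrix, computed below). A standalone sketch of this correspondence with plain matrices, not part of the notebook:

```julia
# elementary matrix for "add 5 times row 1 to row 3" in R^3
E = [1.0 0.0 0.0;
     0.0 1.0 0.0;
     5.0 0.0 1.0]

A = [1.0 2.0;
     3.0 4.0;
     5.0 6.0]

# multiplying on the left applies the row operation to A:
# row 3 of E * A is 5 * row 1 + row 3 = [10.0 16.0]
E * A
```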
For any linear vector valued function, there is a unique standard matrix, which we can compute with the following methods.
using SparseArrays
function elementary_vector(n, i)
    ans = sparse(zeros(n, 1))
    ans[i, 1] = 1
    return ans
end
Matrix(elementary_vector(4, 4))  # 4×1 matrix with a single 1.0 in the last entry
function standard_matrix(f::VectorFunction)
    elem = i -> Matrix(elementary_vector(f.domain, i))
    return hcat([evaluate(f, elem(i)) for i=1:f.domain]...)
end
# should give the 5-by-5 identity matrix
standard_matrix(VectorFunction(x -> x, 5, 5))
# permutation matrix that swaps rows 1 and 4
SM1 = standard_matrix(swap_rows(1, 4, domain=4))
# identity matrix plus an extra 444.0 in position (4, 2)
SM2 = standard_matrix(add_rows(2, 4, scalar=444, domain=4))
# diagonal matrix with 2.4 in position (3, 3)
SM3 = standard_matrix(scaleop)
standard_matrix(compose_vector_functions(swap_rows(1, 4, domain=4), add_rows(2, 4, scalar=444, domain=4))) == SM1 * SM2  # true
standard_matrix(add_vector_functions(swap_rows(1, 4, domain=4), add_rows(2, 4, scalar=444, domain=4))) == SM1 + SM2  # true
standard_matrix(subtract_vector_functions(swap_rows(1, 4, domain=4), add_rows(2, 4, scalar=444, domain=4))) == SM1 - SM2  # true
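The identities above reflect how a standard matrix is built: its columns are the values of the function on the standard basis vectors. A standalone sketch of that construction with a plain function (no `VectorFunction` machinery):

```julia
f(x) = [2 0; 1 3] * x                      # a linear function R^2 -> R^2

# i-th standard basis vector of R^n
e(n, i) = (v = zeros(n); v[i] = 1.0; v)

# column i of the standard matrix is f(e_i)
A = hcat([f(e(2, i)) for i in 1:2]...)

A == [2.0 0.0; 1.0 3.0]                    # true: we recover the matrix
```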
RREF(A) is a linear function
Because row operations are linear transformations, we can think of the row reduction algorithm as returning, instead of a matrix in reduced echelon form, the linear transformation formed by composing all the row operations used to change our starting matrix to echelon form.
The slightly modified code below reimplements our RREF method to return this vector valued function.
Some helper methods:
begin
    # Returns true if row has all zeros.
    function is_zero_row(A, row, tolerance=10e-16)
        (m, n) = size(A)
        return all([abs(A[row,col]) < tolerance for col=1:n])
    end

    # Returns first index of nonzero entry in row, or 0 if none exists.
    function find_leading(A, row=1, tolerance=10e-16)
        (m, n) = size(A)
        for col=1:n
            if abs(A[row, col]) > tolerance
                return col
            end
        end
        return 0
    end

    function find_nonzeros_in_column(A, row_start, col, tol=10e-12)
        (m, n) = size(A)
        return [i for i=row_start:m if abs(A[i,col]) > tol]
    end

    function columns(A)
        (m, n) = size(A)
        return n
    end

    function rowop_replace(A, source_row, target_row, scalar_factor, op)
        @assert source_row != target_row
        op = compose_vector_functions(
            add_rows(source_row, target_row, scalar=scalar_factor, domain=op.domain), op
        )
        for j=1:columns(A)
            A[target_row,j] += A[source_row,j] * scalar_factor
        end
        return A, op
    end

    function rowop_scale(A, target_row, scalar_factor, op)
        @assert scalar_factor != 0
        op = compose_vector_functions(
            scale_rows(target_row, scalar_factor, domain=op.domain), op
        )
        for j=1:columns(A)
            A[target_row,j] *= scalar_factor
        end
        A, op
    end

    function rowop_swap(A, s, t, op)
        op = compose_vector_functions(
            swap_rows(s, t, domain=op.domain), op
        )
        for j=1:columns(A)
            A[t,j], A[s,j] = A[s,j], A[t,j]
        end
        A, op
    end
end
Adjusted RREF algorithm:
begin
    function RREF_with_operator(A, tol=10e-12)
        A = float(copy(A))
        (m, n) = size(A)
        op = VectorFunction(x -> x, m, m)
        row = 1
        for col=1:n
            nonzeros = find_nonzeros_in_column(A, row, col, tol)
            if length(nonzeros) == 0
                continue
            end
            i = nonzeros[1]
            if abs(A[i, col] - 1) < tol
                A[i, col] = 1
            else
                A, op = rowop_scale(A, i, 1 / A[i, col], op)
            end
            for j=nonzeros[2:end]
                A, op = rowop_replace(A, i, j, -A[j, col], op)
            end
            if i != row
                A, op = rowop_swap(A, row, i, op)
            end
            row += 1
        end
        for row=m:-1:1
            l = find_leading(A, row, tol)
            if l > 0
                for k=1:row - 1
                    if abs(A[k,l]) > tol
                        A, op = rowop_replace(A, row, k, -A[k,l], op)
                    end
                end
            end
        end
        return A, op
    end

    function RREF_operator(A, tol=10e-12)
        A, op = RREF_with_operator(A, tol)
        return op
    end

    function RREF(A, tol=10e-12)
        A, op = RREF_with_operator(A, tol)
        return A
    end
end
M = rand(-1:2, 5, 7)  # a random 5×7 matrix with entries in -1:2
RREF(M)
rref_op_for_M = RREF_operator(M)
is_linear(rref_op_for_M)  # true
evaluate(rref_op_for_M, M) == RREF(M)  # true
standard_matrix(rref_op_for_M)
Random walks with matrix multiplication
The last part of today's notebook tries to motivate the utility of matrix multiplication and some of the things we'll do later in the course.
The idea is to analyze a simple random walk.
Suppose you start with some amount of money. At each step you gain one dollar with probability p(x), where x is your current amount, and otherwise lose one dollar; the walk stops if you reach zero or a fixed maximum amount.
You could think of this as a gambling process, or some kind of competition. Maybe you're playing a sport and your chance of winning the next match depends on your current streak of wins or losses.
We can graph your current state (the amount of money you have) as a function of time.
starting_amount = 20
maximum_amount = 2 * starting_amount - 1
We need to assign the upstep probabilities p(x). The simplest choice is just to set p(x) = 1/2 for every state x.
begin
    mid = starting_amount
    n = maximum_amount

    function power(x, exp)
        return x < 0 ? -(abs(x)^exp) : abs(x)^exp
    end

    function compute_prob(i)
        x = (i - mid) / mid
        delta = 0.0 # -0.5 * power(x, 2)
        noise = 0.0 # 0.5 * (rand(Float64) - 0.5)
        p = 0.5 + delta + noise
        return maximum([0.00, minimum([1.0, p])])
    end

    start = elementary_vector(n, mid)
    probs = [compute_prob(i) for i=1:n]
end
Below is an animation of our random walk after some number of steps.
You can examine the code to see how this is implemented.
steps = 100
begin
    using Plots  # assumed: the original notebook loads Plots in a cell not shown here

    function transform(probs)
        n = length(probs)
        A = sparse(zeros(n, n))
        for i=1:n
            j = rand(Float64) < probs[i] ? i + 1 : i - 1
            if 1 <= j <= n
                A[j,i] = 1
            end
        end
        return A
    end

    function nz_position(sparse_matrix, j=1)
        I, V = findnz(sparse_matrix[1:end, j])
        return length(I) > 0 ? I[1] : 0
    end

    Base.@kwdef mutable struct RandomWalk
        x::Int64 = mid
    end

    function step!(w::RandomWalk)
        if w.x != 0 && w.x != n + 1
            T = transform(probs)
            curr_vector = elementary_vector(n, w.x)
            next_vector = T * curr_vector
            x = nz_position(next_vector)
            w.x = (w.x == n && x == 0) ? n + 1 : x
        end
    end

    plt = plot(
        1,
        xlim = (1, steps),
        ylim = (0, n + 1),
        title = "Random walk",
        marker = 1,
        legend = false,
        markerstrokewidth = 0,
    )

    walk = RandomWalk()

    @gif for j=1:steps  # animate the walk; "every 1" keeps every frame
        push!(plt, 1, walk.x)
        step!(walk)
    end every 1
    # plot(plt)
end
An essential question to ask is: after 100 steps, how likely are we to have a given amount of money?
We can approximate the answer by simulating our random walk for 100 steps over a large number of trials and making a histogram of the outcomes.
trials = 10000
begin
    empirical_distribution = [0 for j=1:n + 2]
    for t=1:trials
        test = RandomWalk()
        for j=1:steps
            step!(test)
        end
        empirical_distribution[test.x + 1] += 1
    end
    empirical_distribution = empirical_distribution / trials
end
Note the gaps in the above plot. Every other bar is zero because after 100 steps starting at 20, your current amount of money must be even.
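The parity claim is easy to check directly: each step changes the amount by one, so after an even number of steps the amount has the same parity as the start. A quick standalone sketch of that observation:

```julia
# simulate 100 fair ±1 steps starting from 20
amount = 20
for j in 1:100
    global amount += rand(Bool) ? 1 : -1
end
iseven(amount)   # true: parity flips each step, and 20 + 100 is even
```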
This is time consuming to calculate!
There is a better way.
It turns out that we can compute the exact distribution of states after 100 steps by constructing a certain transition matrix T and multiplying this matrix with itself 100 times.
Here is the transition matrix:
begin
    function transition_matrix(probs)
        n = length(probs)
        A = zeros(n + 2, n + 2)
        for i=1:n
            A[i, i + 1] = 1 - probs[i]
            A[i + 2, i + 1] = probs[i]
        end
        A[1, 1] = 1
        A[n + 2, n + 2] = 1
        return A
    end

    T = transition_matrix(probs)
    spy(T)
end
Here is T^100:
T^steps
The theoretical distribution of states is computed by multiplying T^100 by this starting vector:
begin
    start_with_boundary = vcat([0], start, [0])
    theoretical_distribution = T^steps * start_with_boundary
end
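To see the mechanics on a tiny scale, here is the same computation for a 3-state walk (one interior state between two absorbing barriers) with p = 1/2; this miniature example is not part of the notebook:

```julia
# 3 states: broke (absorbing), middle, rich (absorbing);
# column j holds the transition probabilities out of state j
T3 = [1.0 0.5 0.0;
      0.0 0.0 0.0;
      0.0 0.5 1.0]
start3 = [0.0; 1.0; 0.0]     # begin in the middle state

dist = T3^100 * start3       # distribution of states after 100 steps
# dist == [0.5, 0.0, 0.5]: equal chances of ending at either barrier
```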
The graph below should match the empirical distribution we calculated closely, if the number of trials is large enough.
One benefit of the theoretical approach is we can actually figure out the limiting behavior of our random walk as the number of steps grows to infinity. This is not feasible to compute empirically.
begin
    tsteps = 100000000 * steps
    ttheoretical_distribution = T^tsteps * start_with_boundary
    bar(ttheoretical_distribution, color=:green, legend=false, title="Theoretical distribution of states after $(tsteps) steps")
end
In the constant probability 1/2 case, this last graph shows that if you continue long enough then you have a 1/2 chance of losing all your money and a 1/2 chance of making it to 40. With sufficient time, every other possible outcome becomes vanishingly unlikely.
Other choices of probabilities may lead to different long term outcomes.