Notebook 7 – Math 2121, Fall 2020

Today's notebook will investigate (linear) vector valued functions, the operations we can perform on them, and their standard matrices. Then we'll also talk about how to use linear algebra to analyze a random walk.

Running this notebook (optional)

If you have Pluto up and running, you can access the notebook we are currently viewing by entering this link (right click -> Copy Link) in the Open from file menu in Pluto.


Vector valued functions

The following code creates a struct to represent functions $\mathbb{R}^n \to \mathbb{R}^m$.
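The code cell itself is not reproduced in this text version, but a minimal sketch of such a struct could look like the following (the field names n, m, and fn are illustrative assumptions):

```julia
# A function ℝⁿ → ℝᵐ, stored as its domain dimension, codomain dimension, and rule.
struct VectorFunction
    n::Int          # dimension of the domain ℝⁿ
    m::Int          # dimension of the codomain ℝᵐ
    fn::Function    # the rule v ↦ fn(v), sending length-n inputs to length-m outputs
end
```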


We can evaluate a VectorFunction using this method:
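A sketch of what this method could look like, given the struct sketched above:

```julia
# Apply a VectorFunction to a vector (or n×1 column), after checking the input dimension.
function evaluate(F::VectorFunction, v::AbstractVecOrMat)
    @assert length(v) == F.n "expected an input with $(F.n) entries"
    return F.fn(v)
end

# Example: the identity function on ℝ³.
identity_on_R3 = VectorFunction(3, 3, v -> v)
evaluate(identity_on_R3, [1.0, 2.0, 3.0])    # returns [1.0, 2.0, 3.0]
```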

evaluate (generic function with 1 method)
some_function
10000

Operations on vector valued functions

As usual, we can add, subtract, rescale, and compose vector valued functions.
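Sketches of these four operations, under the same assumptions as the struct above (argument order and error checking in the actual notebook may differ):

```julia
# (F + G)(v) = F(v) + G(v); both functions must share domain and codomain.
add_vector_functions(F, G) = VectorFunction(F.n, F.m, v -> evaluate(F, v) + evaluate(G, v))

# (F - G)(v) = F(v) - G(v)
subtract_vector_functions(F, G) = VectorFunction(F.n, F.m, v -> evaluate(F, v) - evaluate(G, v))

# (c·F)(v) = c·F(v)
scale_vector_functions(c, F) = VectorFunction(F.n, F.m, v -> c * evaluate(F, v))

# (F ∘ G)(v) = F(G(v)); G maps ℝⁿ → ℝᵐ and F maps ℝᵐ → ℝᵏ.
compose_vector_functions(F, G) = VectorFunction(G.n, F.m, v -> evaluate(F, evaluate(G, v)))
```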

add_vector_functions (generic function with 1 method)
subtract_vector_functions (generic function with 1 method)
scale_vector_functions (generic function with 1 method)
compose_vector_functions (generic function with 1 method)

Below are some methods to test linearity and equality for our VectorFunctions.
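In the notebook these functions each have several methods; one plausible approach, sketched below, checks the defining properties on randomly chosen inputs, so it certifies linearity only up to numerical tolerance and random sampling:

```julia
# Check F(u + v) ≈ F(u) + F(v) and F(c·u) ≈ c·F(u) for many random u, v and scalars c.
function is_linear(F::VectorFunction; trials=100)
    for _ in 1:trials
        u, v, c = randn(F.n), randn(F.n), randn()
        additive    = evaluate(F, u + v) ≈ evaluate(F, u) + evaluate(F, v)
        homogeneous = evaluate(F, c * u) ≈ c * evaluate(F, u)
        (additive && homogeneous) || return false
    end
    return true
end

# A function is (numerically) zero if it sends random inputs to (numerically) zero vectors.
is_zero(F::VectorFunction; trials=100, tol=1e-10) =
    all(maximum(abs.(evaluate(F, randn(F.n)))) < tol for _ in 1:trials)

# Two functions are equal if they have the same domain/codomain and agree on random inputs.
are_equal(F::VectorFunction, G::VectorFunction; trials=100) =
    F.n == G.n && F.m == G.m &&
    all(evaluate(F, v) ≈ evaluate(G, v) for v in (randn(F.n) for _ in 1:trials))
```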

is_linear (generic function with 3 methods)
is_zero (generic function with 3 methods)
are_equal (generic function with 3 methods)

Let's test these methods. The following creates the vector valued function with the formula

$$\begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} \mapsto \begin{bmatrix} v_1 v_2 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}.$$

This is nonlinear.
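A sketch matching the formula above, under the same assumptions as before:

```julia
# Replace the first entry by the product v₁·v₂ and leave the remaining entries alone.
nonlinear_function(n) = VectorFunction(n, n, v -> [v[1] * v[2]; v[2:end]])
```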

nonlinear_function (generic function with 1 method)
nonlinear
false

Row operations are linear transformations

We can implement our three types of elementary row operations as linear transformations.
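Sketches of the three operations as VectorFunctions $\mathbb{R}^n \to \mathbb{R}^n$ (the argument conventions are assumptions; compare with the standard matrices computed below):

```julia
# Interchange entries i and j.
swap_rows(n, i, j) = VectorFunction(n, n, v -> begin
    w = float.(v); w[i], w[j] = w[j], w[i]; w
end)

# Add c times entry j to entry i.
add_rows(n, c, i, j) = VectorFunction(n, n, v -> begin
    w = float.(v); w[i] += c * w[j]; w
end)

# Rescale entry i by a nonzero constant c.
scale_rows(n, c, i) = VectorFunction(n, n, v -> begin
    w = float.(v); w[i] *= c; w
end)
```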

swap_rows (generic function with 1 method)
add_rows (generic function with 1 method)
scale_rows (generic function with 1 method)
vec
true
true
true

For any linear vector valued function, there is a unique standard matrix, which we can compute with the following methods.
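The idea: if $F : \mathbb{R}^n \to \mathbb{R}^m$ is linear, then $F(v) = Av$ where the $i$-th column of $A$ is $F(e_i)$, so the standard matrix can be found by evaluating $F$ at the standard basis vectors. A sketch under the assumptions above:

```julia
# elementary_vector(n, i) returns the standard basis vector eᵢ of ℝⁿ as an n×1 column
# (matching the 4×1 output below).
function elementary_vector(n, i)
    e = zeros(n, 1)
    e[i, 1] = 1.0
    return e
end

# Evaluate F at e₁, …, eₙ and use the results as the columns of the standard matrix.
function standard_matrix(F::VectorFunction)
    A = zeros(F.m, F.n)
    for i in 1:F.n
        A[:, i] = vec(evaluate(F, elementary_vector(F.n, i)))
    end
    return A
end
```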

elementary_vector (generic function with 1 method)
4×1 Array{Float64,2}:
 0.0
 0.0
 0.0
 1.0
standard_matrix (generic function with 1 method)
5×5 Array{Float64,2}:
 1.0  0.0  0.0  0.0  0.0
 0.0  1.0  0.0  0.0  0.0
 0.0  0.0  1.0  0.0  0.0
 0.0  0.0  0.0  1.0  0.0
 0.0  0.0  0.0  0.0  1.0
SM1
4×4 Array{Float64,2}:
 0.0  0.0  0.0  1.0
 0.0  1.0  0.0  0.0
 0.0  0.0  1.0  0.0
 1.0  0.0  0.0  0.0
SM2
4×4 Array{Float64,2}:
 1.0    0.0  0.0  0.0
 0.0    1.0  0.0  0.0
 0.0    0.0  1.0  0.0
 0.0  444.0  0.0  1.0
SM3
4×4 Array{Float64,2}:
 1.0  0.0  0.0  0.0
 0.0  1.0  0.0  0.0
 0.0  0.0  2.4  0.0
 0.0  0.0  0.0  1.0
true
true
true

RREF(A) is a linear function

Because row operations are linear transformations, we can think of the row reduction algorithm as returning, instead of a matrix in reduced echelon form, the linear transformation formed by composing all of the row operations we performed to change our starting matrix to echelon form.

The slightly modified code below reimplements our RREF method so that it returns this vector valued function.


Some helper methods:
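Only rowop_swap appears in the output below. One minimal possibility, assuming these helpers mirror the row operations defined earlier, is to reuse them directly (rowop_add and rowop_scale are assumptions about the unseen code):

```julia
# Package each elementary row operation on ℝᵐ as a linear VectorFunction,
# so the operations performed during row reduction can be recorded and composed.
rowop_swap(m, i, j)   = swap_rows(m, i, j)
rowop_add(m, c, i, j) = add_rows(m, c, i, j)
rowop_scale(m, c, i)  = scale_rows(m, c, i)
```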

rowop_swap (generic function with 1 method)

Adjusted RREF algorithm:
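A sketch of the adjusted algorithm, under the same assumptions: it is ordinary Gauss–Jordan elimination, except that every row operation is also composed into a single linear VectorFunction op, which is returned along with the reduced matrix. The original implementation may differ in its return value and pivoting details.

```julia
function RREF(A::AbstractMatrix)
    m, n = size(A)
    R = float.(A)                                   # working copy, converted to floats
    op = VectorFunction(m, m, v -> copy(v))         # start from the identity map on ℝᵐ

    # Apply a row operation f to R and record it in the running composition.
    function record!(f::VectorFunction)
        op = compose_vector_functions(f, op)
        for j in 1:n
            R[:, j] = vec(evaluate(f, R[:, j]))
        end
    end

    pivot = 1
    for col in 1:n
        pivot > m && break
        i = findfirst(x -> abs(x) > 1e-12, R[pivot:end, col])    # look for a usable pivot
        i === nothing && continue
        i = i + pivot - 1
        i != pivot && record!(rowop_swap(m, pivot, i))            # move the pivot row up
        record!(rowop_scale(m, 1 / R[pivot, col], pivot))         # scale the pivot to 1
        for r in 1:m                                              # clear the rest of the column
            r != pivot && abs(R[r, col]) > 1e-12 && record!(rowop_add(m, -R[r, col], r, pivot))
        end
        pivot += 1
    end
    return R, op
end
```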

RREF (generic function with 2 methods)
M
5×7 Array{Int64,2}:
 1  1   2   1   0   2  -1
 2  0  -1  -1   0  -1   1
 1  0  -1   1  -1   0   2
 0  0   0   1   2  -1   0
 0  1   2   2   1   2   0
5×7 Array{Float64,2}:
  1.0  0.0  0.0  0.0   0.0  -0.5  -1.0
 -0.0  1.0  0.0  0.0  -0.0   2.5   6.0
  0.0  0.0  1.0  0.0   0.0   0.0  -3.0
  0.0  0.0  0.0  1.0   0.0   0.0   0.0
  0.0  0.0  0.0  0.0   1.0  -0.5   0.0
rref_op_for_M
true
true
5×5 Array{Float64,2}:
  1.25  -0.25   0.25   0.75  -1.25
 -4.75   2.75  -0.75  -3.25   5.75
  2.0   -1.0    0.0    1.0   -2.0
  0.5   -0.5    0.5    0.5   -0.5
 -0.25   0.25  -0.25   0.25   0.25

Random walks with matrix multiplication

The last part of today's notebook tries to motivate the utility of matrix multiplication and some of the things we'll do later in the course.

The idea is to analyze a simple random walk.

Suppose you start with x dollars, and every minute you play a game in which you either win one dollar or lose one dollar.

Let's suppose the probability p(x) of winning is a function of how much money x you currently have. The process can stop in two ways: either you lose all of your money, or you win enough times that you reach some pre-determined amount at which you decide to quit.

You could think of this as a gambling process, or some kind of competition. Maybe you're playing a sport and your chance of winning the next match depends on your current streak of wins or losses.

We can graph your current state (the amount of money you have) as a function of time.

starting_amount = 20
maximum_amount = 39

We need to assign the upstep probabilities p(x).

The simplest choice is just to set p(x)=0.5 for all x.
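A sketch of this choice together with a simple walk simulator. The names here are illustrative, and the walk is assumed to stop when it hits 0 or 40 dollars, matching the 41 states used further below:

```julia
# Constant probability of winning the next dollar, regardless of the current amount x.
p(x) = 0.5

# Simulate the walk: start with `start` dollars and play for `steps` minutes,
# freezing once the amount reaches 0 or `top` (the quitting amount).
function simulate_walk(start, top, steps)
    path = [start]
    x = start
    for _ in 1:steps
        if 0 < x < top
            x += rand() < p(x) ? 1 : -1    # win or lose one dollar
        end
        push!(path, x)
    end
    return path
end

# e.g. simulate_walk(20, 40, 100)
```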


Below is an animation of our random walk after some number of steps.

You can examine the code to see how this is implemented.

steps = 100

(Animation: one simulated random walk, frame by frame, for the chosen number of steps.)

An essential question to ask is: after 100 steps, how likely are we to have a given amount of money?

We can approximate the answer by simulating our random walk for 100 steps over a large number of trials and making a histogram of the outcomes.
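A sketch of this empirical approach, reusing the simulator above:

```julia
# Run many independent 100-step walks and record the relative frequency of each final amount.
function empirical_distribution(start, top, steps, trials)
    counts = zeros(Int, top + 1)              # counts[x + 1] = number of walks ending with x dollars
    for _ in 1:trials
        final = simulate_walk(start, top, steps)[end]
        counts[final + 1] += 1
    end
    return counts ./ trials                   # convert counts to relative frequencies
end

# e.g. empirical_distribution(20, 40, 100, 10000)
```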

trials = 10000
(Plot: histogram of the final amount of money after 100 steps, over 10000 simulated trials.)

Note the gaps in the above plot. Every other bar is zero because after 100 steps starting at 20, your current amount of money must be even.


This is time consuming to calculate!

There is a better way.

It turns out that we can compute the exact distribution of states after 100 steps by constructing a certain transition matrix T and multiplying this matrix by itself 100 times.


Here is the transition matrix:
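A sketch of how T could be constructed. The states are the amounts 0, 1, ..., 40 (41 of them, matching the 41×41 output below), column j holds the probabilities of moving from the amount j − 1 in one step, and the amounts 0 and 40 are absorbing; the exact indexing convention is an assumption.

```julia
function transition_matrix(top, p)
    N = top + 1                      # states are the amounts 0, 1, …, top
    T = zeros(N, N)
    T[1, 1] = 1.0                    # 0 dollars is absorbing
    T[N, N] = 1.0                    # `top` dollars is absorbing
    for j in 2:N-1
        x = j - 1                    # the amount corresponding to column j
        T[j + 1, j] = p(x)           # win a dollar with probability p(x)
        T[j - 1, j] = 1 - p(x)       # lose a dollar otherwise
    end
    return T
end

T = transition_matrix(40, p)         # using the constant probability p(x) = 0.5 from above
```

With this convention, multiplying a column vector of state probabilities by T gives the probabilities one step later, so T^100 (an ordinary matrix power) describes 100 steps at once.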


Here is T^100:

41×41 Array{Float64,2}:
 1.0  0.9204107626128211      0.8423820985077439      …  4.6341381160270354e-5   0.0
 0.0  0.001560573282101458    0.0                        1.5769965692431847e-5   0.0
 0.0  0.0                     0.006062226980470583       0.0                     0.0
 0.0  0.004501653698369124    0.0                        5.4648993121546875e-5   0.0
 0.0  0.0                     0.011438164114091454       0.0                     0.0
 0.0  0.006936510415722331    0.0                     …  0.00011722351975012556  0.0
 0.0  0.0                     0.015568612266375709       0.0                     0.0
 ⋮                                                    ⋱                          ⋮
 0.0  0.00011722351975012555  0.0                     …  0.00693651041572233     0.0
 0.0  0.0                     0.00017187251287167243     0.0                     0.0
 0.0  5.4648993121546875e-5   0.0                        0.004501653698369124    0.0
 0.0  0.0                     7.041895881397872e-5       0.0                     0.0
 0.0  1.576996569243185e-5    0.0                        0.0015605732821014585   0.0
 0.0  4.634138116027036e-5    0.00010845272801297256  …  0.9204107626128213      1.0

The theoretical distribution of states after 100 steps is obtained by multiplying T^100 by a starting vector that puts all of the probability on our initial amount of 20 dollars. The result is:
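A sketch, assuming the names above:

```julia
start_vector = zeros(41, 1)
start_vector[20 + 1, 1] = 1.0            # all of the probability starts on the amount 20

theoretical_distribution = T^100 * start_vector
```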

41×1 Array{Float64,2}:
 0.04604406623625023
 0.0
 0.008758339460287078
 0.0
 0.017819383810537857
 0.0
 0.027370695429502278
 ⋮
 0.0
 0.017819383810537864
 0.0
 0.008758339460287074
 0.0
 0.04604406623625023

The graph below should closely match the empirical distribution we calculated, provided the number of trials is large enough.

(Plot: the theoretical distribution of states after 100 steps, computed from T^100.)

One benefit of the theoretical approach is that we can figure out the limiting behavior of our random walk as the number of steps grows to infinity. This is not feasible to compute empirically.
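For example, raising T to a very large power approximates this limit (a sketch):

```julia
# After enough steps the walk has almost surely been absorbed at 0 or 40 dollars,
# so this vector is concentrated (roughly half and half) on those two states.
limiting_distribution = T^10_000 * start_vector
```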

(Plot: the approximate limiting distribution of states as the number of steps grows to infinity.)

In the constant probability 1/2 case, this last graph shows that if you continue long enough then you have a 1/2 chance of losing all your money and a 1/2 chance of making it to 40. With sufficient time, every other possible outcome becomes vanishingly unlikely.


Other choices of probabilities may lead to different long-term outcomes.
