Notebook 2 – Math 2121, Fall 2020
Today we will see some plotting and random functions in Julia, as we explore the meaning of (reduced) echelon form and the row reduction algorithm.
x
md"# Notebook 2 -- Math 2121, Fall 2020
Today we will see some plotting and random functions in Julia, as we explore the meaning of (reduced) echelon form and the row reduction algorithm."
Running this notebook (optional)
If you have Pluto up and running, you can access the notebook we are currently viewing by entering this link (right click -> Copy Link) in the Open from file menu in Pluto.
x
md"## Running *this* notebook (optional)
If you have Pluto up and running, you can access the notebook we are currently viewing by entering [this link](http://www.math.ust.hk/~emarberg/teaching/2020/Math2121/julia/02_Math2121_Fall2020.jl) (right click -> Copy Link) in the *Open from file* menu in Pluto.
"
Plotting in Pluto
There are some good packages for making graphics in Julia. Today we'll use Plots.
xxxxxxxxxx
md"## Plotting in Pluto
There are some good packages for making graphics in Julia. Today we'll use _Plots_.
"
xxxxxxxxxx
begin
# This sets up a clean package environment
import Pkg
Pkg.activate(mktempdir())
end
x
begin
# This add s the Plots package to our environment, and imports it
using Plots
plotly()
end
Echelon form and random matrices
Let's write some functions to determine whether a matrix is in echelon form.
x
md"## Echelon form and random matrices
Let's write some functions to determine whether a matrix is in echelon form.
"
find_leading (generic function with 3 methods)
xxxxxxxxxx
begin
# Returns true if row has all zeros.
function is_zero_row(A, row, tolerance=10e-16)
(m, n) = size(A)
return all([abs(A[row,col]) < tolerance for col=1:n])
end
# Returns first index of nonzero entry in row, or 0 if none exists.
function find_leading(A, row=1, tolerance=10e-16)
(m, n) = size(A)
for col=1:n
if abs(A[row, col]) > tolerance
return col
end
end
return 0
end
end
3
xxxxxxxxxx
find_leading([0 0 7 0 5])
in_echelon_form (generic function with 2 methods)
x
function in_echelon_form(A, tolerance=10e-12)
(m, n) = size(A)
# check locations of zero rows
seen_zero_row = false
for row=1:m
if is_zero_row(A, row, tolerance)
seen_zero_row = true
elseif seen_zero_row
return false
end
end
# check alignment of leading entries in nonzero rows
for row=2:m
if is_zero_row(A, row, tolerance)
continue
end
if find_leading(A, row, tolerance) <= find_leading(A, row - 1, tolerance)
return false
end
end
return true
end
in_echelon_form_verbose (generic function with 2 methods)
xxxxxxxxxx
function in_echelon_form_verbose(A, tolerance=10e-12)
(m, n) = size(A)
seen_zero_row = false
for row=1:m
if is_zero_row(A, row, tolerance)
seen_zero_row = true
elseif seen_zero_row
return Text(string("false: row ", row, " is a nonzero row below a zero row"))
end
end
for row=2:m
if is_zero_row(A, row, tolerance)
continue
end
if find_leading(A, row, tolerance) <= find_leading(A, row - 1, tolerance)
return Text(string("false: leading entry in row ", row, " is not to the right of the leading entry in row ", row - 1))
end
end
return Text("true: matrix is in echelon form")
end
in_reduced_echelon_form (generic function with 2 methods)
xxxxxxxxxx
function in_reduced_echelon_form(A, tolerance=10e-12)
(m, n) = size(A)
# check if in echelon form
if ! in_echelon_form(A, tolerance)
return false
end
for row=1:m
if ! is_zero_row(A, row, tolerance)
col = find_leading(A, row, tolerance)
# check that leading entry is 1
if abs(A[row,col] - 1) > tolerance
return false
end
# check that leading entry is only nonzero entry in column
if any([i != row && abs(A[i,col]) > tolerance for i=1:m])
return false
end
end
end
return true
end
in_reduced_echelon_form_verbose (generic function with 2 methods)
xxxxxxxxxx
function in_reduced_echelon_form_verbose(A, tolerance=10e-12)
(m, n) = size(A)
if ! in_echelon_form(A, tolerance)
return Text("false: not in echelon form")
end
for row=1:m
if ! is_zero_row(A, row, tolerance)
col = find_leading(A, row, tolerance)
if abs(A[row,col] - 1) > tolerance
return Text(string("false: leading entry in row ", row, " is not 1"))
end
if any([i != row && abs(A[i,col]) > tolerance for i=1:m])
return Text(string("false: leading entry in row ", row, " is not the only nonzero entry in its column"))
end
end
end
return Text("true: matrix is in reduced echelon form")
end
4×9 Array{Int64,2}:
0 0 5 1 2 3 4 5 6
0 0 0 6 8 9 0 2 0
0 0 0 0 0 0 7 8 9
0 0 0 0 0 0 0 0 0
xxxxxxxxxx
A = [ 0 0 5 1 2 3 4 5 6;
0 0 0 6 8 9 0 2 0;
0 0 0 0 0 0 7 8 9;
0 0 0 0 0 0 0 0 0]
true
x
in_echelon_form(A)
true: matrix is in echelon form
x
in_echelon_form_verbose(A)
false
x
in_reduced_echelon_form(A)
false: leading entry in row 1 is not 1
x
in_reduced_echelon_form_verbose(A)
It's easy to make random matrices using the Random package.
x
md"It's easy to make random matrices using the _Random_ package."
xxxxxxxxxx
using Random
3×5 Array{Int64,2}:
3 2 3 2 2
3 3 2 2 -3
3 -1 0 -3 1
x
B = rand(-3:3, 3, 5)
false: leading entry in row 2 is not to the right of the leading entry in row 1
xxxxxxxxxx
in_echelon_form_verbose(B)
false: not in echelon form
x
in_reduced_echelon_form_verbose(B)
If we generate a random matrix in this way, where each entry is chosen at random from a specified list or range, then what we get is almost never in echelon form. This is because too many entries are nonzero.
For a random matrix to have good chances of being in echelon form, we must make it more likely that a given entry is zero.
Whether a matrix is in echelon form does not depend on the values of its nonzero entries, only their positions (note that this is not true for reduced echelon form). So let's write a function that generates a random matrix with all entries either 0 (zero) or 1 (nonzero).
This function takes a parameter that controls how likely a given entry is to be zero.
xxxxxxxxxx
md"If we generate a random matrix in this way, where each entry is chosen at random from a specified list or range, then what we get is almost never in echelon form. This is because too many entries are nonzero.
For a random matrix to have good chances of being in echelon form, we must make it more likely that a given entry is zero.
Whether a matrix is in echelon form does not depend on the values of its nonzero entries, only their positions (note that this is not true for *reduced* echelon form).
So let's write a function that generates a random matrix with all entries either 0 (zero) or 1 (nonzero).
This function takes a parameter that controls how likely a given entry is to be zero."
random_boolean_matrix (generic function with 1 method)
x
function random_boolean_matrix(m, n, zero_probability)
A = rand(m, n)
for i=1:m
for j=1:n
A[i, j] = Int(A[i, j] > zero_probability)
end
end
return A
end
3×5 Array{Float64,2}:
0.0 1.0 0.0 0.0 1.0
0.0 0.0 1.0 1.0 1.0
0.0 1.0 0.0 1.0 1.0
xxxxxxxxxx
C = random_boolean_matrix(3, 5, 0.5)
false: leading entry in row 3 is not to the right of the leading entry in row 2
xxxxxxxxxx
in_echelon_form_verbose(C)
Now let's try the following experiment.
For different values of p
between 0.0 and 1.0, we'll generate a bunch of random 01-matrices of a given size, where the chance that a given entry is 0 is p
.
We'll graph what proportion of the time this matrix is in echelon form, as a function of p
.
xxxxxxxxxx
md"""
Now let's try the following experiment.
For different values of `p` between 0.0 and 1.0, we'll generate a bunch of random 01-matrices of a given size, where the chance that a given entry is 0 is `p`.
We'll graph what proportion of the time this matrix is in echelon form, as a function of `p`.
"""
Choose dimensions
nrows =
ncols =
xxxxxxxxxx
begin
using PlutoUI
rowslider = nrows Slider(1:10, default=2, show_value=true)
colslider = ncols Slider(1:10, default=3, show_value=true)
md"""**Choose dimensions**
`nrows = `$(rowslider)
`ncols = `$(colslider)
"""
end
1000
ntrials = 1000 # size of bunch to generate for each value of p
2×2 Array{Float64,2}:
0.0 0.0
0.0 0.0
R = random_boolean_matrix(nrows, ncols, 0.5) # typical random matrix of this size
true: matrix is in echelon form
in_echelon_form_verbose(R)
x
begin
# returns average number of times that random 01-matrix is in echelon form
function accumulate_echelon(m, n, zero_prob, trials)
s = 0
for i=1:trials
s += Int(in_echelon_form(random_boolean_matrix(m, n, zero_prob)))
end
return s / trials
end
zero_probs = 0:0.001:1.0
likelihoods = [accumulate_echelon(nrows, ncols, p, ntrials) for p in zero_probs]
plot(
zero_probs,
likelihoods,
xlabel="Probablity p that individual matrix entry is zero",
ylabel="Fraction of $(nrows)-by-$(ncols) matrices",
label=["Echelon form" "Reduced echelon form"],
title="How many random 01-matrices are in echelon form?",
legend=false
)
# plot!(zero_probs, [x^(2) for x in zero_probs])
end
If p == 1.0
then our random matrix is always the zero matrix which is in echelon form. This is why when p
is 1.0 the value of the graph is 1.0.
x
md"If `p == 1.0` then our random matrix is always the *zero matrix* which is in echelon form. This is why when `p` is 1.0 the value of the graph is 1.0."
2×2 Array{Float64,2}:
0.0 0.0
0.0 0.0
x
random_boolean_matrix(nrows, ncols, 1.0)
true: matrix is in echelon form
x
in_echelon_form_verbose(random_boolean_matrix(nrows, ncols, 1.0))
If p == 0.0
then our random matrix is always the matrix of all ones which is never in echelon form if nrows > 1
. This is why when p
is 0.0 the value of the graph is 0.0.
x
md"If `p == 0.0` then our random matrix is always the *matrix of all ones* which is never in echelon form if `nrows > 1`. This is why when `p` is 0.0 the value of the graph is 0.0."
2×2 Array{Float64,2}:
1.0 1.0
1.0 1.0
xxxxxxxxxx
random_boolean_matrix(nrows, ncols, 0.0)
false: leading entry in row 2 is not to the right of the leading entry in row 1
xxxxxxxxxx
in_echelon_form_verbose(random_boolean_matrix(nrows, ncols, 0.0))
Reduction to reduced echelon form
Below is a function RREF
that converts a matrix to its reduced echelon form. You can unhide the code by clicking the icon on the left, if you want to take a look.
x
md"## Reduction to reduced echelon form
Below is a function `RREF` that converts a matrix to its reduced echelon form. You can unhide the code by clicking the icon on the left, if you want to take a look.
"
RREF (generic function with 2 methods)
xxxxxxxxxx
begin
function find_nonzeros_in_column(A, row_start, col, tol=10e-12)
(m, n) = size(A)
return [i for i=row_start:m if abs(A[i,col]) > tol]
end
function columns(A)
(m, n) = size(A)
return n
end
function rowop_replace(A, source_row, target_row, scalar_factor)
source_row != target_row
for j=1:columns(A)
A[target_row,j] += A[source_row,j] * scalar_factor
end
return A
end
function rowop_scale(A, target_row, scalar_factor)
scalar_factor != 0
for j=1:columns(A)
A[target_row,j] *= scalar_factor
end
A
end
function rowop_swap(A, s, t)
for j=1:columns(A)
A[t,j], A[s,j] = A[s,j], A[t,j]
end
A
end
function RREF_with_rowop_count(A, tol=10e-12)
rowop_count = 0
A = float(copy(A))
(m, n) = size(A)
row = 1
for col=1:n
nonzeros = find_nonzeros_in_column(A, row, col, tol)
if length(nonzeros) == 0
continue
end
i = nonzeros[1]
if abs(A[i, col] - 1) < tol
A[i, col] = 1
else
A = rowop_scale(A, i, 1 / A[i, col])
rowop_count += 1
end
for j=nonzeros[2:end]
A = rowop_replace(A, i, j, -A[j, col])
rowop_count += 1
end
if i != row
A = rowop_swap(A, row, i)
rowop_count += 1
end
row += 1
end
for row=m:-1:1
l = find_leading(A, row, tol)
if l > 0
for k=1:row - 1
if abs(A[k,l]) > tol
rowop_replace(A, row, k, -A[k,l])
rowop_count += 1
end
end
end
end
# round entries to give nicer output; could omit this step
for row=1:m
for col=1:n
if abs(A[row,col] - round(A[row,col])) < tol
A[row,col] = round(A[row,col])
end
end
end
return A, rowop_count
end
function RREF_RowOpCount(A, tol=10e-12)
A, ops = RREF_with_rowop_count(A, tol)
return ops
end
function RREF(A, tol=10e-12)
A, ops = RREF_with_rowop_count(A, tol)
return A
end
end
Here is some more code to compute and count the pivot positions in a matrix.
xxxxxxxxxx
md"Here is some more code to compute and count the pivot positions in a matrix."
pivot_positions (generic function with 2 methods)
x
function pivot_positions(A, tol=10e-12)
rref = RREF(A, tol)
return [
(i, find_leading(rref, i))
for i=1:size(A)[1] if find_leading(rref, i) > 0
]
end
npivots (generic function with 2 methods)
x
function npivots(A, tol=10e-12)
return length(pivot_positions(A, tol))
end
3×4 Array{Int64,2}:
0 3 -6 6
3 -7 8 -5
3 -9 12 -9
x
M = [0 3 -6 6; 3 -7 8 -5; 3 -9 12 -9]
3×4 Array{Float64,2}:
1.0 0.0 -2.0 3.0
0.0 1.0 -2.0 2.0
0.0 0.0 -0.0 0.0
xxxxxxxxxx
RREF(M)
true
xxxxxxxxxx
in_reduced_echelon_form(RREF(M))
1
1
2
2
x
pivot_positions(M)
2
xxxxxxxxxx
npivots(M)
How many pivot positions does a random N-by-N matrix have?
The possible values range from zero to N.
xxxxxxxxxx
md"How many pivot positions does a random N-by-N matrix have?
The possible values range from zero to N.
"
M =
N =
x
begin
rslider = n1 Slider(1:20, default=2, show_value=true)
sslider = n2 Slider(1:20, default=2, show_value=true)
md"""
`M = `$(rslider)
`N = `$(sslider)
"""
end
0
1
xxxxxxxxxx
value_range = 0:1
0
0
1
23
301
675
xxxxxxxxxx
begin
pivot_counts = [0 for i=0:n1]
for i=1:1000
pivot_counts[1 + npivots(rand(value_range, n1, n2))] += 1
end
pivot_counts
end
xxxxxxxxxx
pie(pivot_counts, legend = false, title="Counting $(n1)-by-$(n2) matrices by number of pivot positions")
We see from these examples that almost every random matrix, of sufficiently large size with sufficiently unconstrained entries, has the largest possible number of pivot positions (which is the whichever of M or N is larger).
xxxxxxxxxx
md"We see from these examples that almost every random matrix, of sufficiently large size with sufficiently unconstrained entries, has the largest possible number of pivot positions (which is the whichever of M or N is larger)."
Effort to compute RREF(A)
One final experiment for today.
Given a typical matrix A, how many row operations should we have to perform to compute RREF(A)? Could we get unlucky and have to do many more operations in one case compared to another?
To investigate, let's do the following. To simplify things, we'll just consider square matrices, and we'll just generate random 01-matrices as above.
For each probabilty p controlling how likely a given matrix entry is zero, we will generate a large number of random 01-matrices.
For these matrices A, we'll calculate the average number of row operations needed to compute RREF(A)
and the standard deviation. Then we can plot both as a function of p.
xxxxxxxxxx
md"""## Effort to compute `RREF(A)`
One final experiment for today.
Given a typical matrix A, how many row operations should we have to perform to compute RREF(A)? Could we get unlucky and have to do many more operations in one case compared to another?
To investigate, let's do the following. To simplify things, we'll just consider square matrices, and we'll just generate random 01-matrices as above.
For each probabilty p controlling how likely a given matrix entry is zero, we will generate a large number of random 01-matrices.
For these matrices A, we'll calculate the average number of row operations needed to compute `RREF(A)` and the standard deviation. Then we can plot both as a function of p.
"""
1000
xxxxxxxxxx
rtrials = 1000
N =
xxxxxxxxxx
begin
zslider = z Slider(2:200, default=2, show_value=true)
md"""
`N = `$(zslider)
"""
end
x
begin
function accumulate_rowops(m, n, zero_prob, trials)
s = 0
for i=1:trials
# s += RREF_RowOpCount(rand(-1:1,m, n))
s += RREF_RowOpCount(random_boolean_matrix(m, n, zero_prob))
end
mean = s / trials
return mean
end
function accumulate_rowops_std(m, n, zero_prob, trials, mean)
s = 0
for i=1:trials
# s += (RREF_RowOpCount(rand(-1:1, m, n)) - mean)^2
s += (RREF_RowOpCount(random_boolean_matrix(m, n, zero_prob)) - mean)^2
end
std = sqrt(s / (trials - 1))
return std
end
x = 0:0.01:1
k = z
averages = [accumulate_rowops(k, k, p, rtrials) for p in x]
deviations = [
accumulate_rowops_std(k, k, x[i], rtrials, averages[i])
for i=1:length(x)
]
plot(
x,
hcat(averages, deviations),
xlabel="Probablity p that individual matrix entry is zero",
ylabel="Number of row operations",
title="RowOps to compute RREF for $(k)-by-$(k) 01-matrices",
label=["mean" "std"]
)
end
Moral of the story: if the size N of our matrix is large, then at almost all values of p the average number of row operations needed to compute RREF(A)
is nearly the same, and the standard deviation is also flat.
This means that if N is large then in some sense we can't get too unlucky: all computations of RREF(A)
will take close to the same number of operations.
But this is only in the limit. When N is small, there is some interesting variation in the number of row operations needed.
x
md"Moral of the story: if the size N of our matrix is large, then at almost all values of p the average number of row operations needed to compute `RREF(A)` is nearly the same, and the standard deviation is also flat.
This means that if N is large then in some sense we can't get too unlucky: all computations of `RREF(A)` will take close to the same number of operations.
But this is only in the limit. When N is small, there is some interesting variation in the number of row operations needed."
Enter cell code...
xxxxxxxxxx