Lazy arrays and linear algebra in Julia
This package supports lazy analogues of array operations like vcat, hcat,
and multiplication. This helps with the implementation of matrix-free methods
for iterative solvers.
The package has been designed with high-performance in mind, so should outperform
the non-lazy analogues from Base for many operations like copyto! and broadcasting.
Some operations will be inherently slower due to extra computation, like getindex.
Please file an issue for any examples that are significantly slower than their
the analogue in Base.
To construct a lazy representation of a function call f(x,y,z...), use the command
applied(f, x, y, z...). This will return an unmaterialized object typically of type Applied
that represents the operation. To realize that object, call materialize, which
will typically be equivalent to calling f(x,y,z...). A macro @~ is available as a shorthand:
julia> using LazyArrays, LinearAlgebra
julia> applied(exp, 1)
Applied(exp,1)
julia> materialize(applied(exp, 1))
2.718281828459045
julia> materialize(@~ exp(1))
2.718281828459045
julia> exp(1)
2.718281828459045Note that @~ causes sub-expressions to be wrapped in an applied call, not
just the top-level expression. This can lead to MethodErrors when lazy
application of sub-expressions is not yet implemented. For example,
julia> @~ Vector(1:10) .* ones(10)'
ERROR: MethodError: ...
julia> A = Vector(1:10); B = ones(10); (@~ sum(A .* B')) |> materialize
550.0The benefit of lazy operations is that they can be materialized in-place,
possible using simplifications. For example, it is possible to
do BLAS-like Matrix-Vector operations of the form α*A*x + β*y as
implemented in BLAS.gemv! using a lazy applied object:
julia> A = randn(5,5); b = randn(5); c = randn(5); d = similar(c);
julia> d .= @~ 2.0 * A * b + 3.0 * c # Calls gemv!
5-element Vector{Float64}:
-2.5366335879717514
-5.305097174484744
-9.818431932350942
2.421562605495651
0.26792916096572983
julia> 2*(A*b) + 3c
5-element Vector{Float64}:
-2.5366335879717514
-5.305097174484744
-9.818431932350942
2.421562605495651
0.26792916096572983
julia> function mymul(A, b, c, d) # need to put in function for benchmarking
d .= @~ 2.0 * A * b + 3.0 * c
end
mymul (generic function with 1 method)
julia> using BenchmarkTools
julia> @btime mymul(A, b, c, d) # calls gemv!
77.444 ns (0 allocations: 0 bytes)
5-element Vector{Float64}:
-2.5366335879717514
-5.305097174484744
-9.818431932350942
2.421562605495651
0.26792916096572983
julia> @btime 2*(A*b) + 3c; # does not call gemv!
241.659 ns (4 allocations: 512 bytes)This also works for inverses, which lower to BLAS calls whenever possible:
julia> A = randn(5,5); b = randn(5); c = similar(b);
julia> c .= @~ A \ b
5-element Vector{Float64}:
-2.5366335879717514
-5.305097174484744
-9.818431932350942
2.421562605495651
0.26792916096572983Often we want lazy realizations of matrices, which are supported via ApplyArray.
For example, the following creates a lazy matrix exponential:
julia> E = ApplyArray(exp, [1 2; 3 4])
exp(2×2 Matrix{Int64}):
51.969 74.7366
112.105 164.074 A lazy matrix exponential is useful for, say, in-place matrix-exponential*vector:
julia> b = Vector{Float64}(undef, 2); b .= @~ E*[4,4]
2-element Vector{Float64}:
506.8220830628333
1104.7145995988594While this works, it is not actually optimised (yet).
Other options do have special implementations that make them fast. We now give some examples.
Lazy vcat and hcat allow for representing the concatenation of
vectors without actually allocating memory, and support a fast
copyto! for allocation-free population of a vector.
julia> using BenchmarkTools
julia> A = ApplyArray(vcat,1:5,2:3) # allocation-free
vcat(5-element UnitRange{Int64}, 2-element UnitRange{Int64}):
1
2
3
4
5
2
3
julia> Vector(A) == vcat(1:5, 2:3)
true
julia> b = Array{Int}(undef, length(A)); @btime copyto!(b, A);
26.670 ns (0 allocations: 0 bytes)
julia> @btime vcat(1:5, 2:3); # takes twice as long due to memory creation
43.336 ns (1 allocation: 144 bytes)Similar is the lazy analogue of hcat:
julia> A = ApplyArray(hcat, 1:3, randn(3,10))
hcat(3-element UnitRange{Int64}, 3×10 Matrix{Float64}):
1.0 1.16561 0.224871 -1.36416 -0.30675 0.103714 0.590141 0.982382 -1.50045 0.323747 -1.28173
2.0 1.04648 1.35506 -0.147157 0.995657 -0.616321 -0.128672 -0.671445 -0.563587 -0.268389 -1.71004
3.0 -0.433093 -0.325207 -1.38496 -0.391113 -0.0568739 -1.55796 -1.00747 0.473686 -1.2113 0.0119156
julia> Matrix(A) == hcat(A.args...)
true
julia> B = Array{Float64}(undef, size(A)...); @btime copyto!(B, A);
109.625 ns (1 allocation: 32 bytes)
julia> @btime hcat(A.args...); # takes twice as long due to memory creation
274.620 ns (6 allocations: 560 bytes)We can represent Kronecker products of arrays without constructing the full array:
julia> A = randn(2,2); B = randn(3,3);
julia> K = ApplyArray(kron,A,B)
kron(2×2 Matrix{Float64}, 3×3 Matrix{Float64}):
-1.08736 -0.19547 -0.132824 1.60531 0.288579 0.196093
0.353898 0.445557 -0.257776 -0.522472 -0.657791 0.380564
-0.723707 0.911737 -0.710378 1.06843 -1.34603 1.04876
1.40606 0.252761 0.171754 -0.403809 -0.0725908 -0.0493262
-0.457623 -0.576146 0.333329 0.131426 0.165464 -0.0957293
0.935821 -1.17896 0.918584 -0.26876 0.338588 -0.26381
julia> C = Matrix{Float64}(undef, 6, 6); @btime copyto!(C, K);
61.528 ns (0 allocations: 0 bytes)
julia> C == kron(A,B)
trueBase includes a lazy broadcast object called Broadcasting, but this is
not a subtype of AbstractArray. Here we have BroadcastArray which replicates
the functionality of Broadcasting while supporting the array interface.
julia> A = randn(6,6);
julia> B = BroadcastArray(exp, A);
julia> Matrix(B) == exp.(A)
true
julia> B = BroadcastArray(+, A, 2);
julia> B == A .+ 2
trueSuch arrays can also be created using the macro @~ which acts on ordinary
broadcasting expressions combined with LazyArray:
julia> C = rand(1000)';
julia> D = LazyArray(@~ exp.(C))
julia> E = LazyArray(@~ @. 2 + log(C))
julia> @btime sum(LazyArray(@~ C .* C'); dims=1) # without `@~`, 1.438 ms (5 allocations: 7.64 MiB)
74.425 μs (7 allocations: 8.08 KiB)