Sparse Array Utilities
Sparse Constructors
In addition to the Tensor
constructor, Finch provides a number of convenience constructors for common tensor types. For example, the spzeros
and sprand
functions have fspzeros
and fsprand
counterparts that return Finch tensors. We can also construct a sparse COO Tensor
from a list of indices and values using the fsparse
function.
Finch.fsparse
— Functionfsparse(I::Tuple, V,[ M::Tuple, combine]; fill_value=zero(eltype(V)))
Create a sparse COO tensor S
such that size(S) == M
and S[(i[q] for i = I)...] = V[q]
. The combine function is used to combine duplicates. If M
is not specified, it is set to map(maximum, I)
. If the combine function is not supplied, combine defaults to +
unless the elements of V are Booleans in which case combine defaults to |
. All elements of I must satisfy 1 <= I[n][q] <= M[n]. Numerical zeros are retained as structural nonzeros; to drop numerical zeros, use dropzeros!.
See also: sparse
Examples
julia> I = ( [1, 2, 3], [1, 2, 3], [1, 2, 3]);
julia> V = [1.0; 2.0; 3.0];
julia> fsparse(I, V) SparseCOO (0.0) [1:3×1:3×1:3] │ │ │ └─└─└─[1, 1, 1] [2, 2, 2] [3, 3, 3] 1.0 2.0 3.0
Finch.fsparse!
— Functionfsparse!(I..., V,[ M::Tuple])
Like fsparse
, but the coordinates must be sorted and unique, and memory is reused.
Finch.fsprand
— Functionfsprand([rng],[type], M..., p, [rfn])
Create a random sparse tensor of size m
in COO format. There are two cases: - If p
is floating point, the probability of any element being nonzero is independently given by p
(and hence the expected density of nonzeros is also p
). - If p
is an integer, exactly p
nonzeros are distributed uniformly at random throughout the tensor (and hence the density of nonzeros is exactly p / prod(M)
). Nonzero values are sampled from the distribution specified by rfn
and have the type type
. The uniform distribution is used in case rfn
is not specified. The optional rng
argument specifies a random number generator.
See also: (sprand
)(https://docs.julialang.org/en/v1/stdlib/SparseArrays/#SparseArrays.sprand)
Examples
julia> fsprand(Bool, 3, 3, 0.5)
SparseCOO (false) [1:3,1:3]
├─├─[1, 1]: true
├─├─[3, 1]: true
├─├─[2, 2]: true
├─├─[3, 2]: true
├─├─[3, 3]: true
julia> fsprand(Float64, 2, 2, 2, 0.5)
SparseCOO (0.0) [1:2,1:2,1:2]
├─├─├─[2, 2, 1]: 0.6478553157718558
├─├─├─[1, 1, 2]: 0.996665291437684
├─├─├─[2, 1, 2]: 0.7491940599574348
Finch.fspzeros
— Functionfspzeros([type], M...)
Create a random zero tensor of size M
, with elements of type type
. The tensor is in COO format.
See also: (spzeros
)(https://docs.julialang.org/en/v1/stdlib/SparseArrays/#SparseArrays.spzeros)
Examples
julia> fspzeros(Bool, 3, 3)
3×3-Tensor
└─ SparseCOO{2} (false) [:,1:3]
julia> fspzeros(Float64, 2, 2, 2)
2×2×2-Tensor
└─ SparseCOO{3} (0.0) [:,:,1:2]
Finch.ffindnz
— Functionffindnz(arr)
Return the nonzero elements of arr
, as Finch understands arr
. Returns (I..., V)
, where I
are the coordinate vectors, one for each mode of arr
, and V
is a vector of corresponding nonzero values, which can be passed to fsparse
.
See also: (findnz
)(https://docs.julialang.org/en/v1/stdlib/SparseArrays/#SparseArrays.findnz)
Fill Values
Finch tensors support an arbitrary "background" value for sparse arrays. While most arrays use 0
as the background value, this is not always the case. For example, a sparse array of Int
might use typemin(Int)
as the background value. The default
function returns the background value of a tensor. If you ever want to change the background value of an existing array, you can use the set_fill_value!
function. The countstored
function returns the number of stored elements in a tensor, and calling pattern!
on a tensor returns tensor which is true whereever the original tensor stores a value. Note that countstored doesn't always return the number of non-zero elements in a tensor, as it counts the number of stored elements, and stored elements may include the background value. You can call dropfills!
to remove explicitly stored background values from a tensor.
julia> A = fsparse([1, 1, 2, 3], [2, 4, 5, 6], [1.0, 2.0, 3.0])
3×6-Tensor
└─ SparseCOO{2} (0.0) [:,1:6]
├─ [1, 2]: 1.0
├─ [1, 4]: 2.0
└─ [2, 5]: 3.0
julia> min.(A, -1)
3×6-Tensor
└─ Dense [:,1:6]
├─ [:, 1]: Dense [1:3]
│ ├─ [1]: -1.0
│ ├─ [2]: -1.0
│ └─ [3]: -1.0
├─ [:, 2]: Dense [1:3]
│ ├─ [1]: -1.0
│ ├─ [2]: -1.0
│ └─ [3]: -1.0
├─ ⋮
├─ [:, 5]: Dense [1:3]
│ ├─ [1]: -1.0
│ ├─ [2]: -1.0
│ └─ [3]: -1.0
└─ [:, 6]: Dense [1:3]
├─ [1]: -1.0
├─ [2]: -1.0
└─ [3]: -1.0
julia> fill_value(A)
0.0
julia> B = set_fill_value!(A, -Inf)
3×6-Tensor
└─ SparseCOO{2} (-Inf) [:,1:6]
├─ [1, 2]: 1.0
├─ [1, 4]: 2.0
└─ [2, 5]: 3.0
julia> min.(B, -1)
3×6-Tensor
└─ SparseDict (-Inf) [:,1:6]
├─ [:, 2]: SparseDict (-Inf) [1:3]
│ └─ [1]: -1.0
├─ [:, 4]: SparseDict (-Inf) [1:3]
│ └─ [1]: -1.0
└─ [:, 5]: SparseDict (-Inf) [1:3]
└─ [2]: -1.0
julia> countstored(A)
3
julia> pattern!(A)
3×6-Tensor
└─ SparseCOO{2} (false) [:,1:6]
├─ [1, 2]: true
├─ [1, 4]: true
└─ [2, 5]: true
Finch.set_fill_value!
— Functionset_fill_value!(fbr, init)
Return a tensor which is equal to fbr
, but with the fill (implicit) value set to init
. May reuse memory and render the original tensor unusable when modified.
julia> A = Tensor(SparseList(Element(0.0), 10), [2.0, 0.0, 3.0, 0.0, 4.0, 0.0, 5.0, 0.0, 6.0, 0.0])
10-Tensor
└─ SparseList (0.0) [1:10]
├─ [1]: 2.0
├─ [3]: 3.0
├─ ⋮
├─ [7]: 5.0
└─ [9]: 6.0
julia> set_fill_value!(A, Inf)
10-Tensor
└─ SparseList (Inf) [1:10]
├─ [1]: 2.0
├─ [3]: 3.0
├─ ⋮
├─ [7]: 5.0
└─ [9]: 6.0
Finch.pattern!
— Functionpattern!(fbr)
Return the pattern of fbr
. That is, return a tensor which is true wherever fbr
is structurally unequal to its fill_value. May reuse memory and render the original tensor unusable when modified.
julia> A = Tensor(SparseList(Element(0.0), 10), [2.0, 0.0, 3.0, 0.0, 4.0, 0.0, 5.0, 0.0, 6.0, 0.0])
10-Tensor
└─ SparseList (0.0) [1:10]
├─ [1]: 2.0
├─ [3]: 3.0
├─ ⋮
├─ [7]: 5.0
└─ [9]: 6.0
julia> pattern!(A)
10-Tensor
└─ SparseList (false) [1:10]
├─ [1]: true
├─ [3]: true
├─ ⋮
├─ [7]: true
└─ [9]: true
Finch.countstored
— Functioncountstored(arr)
Return the number of stored elements in arr
. If there are explicitly stored fill elements, they are counted too.
See also: (SparseArrays.nnz
)(https://docs.julialang.org/en/v1/stdlib/SparseArrays/#SparseArrays.nnz) and (Base.summarysize
)(https://docs.julialang.org/en/v1/base/base/#Base.summarysize)
Finch.dropfills
— Functiondropfills(src)
Drop the fill values from src
and return a new tensor with the same shape and format.
Finch.dropfills!
— Functiondropfills!(dst, src)
Copy only the non-fill values from src
into dst
.
How to tell whether an entry is "fill"
In the sparse world, a semantic distinction is sometimes made between "explicitly stored" values and "implicit" or "fill" values (usually zero). However, the formats in the Finch compiler represent a diverse set of structures beyond sparsity, and it is often unclear whether any of the values in the tensor are "explicit" (consider a mask matrix, which can be represented with a constant number of bits). Thus, Finch makes no semantic distinction between values which are stored explicitly or not. If users wish to make this distinction, they should instead store a tensor of tuples of the form (value, is_fill)
. For example,
julia> A = fsparse([1, 1, 2, 3], [2, 4, 5, 6], [(1.0, false), (0.0, true), (3.0, false)]; fill_value=(0.0, true))
3×6-Tensor
└─ SparseCOO{2} ((0.0, true)) [:,1:6]
├─ [1, 2]: (1.0, false)
├─ [1, 4]: (0.0, true)
└─ [2, 5]: (3.0, false)
julia> B = Tensor(Dense(SparseList(Element((0.0, true)))), A)
3×6-Tensor
└─ Dense [:,1:6]
├─ [:, 1]: SparseList ((0.0, true)) [1:3]
├─ [:, 2]: SparseList ((0.0, true)) [1:3]
│ └─ [1]: (1.0, false)
├─ ⋮
├─ [:, 5]: SparseList ((0.0, true)) [1:3]
│ └─ [2]: (3.0, false)
└─ [:, 6]: SparseList ((0.0, true)) [1:3]
julia> sum(map(last, B))
16
julia> sum(map(first, B))
4.0