Commit aa2f81f

Merge pull request #16 from JuliaLinearAlgebra/teh/docs
Minor documentation improvements and workflow updates
2 parents da75cd4 + f8db00d commit aa2f81f

5 files changed: +44 -13 lines changed
.github/dependabot.yml

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+# https://docs.github.com/github/administering-a-repository/configuration-options-for-dependency-updates
+version: 2
+updates:
+  - package-ecosystem: "github-actions"
+    directory: "/" # Location of package manifests
+    schedule:
+      interval: "weekly"

.github/workflows/CI.yml

Lines changed: 3 additions & 3 deletions
@@ -13,19 +13,19 @@ jobs:
       fail-fast: false
       matrix:
         version:
-          - '1.7'
+          - 'min'
           - '1'
         os:
           - ubuntu-latest
         arch:
           - x64
     steps:
       - uses: actions/checkout@v2
-      - uses: julia-actions/setup-julia@v1
+      - uses: julia-actions/setup-julia@v2
         with:
           version: ${{ matrix.version }}
           arch: ${{ matrix.arch }}
-      - uses: actions/cache@v1
+      - uses: actions/cache@v4
         env:
           cache-name: cache-artifacts
         with:

Project.toml

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 name = "IncrementalSVD"
 uuid = "de227602-7e15-40a7-b166-bbaff82a52b8"
 authors = ["Tim Holy <[email protected]>"]
-version = "1.0.0"
+version = "1.0.1"
 
 [deps]
 LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"

README.md

Lines changed: 18 additions & 6 deletions
@@ -3,11 +3,13 @@
 IncrementalSVD provides incremental (updating) singular value decomposition.
 This allows you to update an existing SVD with new columns, and even implement
 online SVD with streaming data.
+For cheap approximations of the SVD on large data, it can be orders-of-magnitude
+more accurate than techniques involving random projection.
 
 ## All-at-once usage
 
-For reasons that will be described below, if you want a truncated SVD and your matrix is small enough to fit in memory,
-you're better off using [TSVD](https://github.com/JuliaLinearAlgebra/TSVD.jl). However, IncrementalSVD can do it too:
+If you want a truncated SVD and your matrix is small enough to fit in memory,
+you can use IncrementalSVD like this:
 
 ```julia
 julia> using IncrementalSVD, LinearAlgebra
@@ -21,8 +23,11 @@ julia> Vt = Diagonal(s) \ (U' * X);
 
 Note that `Vt` is *not* returned by `isvd`; for reasons described [below](#on-the-fly-v) we compute it afterwards.
 
-`isvd` uses incremental updating, which is lossy to an extent that depends on the distribution of singular values.
-For comparison:
+In typical cases, `isvd` returns a (good) *approximation* of the true SVD.
+This is in contrast with packages like
+[TSVD](https://github.com/JuliaLinearAlgebra/TSVD.jl) which return an exact
+(within numerical precision) answer.
+Let's compare the error of the rank-4 approximation computed by `isvd` with that computed by TSVD:
 
 ```julia
 julia> using TSVD
@@ -36,9 +41,10 @@ julia> norm(X - U2*Diagonal(s2)*V2')
 1.9177860422120783
 ```
 In this particular case, the rank-4 absolute error with TSVD is a few percent better than with IncrementalSVD.
-The error of incremental SVD comes from the fact that it works on chunks, and there is a truncation step after each chunk that discards information; see [Brand 2006](#references) Eq 5 for more insight.
+The error of incremental SVD comes from the fact that it works on chunks, and after each chunk any excess components are truncated, resulting in a loss of information.
+See [Brand 2006](#references) Eq 5 for more insight.
 
-However, the *real* use-case for IncrementalSVD is in computing incremental updates or handling cases where `X` is too large to fit in memory all at once, and for such applications it handily beats alternatives like random projection + power iteration (e.g., `rsvd` from [RandomizedLinAlg.jl](https://github.com/JuliaLinearAlgebra/RandomizedLinAlg.jl)).
+However, the *real* use-case for IncrementalSVD is in computing incremental updates or handling cases where `X` is too large to fit in memory all at once, and for such applications it handily beats alternatives like random projection + power iteration (e.g., `rsvd` from [RandomizedLinAlg.jl](https://github.com/JuliaLinearAlgebra/RandomizedLinAlg.jl)). See details below.
 
 ## Incremental updates
 
@@ -63,7 +69,11 @@ julia> s
 4.18050301615471
 3.662876466035874
 2.923979120208828
+```
+
+For comparison, the true answer is:
 
+```julia
 julia> F = svd(X);
 
 julia> F.S
@@ -75,6 +85,8 @@ julia> F.S
 1.7956053622541457
 ```
 
+The singular values computed by `update!` were accurate to 3-5 digits.
+
 `isvd` is just a thin wrapper over this basic iterative update.
 
 ## Reducing error
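
Note on the error comparison in the README hunk above: the rank-4 errors it reports are computed with `norm`, which for a matrix in Julia is the Frobenius norm. For that norm the Eckart-Young-Mirsky theorem gives a floor that no rank-$k$ approximation can beat, and an exact truncated SVD attains it. A short statement of that bound, where $\sigma_i$ are the singular values of `X` and $U_k, \Sigma_k, V_k$ (notation introduced here, not in the README) are the leading $k$ components of the exact SVD:

```latex
\min_{\operatorname{rank}(B) \le k} \|X - B\|_F
  \;=\; \|X - U_k \Sigma_k V_k^\top\|_F
  \;=\; \sqrt{\sum_{i > k} \sigma_i^2}
```

Under this reading, the TSVD figure in the README should sit at this floor up to numerical precision, and the `isvd` figure can only be equal or larger; the gap is the information lost to per-chunk truncation.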

src/IncrementalSVD.jl

Lines changed: 15 additions & 3 deletions
@@ -14,6 +14,19 @@ The public functions are:
 - [`IncrementalSVD.Cache`](@ref) (not exported)
 """ IncrementalSVD
 
+"""
+    U, s = isvd(X::AbstractMatrix{<:Real}, nc)
+
+Compute an incremental thin SVD of the matrix `X`, returning the left singular
+vectors `U` and the singular values `s`. The number of retained components
+is specified by `nc`.
+
+`V` may be obtained via `V = (X' * U) / Diagonal(s)`.
+
+`isvd` is just a wrapper around repeated calls to [`IncrementalSVD.update!`](@ref).
+In cases with streaming or large `X` that cannot fit into memory, you may prefer
+to use `IncrementalSVD.update!` directly with smaller chunks of `X`.
+"""
 function isvd(X::AbstractMatrix{<:Real}, nc)
     Base.require_one_based_indexing(X)
     T = float(eltype(X))
@@ -80,9 +93,8 @@ computation of the SVD. `U` and `s` are updated in-place as well as returned.
 You can reuse temporary storage by creating `cache` (see [`IncrementalSVD.Cache`](@ref)).
 
 There are two ways to initialize:
-- `U, s, V = zeros(T, m, r), zeros(T, r), zeros(T, n, r)`. This specifies
-the element type `T`, the number of rows `m`, the rank `r`, and the number
-of columns `n`. If you're computing `V`, this is the only option.
+- `U, s = zeros(T, m, r), zeros(T, r)`. This specifies the element type `T`, the
+number of rows `m` and the rank `r`.
 - `U, s = nothing, nothing`. This will use `size(U) = size(A)`, i.e.,
 the chunk size specifies the truncated rank.
 
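
Note on the new `isvd` docstring: the formula `V = (X' * U) / Diagonal(s)` is the transpose of the README's `Vt = Diagonal(s) \ (U' * X)`. A brief derivation sketch, assuming `U` has orthonormal columns and every retained singular value is nonzero (the symbol $\Sigma = \operatorname{diag}(s)$ is introduced here for illustration):

```latex
X = U \Sigma V^\top, \qquad U^\top U = I
\;\Longrightarrow\; U^\top X = \Sigma V^\top
\;\Longrightarrow\; V^\top = \Sigma^{-1} U^\top X
\;\Longrightarrow\; V = X^\top U \, \Sigma^{-1}
```

For an exact SVD these are identities. For the truncated, approximate `U` and `s` returned by `isvd`, the first equality holds only approximately, so the last expression is best read as the definition of the `V` that pairs with them.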
