slim down readme and improve docs page #621

Open
wants to merge 3 commits into
base: dev
218 changes: 12 additions & 206 deletions README.md
@@ -7,209 +7,15 @@
[![Aqua QA](https://raw.githubusercontent.com/JuliaTesting/Aqua.jl/master/badge.svg)](https://github.com/JuliaTesting/Aqua.jl)


JLD2 saves and loads Julia data structures in a format comprising a subset of HDF5, without any dependency on the HDF5 C library.
JLD2 is able to read most HDF5 files created by other HDF5 implementations supporting HDF5 File Format Specification Version 3.0 (i.e. libhdf5 1.10 or later) and similarly those should be able to read the files that JLD2 produces. JLD2 provides read-only support for files created with the JLD package.

## Reading and writing data
### `save` and `load` functions

The `save` and `load` functions, provided by [FileIO](https://github.com/JuliaIO/FileIO.jl), provide a mechanism to read and write data from a JLD2 file. To use these functions, you may either write `using FileIO` or `using JLD2`. FileIO will determine the correct package automatically.

The `save` function accepts an `AbstractDict` yielding the key/value pairs, where the key is a string representing the name of the dataset and the value represents its contents:

```julia
using FileIO
save("example.jld2", Dict("hello" => "world", "foo" => :bar))
```

The `save` function can also accept the dataset names and contents as arguments:

```julia
save("example.jld2", "hello", "world", "foo", :bar)
```

When using the `save` function, the file extension must be `.jld2`, since the extension `.jld` currently belongs to the previous JLD package.

If called with a filename argument only, the `load` function loads all datasets from the given file into a Dict:

```julia
load("example.jld2") # -> Dict{String,Any}("hello" => "world", "foo" => :bar)
```

If called with a single dataset name, `load` returns the contents of that dataset from the file:

```julia
load("example.jld2", "hello") # -> "world"
```

If called with multiple dataset names, `load` returns the contents of the given datasets as a tuple:

```julia
load("example.jld2", "hello", "foo") # -> ("world", :bar)
```

### A new interface: jldsave

`jldsave` makes use of Julia's keyword argument syntax to store files,
thus leveraging the parser and not having to rely on macros. The new interface can be imported with `using JLD2`. To use it, write

```julia
using JLD2

x = 1
y = 2
z = 42

# The simplest case:
jldsave("example.jld2"; x, y, z)
# it is equivalent to
jldsave("example.jld2"; x=x, y=y, z=z)

# You can assign new names selectively
jldsave("example.jld2"; x, a=y, z)

# and if you want to confuse your future self and everyone else, do
jldsave("example.jld2"; z=x, x=y, y=z)
```

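In the above examples, the `;` after the filename is important: it separates the positional arguments from the keyword arguments that name the datasets. Compression and non-default IO types may be set via positional arguments. With the file interface, compression is instead requested through the `compress` keyword of `jldopen`:
```julia
jldopen("example.jld2", "w"; compress = true) do f
    f["large_array"] = zeros(10000)
end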
```

### File interface

It is also possible to interact with JLD2 files using a file-like interface. The `jldopen` function accepts a file name and an argument specifying how the file should be opened:

```julia
using JLD2

f = jldopen("example.jld2", "r") # open read-only (default)
f = jldopen("example.jld2", "r+") # open read/write, failing if no file exists
f = jldopen("example.jld2", "w") # open read/write, overwriting existing file
f = jldopen("example.jld2", "a+") # open read/write, preserving contents of existing file or creating a new file
```

Data can be written to the file using `write(f, "name", data)` or `f["name"] = data`, or read from the file using `read(f, "name")` or `f["name"]`. When you are done with the file, remember to call `close(f)`.

Like `open`, `jldopen` also accepts a function as the first argument, permitting `do`-block syntax:

```julia
jldopen("example.jld2", "w") do file
    file["bigdata"] = randn(5)
end
```

### Groups

It is possible to construct groups within a JLD2 file, which may or may not be useful for organizing your data. You can create groups explicitly:

```julia
jldopen("example.jld2", "w") do file
    mygroup = JLD2.Group(file, "mygroup")
    mygroup["mystuff"] = 42
end
```

or implicitly, by saving a variable with a name containing slashes as path delimiters:

```julia
jldopen("example.jld2", "w") do file
    file["mygroup/mystuff"] = 42
end
# or save("example.jld2", "mygroup/mystuff", 42)
```

Both of these examples yield the same group structure, which you can see at the REPL:

```
julia> file = jldopen("example.jld2", "r")
JLDFile /Users/simon/example.jld2 (read-only)
└─📂 mygroup
   └─🔢 mystuff
```

Similarly, you can access groups directly:

```julia
jldopen("example.jld2", "r") do file
    @assert file["mygroup"]["mystuff"] == 42
end
```

or using slashes as path delimiters:

```julia
@assert load("example.jld2", "mygroup/mystuff") == 42
```

When loading files with nested groups, these are unrolled into paths by default.
Pass the `nested` keyword argument to get nested dictionaries instead.
```julia
load("example.jld2") # -> Dict("mygroup/mystuff" => 42)
load("example.jld2"; nested=true) # -> Dict("mygroup" => Dict("mystuff" => 42))
```

### Custom Serialization

To enable custom serialization for your type `A`, define a new type, e.g. `ASerialization`,
that contains the fields you want to store, and define
`JLD2.writeas(::Type{A}) = ASerialization`.
Internally, JLD2 calls `Base.convert` when writing and loading, so you need to extend it for your type.

```julia
struct A
    x::Int
end

struct ASerialization
    x::Vector{Int}
end

JLD2.writeas(::Type{A}) = ASerialization
Base.convert(::Type{ASerialization}, a::A) = ASerialization([a.x])
Base.convert(::Type{A}, a::ASerialization) = A(only(a.x))
```

If you do not want to overload `Base.convert` then you can also define

```julia
JLD2.wconvert(::Type{ASerialization}, a::A) = ASerialization([a.x])
JLD2.rconvert(::Type{A}, a::ASerialization) = A(only(a.x))
```

instead. This may be particularly relevant when types are involved that are not your own.

```julia
struct B
    x::Float64
end

JLD2.writeas(::Type{B}) = Float64
JLD2.wconvert(::Type{Float64}, b::B) = b.x
JLD2.rconvert(::Type{B}, x::Float64) = B(x)

arr = [B(rand()) for i=1:10]

jldsave("test.jld2"; arr)
```

In this example JLD2 converts the array of `B` structs to a plain `Vector{Float64}` prior to
storing to disk.

### UnPack.jl API

When additionally loading the [UnPack.jl](https://github.com/mauro3/UnPack.jl) package, its `@unpack` and `@pack!` macros can be used to quickly save and load data from the file-like interface. Example:

```julia
using UnPack
file = jldopen("example.jld2", "w")
x, y = rand(2)

@pack! file = x, y # equivalent to file["x"] = x; file["y"] = y
@unpack x, y = file # equivalent to x = file["x"]; y = file["y"]
```

The group `file_group = Group(file, "mygroup")` can be accessed with the same file-like interface as the "full" struct.
JLD2 is a package for the [Julia programming language](https://julialang.org/) for saving and loading data.
Highlights include:

- Simple API for basic usage: `jldsave(filename; data)` and `load(filename, "data")`
- JLD2 can serialize complex nested structures out of the box.
- JLD2 files adhere to the HDF5 format specification, making them compatible with HDF5 tooling
and H5 libraries in other languages. (Can also read HDF5 files.)
- It is fast. JLD2 uses the Julia compiler to generate efficient code for serializing complex structures.
- Users may provide custom serialization procedures to control how data gets stored.
- JLD2 provides *upgrade* mechanisms for data structures that need post-processing on load (for example when the Julia types have changed)

For details on usage see the [documentation](https://JuliaIO.github.io/JLD2.jl/dev).
1 change: 1 addition & 0 deletions docs/Project.toml
@@ -1,5 +1,6 @@
[deps]
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
UnPack = "3a884ed6-31ef-47d7-9d2a-63182c4928ed"

[compat]
Documenter = "1"
136 changes: 136 additions & 0 deletions docs/src/basic_usage.md
@@ -0,0 +1,136 @@
## Reading and writing data
JLD2 provides a few different options to save and load data:

- [FileIO interface](@ref)
- [Single object storage](@ref)
- [File handles](@ref)
- [UnPack Extension](@ref)

### FileIO interface
The `save` and `load` functions, provided by [FileIO](https://github.com/JuliaIO/FileIO.jl), are one of the simplest ways to use `JLD2`.
The `save` function accepts an `AbstractDict` yielding the key/value pairs, where the key is a string representing the name of the dataset and the value represents its contents:

```@repl
save("example.jld2", Dict("hello" => "world", "foo" => :bar))
```

The `save` function can also accept the dataset names and contents as arguments:

```@repl
save("example.jld2", "hello", "world", "foo", :bar)
```

For `save` and `load` to detect the JLD2 format automatically, use the file suffix `".jld2"`.

If called with a filename argument only, the `load` function loads all datasets from the given file into a `Dict`:

```@repl
load("example.jld2")
```

When called with a single dataset name, `load` returns the contents of that dataset from the file:

```@repl
load("example.jld2", "hello")
```

When called with multiple dataset names, `load` returns the contents of the given datasets as a tuple:

```@repl
load("example.jld2", "foo", "hello")
```

### jldsave

`jldsave` makes use of Julia's keyword argument syntax to store files.
This is useful when your variables already have the names you want in the file, e.g. use
`jldsave(file; variablename)` instead of `save(file, "variablename", variablename)`.

```@docs
jldsave
```
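
The keyword shorthand described above can be sketched as follows. The variable names, the `label` dataset, and the temporary path are illustrative assumptions, not part of JLD2:

```julia
using JLD2

temperature = 21.5
counts = [1, 2, 3]

# Hypothetical location; a temporary directory keeps the sketch self-contained.
path = joinpath(mktempdir(), "example.jld2")

# `temperature` and `counts` are stored under their variable names;
# `label` is given an explicit dataset name.
jldsave(path; temperature, counts, label = "run-1")

# Loading without dataset names returns a Dict keyed by dataset name.
data = load(path)
```

Each keyword becomes one dataset, so `data["counts"]` holds the stored vector.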

### Single object storage
If only a single object needs to be stored in and loaded from a file, one can use the
`save_object` and `load_object` functions.

```@docs
save_object
load_object
```
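
A minimal round trip with these two functions might look like this. The `settings` value and the temporary path are assumptions chosen for illustration:

```julia
using JLD2

# Hypothetical location; a temporary directory keeps the sketch self-contained.
path = joinpath(mktempdir(), "object.jld2")

# Any single serializable value works; a NamedTuple is used here as an example.
settings = (tolerance = 1e-6, maxiter = 100)

save_object(path, settings)    # write exactly one object to the file
restored = load_object(path)   # read that object back
```

No dataset name is needed on either side, which is the main convenience over `save` and `load`.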

### File handles

It is also possible to interact with JLD2 files using a file-like interface. The `jldopen` function accepts a file name and an argument specifying how the file should be opened:

```@docs
jldopen
```

Data can be written to the file using `write(f, "name", data)` or `f["name"] = data`, or read from the file using `read(f, "name")` or `f["name"]`. When you are done with the file, remember to call `close(f)`.
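
When managing the handle manually, one way to make sure `close(f)` always runs is a `try`/`finally` block. This is only a sketch; the dataset names and the temporary path are illustrative assumptions:

```julia
using JLD2

# Hypothetical location; a temporary directory keeps the sketch self-contained.
path = joinpath(mktempdir(), "handles.jld2")

f = jldopen(path, "w")
try
    write(f, "a", 1.0)      # function-style write
    f["b"] = [1, 2, 3]      # index-style write
finally
    close(f)                # runs even if a write throws
end

f = jldopen(path, "r")
a, b = try
    (read(f, "a"), f["b"])  # function-style and index-style read
finally
    close(f)
end
```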

Like `open`, `jldopen` also accepts a function as the first argument, permitting `do`-block syntax:

```@example
jldopen("example.jld2", "w") do f
    write(f, "variant1", 1.0)
    f["variant2"] = (rand(5), rand(Bool, 3))
end

f = jldopen("example.jld2")
v1 = read(f, "variant1")
v2 = f["variant2"]
close(f)
v1, v2
```

#### Groups
JLD2 files allow for nesting datasets into groups which may be useful for organizing your data.
You may construct groups explicitly:
```@example
jldopen("example.jld2", "w") do file
    mygroup = JLD2.Group(file, "mygroup")
    mygroup["mystuff"] = 42
    display(file)
end
```

or implicitly, by saving a variable with a name containing slashes as path delimiters:
```@example
save("example.jld2", "mygroup/mystuff", 42)
```

Similarly, you can access groups directly:

```@example
jldopen("example.jld2") do file
    file["mygroup"]["mystuff"]
end
```

or using slashes as path delimiters:

```@example
load("example.jld2", "mygroup/mystuff")
```

When loading files with nested groups, these are unrolled into paths by default.
Pass the `nested` keyword argument to get nested dictionaries instead.
```@repl
load("example.jld2")
load("example.jld2"; nested=true)
```

### UnPack Extension

When additionally loading the [UnPack.jl](https://github.com/mauro3/UnPack.jl) package, its `@unpack` and `@pack!` macros can be used to quickly save and load data from the file-like interface. Example:

```@example
using UnPack
file = jldopen("example.jld2", "w")
x, y = rand(2)

@pack! file = x, y # equivalent to file["x"] = x; file["y"] = y
@unpack x, y = file # equivalent to x = file["x"]; y = file["y"]
close(file)
```