Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make precompile files relocatable #49866

Merged
merged 21 commits into from
Oct 20, 2023
Merged

Conversation

fatteneder
Copy link
Member

@fatteneder fatteneder commented May 17, 2023

Fixes #45215
Fixes #47943
Fixes #45541
Fixes #50918


Updated description

Hi.

I tried my luck with #47943 and I implemented what was suggested by OP.

String replacement with @depot when serializing out happens with any paths that are located inside a DEPOT_PATH (first match wins). If no match, then we emit the absolute file path as before. Right now we only emit one token @depot.

String replacement of @depot when loading happens now on a .ji file basis and only if all the listed include dependencies can be resolved to files located in one and the same depot on DEPOT_PATH (again, first match wins). If we can't resolve, then the cache is invalided with stale_cachefile.

This PR does not explicitly address any of #4630 or #33065 some suggested would be nice to fix in one swipe with #47943. It remains to check if this was accidentally solved.

I also did not understand what was meant in #47943 (comment) with the deps/build.jl and artifacts issue and how that should be addressed here.

Status of this PR

The following works for me locally,

  • build julia, run it and compile a few packages
$ cd <git-root-julia>
$ export JULIA_DEPOT_PATH="$(pwd)/usr/share/julia"
$ ./julia
julia> using Pkg; Pkg.add("UnPack")
julia> using UnPack
  • exit julia, copy the depot (<git-root>/usr/share/julia/) somewhere else, set JULIA_DEPOT_PATH accordingly, e.g.
$ cp usr/share/julia /tmp
$ export JULIA_DEPOT_PATH="/tmp/julia"
  • restart julia and verify the package can be loaded without compilation by running
$ ./julia
pkg = Base.identify_package("UnPack")
path = Base.locate_package(pkg)
iscached = @lock Base.require_lock begin
    m = Base._require_search_from_serialized(pkg, path, UInt128(0))
    m isa Module
end
iscached == true # can load from cache

Open questions

From docs:

The first entry is the "user depot" and should be writable by and owned by the
current user. The user depot is where: registries are cloned, new package versions
are installed, named environments are created and updated, package repos are cloned,
newly compiled package image files are saved, log files are written, development
packages are checked out by default, and global configuration data is saved. Later
entries in the depot path are treated as read-only and are appropriate for
registries, packages, etc. installed and managed by system administrators.
  • Does ... are treated as read-only ... guarantee that Julia will never write anything to a path other than DEPOT_PATH[1]?
  • [Need to test this myself] If we load packages from any of the read-only paths, their precompiles also end up in DEPOT_PATH[1]/compiled?

If both questions can be answered with yes, then I guess its save to replace @depot always with DEPOT_PATH[1] when loading. The path for @stdlib is replaced with Sys.STDLIB, because that seems to be tied to where the julia binary sits.

I think I could work around all of that by following more closely what was proposed in #47943 (comment):

If we have a package Foo.jl which has been precompiled, that cache should reference the source files as something like @depot/packages/Foo_jll/a2b2c3/src/Foo.jl. During cache header verification, we string-replace ^@depot to the same depot that the cache file is currently being loaded from. I do not think it's worthwhile to allow this path to resolve to other depots on the depot path; it's too easy to find the "wrong" files, and I think it's reasonable to expect an atomic move of all associated resources for a package image.

See top of this post on how it is implemented now.

TODO

  • Get basic functionality working.
  • Write basic test.
  • PkgEval -- first run looked good.
  • Fix depot relocation logic.
  • Write more tests.

I would appreciate any feedback!

@fatteneder
Copy link
Member Author

I guess I did miss something, because the first test is already failing with
LoadError("sysimg.jl", 20, ErrorException("failed to identify a depot path: '/cache/build/default-armageddon-0/julialang/julia-master/usr/share/julia/stdlib/v1.10/ArgTools/src/ArgTools.jl'"))

@fatteneder fatteneder force-pushed the fa/relocate-v2 branch 3 times, most recently from ab2fa3b to 25ae086 Compare May 18, 2023 07:23
@fatteneder fatteneder changed the title Make precompile files relocatable WIP: Make precompile files relocatable May 18, 2023
@fatteneder
Copy link
Member Author

This needs more work because it breaks this MWE

# from test/compiler/contextual.jl

load_path = mktempdir()
depot_path = mktempdir()

prev_DEPOT_PATH = deepcopy(DEPOT_PATH)
prev_LOAD_PATH = deepcopy(LOAD_PATH)

pushfirst!(LOAD_PATH, load_path)
pushfirst!(DEPOT_PATH, depot_path)

write(joinpath(load_path, "Foo.jl"),
    """
    module Foo
    end
    """)

try
    Foo = Base.require(Main, :Foo)
finally
    copy!(DEPOT_PATH, prev_DEPOT_PATH)
    copy!(LOAD_PATH, prev_LOAD_PATH)
end

@fatteneder fatteneder force-pushed the fa/relocate-v2 branch 4 times, most recently from be58fca to 0a5ab9e Compare May 19, 2023 21:52
@DilumAluthge
Copy link
Member

This is very exciting. Thanks for working on this!

@fatteneder
Copy link
Member Author

fatteneder commented May 20, 2023

Regarding builds & tests:

I see random (?) build failures (e.g. curl: Operation too slow) I think are unrelated to this PR, since builds passed on all platforms at least yesterday.

There are some tests failing too where I don't know what is causing that.

  • On Linux there are (random ?) LinearAlgebra tests failing, although they pass when run locally.
  • Mac and Windows tests fail with ArgumentError: Invalid checksum in cache file (on Mac it happens in test/threads.jl). I don't have a Mac to debug but can setup a VM to test on Windows.

@fatteneder fatteneder force-pushed the fa/relocate-v2 branch 2 times, most recently from d081c65 to cdd5ab9 Compare May 20, 2023 19:58
@vtjnash vtjnash requested a review from KristofferC May 20, 2023 20:01
@vtjnash
Copy link
Member

vtjnash commented May 20, 2023

Some CI tests / machines are flaky, and those seem like they may be some of them. I think this is simple enough that I say we just go for it? We might have other issues still (like stacktraces and Strings in the source code that accidentally capture @__DIR__) but we can address those later.

Maybe @KristofferC can help make sure this gets merged?

@fatteneder
Copy link
Member Author

fatteneder commented May 20, 2023

I hacked together a MWE for a unit test.
IIUC its not possible to unload modules, so I thought I just call a test twice with some env vars set:
Once to precompile something and verify that its there. Then copy the whole depot into the test folder. And run a second time with the JULIA_DEPOT_PATH adjusted to verify that we can load again and that no recompilation was needed.

But I think it is not yet doing the right thing ...

@fatteneder fatteneder force-pushed the fa/relocate-v2 branch 3 times, most recently from d0e93a0 to 88448ca Compare May 21, 2023 21:32
@fatteneder
Copy link
Member Author

Regarding test failures:

  • The Windows and Mac ones are not random, so there is still something wrong in the PR.
  • I tried to debug Windows with a mingw build, but there I could run selected tests like test/compiler and they passed. The CI stack traces did not give me a clue what to look for ...
  • The test I added fails on the local mingw because of problems with cp.

@fatteneder
Copy link
Member Author

I could identify and fix one more problem related to stale_cachefile:
Moving a depot causes the mtimes of package code to change and become out of sync with the times written in the .ji files.
To work around this I extended the .ji files to also contain the file size and hash next to the mtime of the sources they are pointing to as suggested here: #45541. The file size is added to have a fail-fast branch.

@IanButterworth
Copy link
Member

I’ve not reviewed this yet but I just wanted to say thanks. It would be great to have this.

@IanButterworth
Copy link
Member

IanButterworth commented Aug 2, 2023

Looks like this was hitting repeated precompilation on windows. This Pkg test was failing

https://github.com/JuliaLang/Pkg.jl/blob/master/test/api.jl#L339

If you want to investigate why the caches are being rejected you can set JULIA_DEBUG=loading

@vchuravy
Copy link
Member

Let's do one more PkgEval and then merge this!

@nanosoldier
Copy link
Collaborator

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

@vchuravy vchuravy merged commit f2df1b4 into JuliaLang:master Oct 20, 2023
7 checks passed
@IanButterworth
Copy link
Member

🎉🎉🎉

@DilumAluthge
Copy link
Member

This is really exciting. Thank you so much @fatteneder!

And thanks to @vchuravy, @staticfloat, @IanButterworth, @KristofferC, and anyone else that I missed that helped with reviewing this PR.

This PR crosses off two things (#45215 and #45541) off of my Julia wishlist.

@DilumAluthge DilumAluthge added packages Package management and loading feature Indicates new feature / enhancement requests labels Oct 20, 2023
@DilumAluthge
Copy link
Member

@fatteneder @vchuravy Could we add this to NEWS?

@fatteneder
Copy link
Member Author

Thanks everyone for the assistance!

@Keno
Copy link
Member

Keno commented Oct 25, 2023

This broke Revise, which is unfortunate. There's a PR open to fix it, but it'd have been nice to merge that first, since Revise is pretty essential for debugging for those of us that live on master.

fredrikekre added a commit to timholy/Revise.jl that referenced this pull request Nov 14, 2023
This patch restores the filtering of non-src files that was lost
in #767. After JuliaLang/julia#49866,
`Base.parse_cache_header` returns both the non-filtered and filtered
list instead of doing filtering internally based on the
`srcfiles_only::Bool` keyword argument.

There is a test for this failure mode that fails on Revise master branch
and passes with this patch (see
https://github.com/timholy/Revise.jl/blob/1059181bed06387e9fbcea137dce28a80c5c45d9/test/runtests.jl#L3070-L3080).
KristofferC pushed a commit to timholy/Revise.jl that referenced this pull request Nov 14, 2023
* Restore filtering of non-src files

This patch restores the filtering of non-src files that was lost
in #767. After JuliaLang/julia#49866,
`Base.parse_cache_header` returns both the non-filtered and filtered
list instead of doing filtering internally based on the
`srcfiles_only::Bool` keyword argument.

There is a test for this failure mode that fails on Revise master branch
and passes with this patch (see
https://github.com/timholy/Revise.jl/blob/1059181bed06387e9fbcea137dce28a80c5c45d9/test/runtests.jl#L3070-L3080).

* Set version to 3.5.9.
mkitti added a commit to mkitti/PkgCacheInspector.jl that referenced this pull request Nov 21, 2023
parse_cache_header now takes a second string argument after an IO

JuliaLang/julia#49866
staticfloat pushed a commit that referenced this pull request Jan 8, 2024
…erent depots (#52750)

Fixes #52161

#52161 is an awkward use of the relocation feature in the sense that it
attempts to load the `include()` and `include_dependency` files of a pkg
from two separate depots.
The problem there is that the value with which we replace the `@depot`
tag for both `include()` and `include_dependency()` files is determined
by trying to relocate only the `include()` files. We then end up not
finding the `include_dependency()` files.

Solution:
@staticfloat noted that the pkg slugs in depot paths like
`@depot/packages/Foo/1a2b3c/src/Foo.jl` are already enough to (weakly)
content-address a depot.
This means that we should be able to load any `include()` file of a pkg
from any `depot` that contains a precompile cache, provided the hashes
match.
The same logic can be extended to `include_dependency()` files, which
this PR does.

Note that we continue to use only one file from the `include()` files to
determine the depot which we use to relocate all `include()` files.
[this behavior is kept from master]
But for `include_dependency()` files we allow each file to resolve to a
different depot. This way the MWE given in #52161 should be extendable
to two README files being located in two different pkgs that lie in two
different depots.

---

Side note: #49866 started with explicitly verifying that all `include()`
files come from the same depot. In #52346 this was already relaxed to
pick the first depot for which any `include()` file can be resolved to.
This works, because if any other file might be missing from that depot
then this is caught in `stalecache()`.
KristofferC pushed a commit that referenced this pull request Feb 8, 2024
…51798)

Continuation of #49866.
Fixes #52462 

So far any `include_dependency` was tracked by `mtime`.
A package using `include_dependency` can't be relocated without
recompilation if the dependency also needs to be relocated.

With `include_dependency(path, track_content=true)` the tracking works
like for `include`,
i.e. recompilation is only triggered when the file's content changes.
In case `path` is a directory we use the string `join(readdir(path))` as
content.

---------

Co-authored-by: Valentin Churavy <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:precompilation Precompilation of modules feature Indicates new feature / enhancement requests modules packages Package management and loading
Projects
None yet