pkg i uses pacman -Sy, which is fundamentally unsafe #137

Open
tstein opened this issue Oct 13, 2024 · 7 comments

@tstein

tstein commented Oct 13, 2024

what's the problem

pkg i PACKAGE, in a pacman install, calls pacman -Sy PACKAGE. It is a basic gotcha of Arch that you should never call pacman this way - they call this a "partial system upgrade" and the relevant Arch wiki page urges you to never do this, with big bold text and everything. I think the reasons we have pkg include saving keystrokes and providing a lower friction interface to package management for particularly new-to-linux users, both of which also point strongly away from unsafe defaults.

why's it a problem

IIUC, the reason Arch unambiguously says not to do this is that pacman simply does not check reverse dependencies when installing individual packages. Example:

  • you have installed package browser version 0, which depends on libtls version 0, providing libtls.so.0
  • after your last full upgrade with -Syu, repos pick up libtls version 1, a breaking change that only provides libtls.so.1
  • you, against advice, pacman -Sy libtls
    • pacman will update the repo (-y), see the upgrade for libtls, depsolve its forward dependencies (maybe it needs an update to libcoolneoninstructions)
    • pacman will not consider whether your current installation of browser version 0 will still have its dependencies met after installing libtls version 1, and will happily do the upgrade and leave browser broken
  • (example is a lib, but this is even more insidious and even more likely to bite someone with a package that provides binaries people will run explicitly but also files for other packages. imagemagick comes to mind.)

I think the idea here is that the consistency enforcement is in the repo metadata. pacman repos are supposed to always provide a set of packages whose dependencies are satisfiable within that set. -S libtls would have been safe, because all packages from the same sync are consistent; -Syu is always safe, because that actually considers every package and the repo-level consistency guarantees a single install "transaction"[0] where all dependencies are satisfied before and after it happens.

I am fairly certain that the apt stack does do reverse depsolving on every action and that no comparable problem exists - absent a bug or unrecovered interrupt, apt update && apt install PACKAGE always leaves a consistent system[1]. The operation of "update metadata and install one package" is safely expressible in apt and simply isn't in pacman, by design. I might be wrong there - if so, we should change both.

edit: Talked offline - apt has disappointed me here. termux/termux-app#4179 is this, for the apt stack.

[0] pacman upgrades don't appear to try for ACID. Maybe CID.
[1] Tiny soapbox: I suspect the absence of safety checks here is the dominant reason for pacman's famous speed. I guess it's just better

okay what do we do?

I see four credible options:

  1. Nothing. Despite the unconditional urging against this from pacman's authors, triggering this problem requires a combination of a specific flavor of upgraded package and a specific flavor of user install op, without doing a system upgrade in the middle. It's also potentially recoverable with a system upgrade.

  2. Change pkg i to use pacman -S. This preserves intent, but the version this selects may no longer be in the repos, and it produces a painful split from a support perspective: if you tell someone to pkg i something today and they report success, you can safely assume it's pretty recent. You'd need to ask if they're on apt or pacman to be able to do that, or you'd need to skip it and only tell people to pacman -Syu THINGISAIDTOINSTALL instead of using pkg at all, which undermines the value of pkg.

  3. Change pkg i to use pacman -Syu. This potentially adds a lot of additional work to what would otherwise be a small download and install, which may surprise some users and be an actual problem for others, but gets full marks for doing what the user asked and for doing it safely.

  4. Do both, with a safe default, a noisy preservation of the unsafe behavior for existing installs, and an opt-out:

     • choose some envvar like TERMUX_PKG_PACMAN_UNSAFE
     • if unset, print a warning with instructions, then -Sy
     • if set to a falsy value, -Syu
     • if set to a truthy value, -Sy with no warning
     • change pacman setup scripts/instructions to set TERMUX_PKG_PACMAN_UNSAFE=false in profile.d or something, and maybe explain that this choice was made and how to reverse it
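
A minimal sketch of how option 4's flag selection could look, assuming a hypothetical helper named pkg_pacman_install_flags. The accepted truthy spellings, the warning text, and the choice to treat any other set value (including empty) as falsy are assumptions, not anything pkg does today:

```shell
# Hypothetical helper: map TERMUX_PKG_PACMAN_UNSAFE to the pacman sync flags.
pkg_pacman_install_flags() {
    # Unset (as opposed to set-but-empty): keep the legacy -Sy, but warn on stderr.
    if [ -z "${TERMUX_PKG_PACMAN_UNSAFE+set}" ]; then
        echo "pkg: warning: 'pacman -Sy' can leave a partially upgraded system." >&2
        echo "pkg: set TERMUX_PKG_PACMAN_UNSAFE=false to use 'pacman -Syu' instead." >&2
        echo "-Sy"
        return 0
    fi
    case "$TERMUX_PKG_PACMAN_UNSAFE" in
        # Assumed truthy spellings: keep -Sy, no warning.
        1|true|yes|on) echo "-Sy" ;;
        # Anything else is treated as falsy here: take the safe path.
        *) echo "-Syu" ;;
    esac
}

# Possible use inside pkg (not executed here):
#   pacman "$(pkg_pacman_install_flags)" "$@"
```

Printing the flags instead of running pacman directly keeps the decision separate from the install, so the three cases can be checked without touching the system.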

Opening this issue for opinions. If we decide to take anything like one of those options, I'll send the PR. My vote is for option 4.

@tstein
Author

tstein commented Oct 13, 2024

cc @Maxython @TomJo2000

@TomJo2000
Member

TomJo2000 commented Oct 13, 2024

The problem description and step-through are clear and accurate, in my opinion. I'd just like to add a little additional context to a couple of points.

IIUC, the reason Arch unambiguously says not to do this is that pacman simply does not check reverse dependencies when installing individual packages.

pacman places responsibility for the consistency of its package database on the remote repository side,
as opposed to apt, which does additional consistency checks when running a package installation.
The sync database on the remote pacman repository is trusted to be in a consistent state,
and any installation from a consistent state must thus be consistent.

The core issue is that by using pacman -Sy the local package database is synced to the remote one,
but the local packages are not simultaneously updated to their remote state.
As a result, the local package state falls out of sync with the state for which the remote sync database is trusted to be consistent.
Installing packages based on the updated remote state on top of an out-of-sync local package state may therefore lead to inconsistencies, meaning one or more broken packages.

This is more than likely recoverable and can usually be resolved by performing a full system upgrade using pacman -Syu,
which resyncs the local package state to be consistent with the remote package database.

Tiny soapbox: I suspect the absence of safety checks here is the dominant reason for pacman's famous speed

I take issue with the implication that pacman simply does not perform consistency checks.
It does; they're just handled on the remote repository side, and the repository's package database is trusted to be consistent.

Okay what do we do?

As you mentioned, "Option 1", which is the status quo, is unlikely to cause unrecoverable breakage,
and requires rare circumstances to occur.
It would still be preferable to avoid the issue entirely.

"Option 3" would avoid the problem by always doing a full system upgrade (pacman -Syu) with every pkg i.
This has the potentially undesirable side effects of taking longer, consuming additional data, and diverging from the behavior exhibited by the code path for pkg i when using apt.

"Option 2" preserves parity with the apt behavior of pkg i in a safe way by attempting to install packages based on the consistent local package database state.
This has the downside of potentially failing if the state of the package (or one of its dependencies) in the outdated local package database is inconsistent with the state of the package (or one of its dependencies) on the remote repository.

I think "Option 4" presents a good compromise between preserving operational parity with apt and mitigating the possibility of a failure to install packages.
Something along the lines of pacman -S "$@" || pacman -Syu "$@" will install desired packages standalone when possible, and automatically fall back to doing a full system upgrade if necessary.
An env var to allow reverting back to the current, unsafe, behavior (pacman -Sy) also seems like a good cautionary measure to include.
As two wise men once said.

and

  • With a sufficient number of users of an API,
    it does not matter what you promise in the contract:
    all observable behaviors of your system
    will be depended on by somebody. - Hyrum's Law
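
The `pacman -S "$@" || pacman -Syu "$@"` fallback mentioned above could be sketched roughly as follows. The function name pkg_install_with_fallback and the PKG_PACMAN override are hypothetical and exist only so the logic can be exercised without a real pacman:

```shell
# Hypothetical wrapper: try the already-synced local database first,
# then fall back to a full system upgrade if that install fails.
pkg_install_with_fallback() {
    # PKG_PACMAN is an assumed override hook, defaulting to the real pacman.
    pacman_cmd="${PKG_PACMAN:-pacman}"
    if "$pacman_cmd" -S "$@"; then
        return 0
    fi
    echo "pkg: install from the local sync database failed, retrying with a full upgrade" >&2
    "$pacman_cmd" -Syu "$@"
}
```

Note that this sketch falls back on any failure of `pacman -S`, not only on a stale database, so a declined prompt or a network error would also trigger the full upgrade. A real implementation would probably want to distinguish those cases.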

@sylirre
Member

sylirre commented Oct 16, 2024

It follows the behavior used for apt: update the index, then install the given packages.

Yes, this is not safe for rolling release distributions and we sometimes get bug reports. Here is the most recent one: termux/termux-app#4179

I'd suggest changing the install methods to

pacman -Syu; pacman -S --needed "$@"
apt update; apt full-upgrade; apt install "$@"

Removal of index update discards the primary purpose of pkg.
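
As a rough sketch, the two sequences above could sit behind one function. The name pkg_upgrade_then_install and the PKG_RUN dry-run hook are hypothetical. This version also chains the steps with && rather than ;, which is a deliberate deviation so that a failed upgrade stops the install:

```shell
# Hypothetical wrapper: full upgrade first, then install the requested packages.
# Usage: pkg_upgrade_then_install pacman|apt PACKAGE...
pkg_upgrade_then_install() {
    manager="$1"
    shift
    # PKG_RUN is an assumed hook: set PKG_RUN=echo to print the commands
    # instead of running them. Left unquoted so an empty value disappears.
    run="${PKG_RUN:-}"
    case "$manager" in
        pacman)
            $run pacman -Syu && $run pacman -S --needed "$@"
            ;;
        apt)
            $run apt update && $run apt full-upgrade && $run apt install "$@"
            ;;
        *)
            echo "pkg: unknown package manager: $manager" >&2
            return 2
            ;;
    esac
}
```

With PKG_RUN=echo the function only prints what it would run, which is a cheap way to compare the two code paths side by side.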

@TomJo2000
Member

I'd suggest changing the install methods to

pacman -Syu; pacman -S --needed "$@"
apt update; apt full-upgrade; apt install "$@"

Removal of index update discards the primary purpose of pkg.

As outlined in "Option 3", this will increase the amount of data transferred, which may be undesirable for users.
It's the "correct" solution but not necessarily the best one for user needs or expectations.

@sylirre
Member

sylirre commented Oct 16, 2024

@TomJo2000 The possible choices are:

  1. Get rid of pkg and let the user decide how to use the package manager.
  2. Let pkg give users what they want but mask all tech tasks in the background (refresh index, ensure updates, switch mirror, etc). That's what pkg was created for in the first place.

If our choice (neither 1 nor 2 above) is to reduce pkg's purpose to just selecting mirrors, then wouldn't it be better to do mirror selection only on the first Termux run? Related: termux/termux-app#4096

@TomJo2000
Member

I don't think we have to be that fatalistic about it.
The main goal in this instance is trying to reduce random package breakages.

@Maxython
Member

I propose the following solution: have pkg check whether the dependencies and required packages of the package being installed/updated are up to date, and update them if necessary. I think this is the perfect middle ground between minimizing system breakage due to out-of-date packages, and not doing a global update of all packages. I don't know how to implement this with apt, but with pacman I have an idea of how this check could work.
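
The scheme itself is not spelled out in this thread, so the following is only one possible reading, covering forward dependencies alone. It assumes a hypothetical helper outdated_among that intersects two newline-separated package lists, and it assumes pactree (from pacman-contrib) is available to produce the dependency list:

```shell
# Hypothetical helper: print the package names present in both
# newline-separated lists ($1 = dependency closure, $2 = outdated packages).
outdated_among() {
    deps_file=$(mktemp) || return 1
    outdated_file=$(mktemp) || { rm -f "$deps_file"; return 1; }
    printf '%s\n' "$1" | sort -u > "$deps_file"
    printf '%s\n' "$2" | sort -u > "$outdated_file"
    # comm -12 keeps only lines common to both sorted files; sed drops blanks.
    comm -12 "$deps_file" "$outdated_file" | sed '/^$/d'
    rm -f "$deps_file" "$outdated_file"
}

# Possible use (not executed here; assumes the sync database was refreshed):
#   deps=$(pactree -slu "$pkg")     # dependency closure from the sync repos
#   outdated=$(pacman -Quq)         # installed packages with pending upgrades
#   pacman -S --needed "$pkg" $(outdated_among "$deps" "$outdated")
```

Upgrading only those dependencies can still break other installed packages that rely on the old versions, so a check like this would narrow the partial-upgrade risk rather than remove it.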
