pkgs is a mess #107539
Comments
This issue has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/nixpkgs-has-been-the-largest-repository-for-months/10667/1 |
I'd like to get rid of `pkgs.AAAAAASomeThingsFailToEvaluate`. Everything under `pkgs` should at least eval so far to be type-checkable. Completely broken or highly experimental stuff belongs elsewhere IMO.
This requires a ton of clarification. You will still not be able to nix-instantiate Nixpkgs in the current sense, because on any given platform _some_ packages fail to evaluate if only because they are for a different platform. That's before unfree and «insecure but fine for PDFs you have built with LaTeX» packages.
|
I think I might be missing something, but if everything that's not a package is going to be overlaid back into pkgs, what's the point of adding a layer of indirection? I would still want to access non-packages when importing nixpkgs directly. For example:
{ pkgs ? import <nixpkgs> {} }:
pkgs.lib
EDIT: Would the set that only contains packages be available through |
They fail in a well behaved manner that even print a helpful message telling me why it can't eval. I can also still type check it (e.g. Contrast this to things I can't even begin to touch with This was also more of a "stretch goal", something I'd like to see on top of the other proposed changes. Should've worded that clearer.
That there is a trivial way to make an attrset of just packages without all the other stuff. Currently, there is no way to know whether something under pkgs is a package, subset, function or can even be evaluated. Instead of only having a messy set, I'd like a bunch of clean sets with clear purposes and boundaries between them. For compatibility and convenience sake, I'd then like to re-create the messy set by combining the clean ones. What we gain by that is having these clean sets and using them to organise our huge amount of packages moving forward so that we don't go back to having such a mess.
Good point. Things like lib or other entry points to non-package sets should remain. They should be declared outside the "real" pkgs set though of course and only "overlaid" for convenience.
I don't think that'd work as pkgs is nixpkgs itself I think and you need nixpkgs to have a Have a look at |
> You will still not be able to nix-instantiate Nixpkgs in the current sense, because on any given platform _some_ packages fail to evaluate if only because they are for a different platform.
They fail in a well behaved manner that even print a helpful message telling me why it can't eval.
Note that the attribute is added specifically because the helpful message is super unhelpful when you evaluate the entire nixpkgs because of messing up arguments and have no idea why some package would even be evaluated. The intentional failure actually mentions the idea of entire Nixpkgs being evaluated, so it is useful.
What we gain by that is having these clean sets and using them to organise our huge amount of packages moving forward so that we don't go back to having such a mess.
That sounds pretty circular…
|
I support the idea of having a cleaner structured interface like that. |
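We could add an `AAAAfullEvaluationWarning = builtins.trace "Warning: You're evaluating the full Nixpkgs! This can take a lot of resources and time!." { };` to "catch" cases like that but I'd rather come to edge cases like these later and first lay out a plan on how we should even begin to improve things.
A goal I'd like to reach in the process of that, because it's somewhat related, is to make nixpkgs fully evaluable with only a few well-defined exceptions that you simply exclude, but that can come later. Should've been clearer on that.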
Very good point! I only learned about such things because I had some closer looks at the pkgs set while adding packages or working on packages.json but I can't imagine non-Nixpkgs contributors ever doing that. |
We could add an `AAAAfullEvaluationWarning = builtins.trace "Warning: You're evaluating the full Nixpkgs! This can take a lot of resources and time!." { };` to "catch" cases like that but I'd rather come to edge cases like these later and first lay out a plan on how we should even begin to improve things.
Printing a message that is only to be shown when you are doing a thing that will eventually fail (unfree/wrong platform will not go away) and not aborting a resource intensive process that is known to fail anyway sounds like a weird choice.
A goal I'd like to reach in the process of that because its somewhat related is to make nixpkgs fully evaluable with only a few well-defined exceptions that you *simply* exclude but that can come later. Should've been clearer on that.
A simple well-defined exception of darwin?
Is what you want to achieve making sure that «incorrect arguments» asserts/throws are never triggered by the provided attributes, and the special case like horrible licensing of SDKs behave closer to broken/unfree?
Very good point! I only learned about such things because I had some closer looks at the pkgs set while adding packages or working on packages.json but I can't imagine non-Nixpkgs contributors ever doing that.
Note that a ton of such wrappers need to be maintained together with some package, and also note that some wrappers are just packages taking the unwrapped thing as an input (of course there are overrides if necessary).
So is your first step to find what should be in pkgs-lib?
|
The problem isn't that it's doing that, it's how. It's a dumb abort as soon as you try to touch it. Packages that aren't available for your platform, are unfree, broken or have some other well-defined defect don't behave like that.
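I think you're misunderstanding me, I don't want everything under pkgs to be a derivation that can be instantiated or even built, I want to get pkgs into a state where I can work with its attributes without `abort`ing unexpectedly (or worse). Quote:
> Everything under pkgs should at least eval so far to be type-checkable.
Something simple like `filterAttrs (n: v: !v.meta.broken) pkgs` should be possible in a well-behaved set of pkgs IMO.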
Why would that be an issue? Wrappers/generators go to their respective sets and the packages set can then freely use them to make actual packages.
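I think a good first step would be to define what categories of user-visible things there are in nixpkgs. What I can think of on the top of my head:
* constants: just values (e.g. lib.trivial.release)
* pkgs: constants that can be instantiated to a single drv when not defect and have some metadata (pname, version, meta)
* functions: lambdas with a name
* generators: functions that eval to a pkg (or set of pkgs?) given some arguments (e.g. pkgs.runCommand, fetchers)
* wrappers: generators that take a pkg and return a new pkg that extends the old one
* sets of all the above
What else is there? After we've done that we think about how to integrate them into nixpkgs data structure, move around all the things that are in the wrong place and then throw them together to get back a superset of the current pkgs. |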
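For reference, a minimal sketch of why the kind of failure matters when working with a set's attributes: `builtins.tryEval` recovers from errors raised by `throw` and by a failed `assert`, but not from `abort`. The attribute names below are hypothetical and not taken from Nixpkgs.

```nix
let
  # Hypothetical attributes, for illustration only.
  viaThrow = throw "not available on this platform";
  viaAssert = assert false; "unreachable";
  # viaAbort = abort "stop"; # tryEval cannot recover from this one; evaluation ends
in
{
  # Both results are false: the failure was caught and can be handled by the caller.
  throwSucceeded = (builtins.tryEval viaThrow).success;
  assertSucceeded = (builtins.tryEval viaAssert).success;
}
```

Uncommenting `viaAbort` and passing it to `builtins.tryEval` would end the whole evaluation instead of yielding `success = false`.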
> when you are doing a thing that will eventually fail (unfree/wrong platform will not go away)
The problem isn't *that* it's doing that, it's *how*. It's a dumb abort as soon as you try to touch it.
Packages that aren't available for your platform, are unfree, broken or have some other well-defined defect don't behave like that.
We still have some packages that behave like that (at least in generated package sets where for some packages some combination doesn't make sense).
I think as long as we have those it is fine for the fail-early attribute to do the immediate abort; if it ends up the only evaluation error in a typical Hydra evaluation, I agree it should be converted to fail like a broken package (maybe combining all the possible types of failure).
> A simple well-defined exception of darwin?
I think you're misunderstanding me, I don't want everything under pkgs to be a derivation that can be instantiated or even built, I want to get pkgs into a state where I can work with its attributes without `abort`ing unexpectedly (or worse).
Quote:
>> Everything under pkgs should at least eval so far to be type-checkable.
Something simple like `filterAttrs (n: v: !v.meta.broken) pkgs` should be possible in a well-behaved set of pkgs IMO.
Arguably, filtering for being a function and filtering or recursion for being a sub-package-set would be also allowed…
> Note that a ton of such wrappers need to be maintained together with some package
Why would that be an issue?
Well, making change in two places is easier than in three and all that.
What I can think of on the top of my head:
* constants: just values (e.g. lib.trivial.release)
* pkgs: constants that can be instantiated to a single drv when not defect and have some metadata (pname, version, meta)
* functions: lambdas with a name
* generators: functions that eval to a pkg (or set of pkgs?) given some arguments (e.g. pkgs.runCommand, fetchers)
* wrappers: generators that take a pkg and return a new pkg that extends the old one
* sets of all the above
Here functions overlap with generators and wrappers overlap half with generators and half with pkgs, right?
…What else is there?
After we've done that we think about how to integrate them into nixpkgs data structure, move around all the things that are in the wrong place and then throw them together to get back a superset of the current pkgs.
|
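A minimal sketch of the kind of filtering discussed above, assuming a small hypothetical stand-in for `pkgs` (none of the attribute names are real Nixpkgs attributes). `filterAttrs` is re-implemented with builtins only so that the example is self-contained; it is meant to mirror `lib.filterAttrs`.

```nix
let
  # Hypothetical stand-in for a slice of pkgs.
  fakePkgs = {
    good = { type = "derivation"; name = "good-1.0"; meta.broken = false; };
    bad = { type = "derivation"; name = "bad-0.1"; meta.broken = true; };
    helper = x: x; # a function, not a package
    failsEarly = throw "this attribute refuses to evaluate";
  };

  # Keep only the attributes for which `pred name value` holds.
  filterAttrs = pred: set:
    builtins.listToAttrs (builtins.concatMap
      (name:
        let v = set.${name}; in
        if pred name v then [ { inherit name; value = v; } ] else [ ])
      (builtins.attrNames set));

  # A value counts as a working package if it evaluates without throwing,
  # looks like a derivation and is not marked broken.
  isWorkingPackage = _name: value:
    let r = builtins.tryEval value; in
    r.success
    && builtins.isAttrs r.value
    && (r.value.type or null) == "derivation"
    && !(r.value.meta.broken or false);
in
builtins.attrNames (filterAttrs isWorkingPackage fakePkgs)
# evaluates to [ "good" ]
```

The plain `filterAttrs (n: v: !v.meta.broken) pkgs` form would fail on `helper` (no `meta`) and on `failsEarly`, which is why the predicate here has to check the type and wrap the access in `tryEval`.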
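That is precisely what I'd like to see changed in that "stretch goal". Quote:
> Completely broken or highly experimental stuff belongs elsewhere IMO.
If those packages didn't exist or were confined to well-defined places, we wouldn't need `AAAAAASomeThingsFailToEvaluate` in the main `pkgs` set.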
Assuming we can type-check all attrs, yes, that'd work. It wouldn't be particularly tidy and more complicated than it needs to be though and I'm not a fan of that.
How would you find out whether something is a set of packages? Typecheck recursively? Also, some sets in pkgs contain packages, wrappers, generators and even functions at the same time. What kind of set are those?
The wave function collapsing inside my flash drive also overlaps with "it stores cat pictures" but that doesn't make it a helpful description. |
> We still have some packages that behave like that
> I think as long as we have those it is fine for the fail-early attribute to do the immediate abort;
That is precisely what I'd like to see changed in that "stretch goal". Quote:
>> Completely broken or highly experimental stuff belongs elsewhere IMO.
If those packages didn't exist or were confined to well-defined places, we wouldn't need `AAAAAASomeThingsFailToEvaluate` in the main `pkgs` set.
Nope, the _packages_ are neither completely broken nor experimental. They work just fine when imported into a different package subset. Their inclusion in the wrong subset is kind of broken, but well, it is the simplest way to manage language ecosystems for multiple language versions.
If you think precondition violation should be done via meta.broken and not via assertions, there is another old issue (without much consensus or traction).
> filtering for being a function
Assuming we can type-check all attrs, yes, that'd *work*.
It wouldn't be particularly tidy and more complicated than it needs to be though and I'm not a fan of that.
The problem is: even the part of your proposal that is clearly a useful uniformisation, and would need only localised changes, has somehow failed to gain traction previously. For the large overhauls the only benefit described is that it would look subjectively cleaner. Personally, I am weakly in favour of reviving the assert-avoidance proposal, and weakly in favour of adding metadata for subsets, but opposed to splitting in Nixpkgs what can be just filtered by an external function if the cleanups succeed.
And if even the uniformisation/metadata proposals fail, then the split is not enough anyway.
> filtering or recursion for being a sub-package-set
How would you find out whether something is a set of packages? Typecheck recursively?
That'd fail at `pkgs.broken.AAAAAASomeThingsFailToEvaluate`.
Also, some sets in pkgs contain packages, wrappers, generators and even functions at the same time. What kind of set are those?
> Here functions overlap with generators and wrappers overlap half with generators and half with pkgs, right?
The wave function collapsing inside my flash drive also overlaps with "it stores cat pictures" but that doesn't make it a helpful description.
I think a classification should include some discussion of overlaps of the classes listed on equal footing. If it does not, well, is there any evidence it is a useful classification?
|
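One possible shape for the "metadata for subsets" idea, sketched on a hypothetical tree: descend only into sets that carry an explicit marker. The marker used here is `recurseForDerivations`, which is what `lib.recurseIntoAttrs` sets in Nixpkgs; everything else in the example is made up for illustration.

```nix
let
  # Hypothetical tree; only `somePackages` is marked as a package subset.
  tree = {
    hello = { type = "derivation"; name = "hello"; };
    somePackages = {
      recurseForDerivations = true;
      foo = { type = "derivation"; name = "foo"; };
    };
    unmarked = { bar = { type = "derivation"; name = "bar"; }; };
    lib = { id = x: x; };
  };

  isDrv = v: builtins.isAttrs v && (v.type or null) == "derivation";

  # Collect dotted attribute paths of derivations, descending only into marked sets.
  collect = prefix: set:
    builtins.concatMap
      (name:
        let
          v = set.${name};
          path = prefix ++ [ name ];
        in
        if isDrv v then [ (builtins.concatStringsSep "." path) ]
        else if builtins.isAttrs v && (v.recurseForDerivations or false) then collect path v
        else [ ])
      (builtins.attrNames set);
in
collect [ ] tree
# evaluates to [ "hello" "somePackages.foo" ]
```

The sketch does not guard against attributes that throw; a real traversal would need something like `builtins.tryEval` around each value, and would still stop on an `abort`.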
Sounds good to me, minus the subsets attribute. Annoying bit is to keep attributes in the package set and subsets set aligned. As an alternative, it is possible to just group the sets together at for example the bottom of the file. Given the size of the file, I don't mind having it separate though. I can see that in the future we want to have this in a separate set anyway because of the splicing of sub package sets (won't go into details here).
Pure functions, yes. Functions that build derivations, no. Regarding grouping of items within subsets, that maybe should be part of NixOS/rfcs#83. |
We should also have a |
Related: #7866, #8801, #39169 (comment) (somewhat), #39561 |
Most of these are generated package sets pulled from some external repository. They should all be migrated to separate repositories, perhaps in `nix-community`, and flake-ified. CI can then automate updates without requiring manual review. Obviously this would need to wait until flakes are in Nix stable. |
C.f. the emacs packagesets |
> Package subsets (i.e. pythonPackages, haskellPackages)
Most of these are generated package sets pulled from some external repository. They should all be migrated to separate repositories,
Separated from the corresponding language implementations, to maximise the fun of coordinating CI.
perhaps in `nix-community`, and flake-ified. CI can then automate updates without requiring manual review.
Except for every language ecosystem I have heard about that applies various fix-ups manually on every large update
…Obviously this would need to wait until flakes are in Nix stable.
|
Can you elaborate? We're talking about package repositories; presumably the inputs to these flakes would be the same language implementations we have currently in nixpkgs. The main coordination point would be test suites to ensure updates don't occur when a representative sample of packages fail to build.
Can you provide examples? |
> Separated of corresponding language implementations, to maximise the fun of coordinating CI.
Can you elaborate? We're talking about package repositories; presumably the inputs to these flakes would be the same language implementations we have currently in nixpkgs. The main coordination point would be test suites to ensure updates don't occur when a representative sample of packages fail to build.
So now to update Python you need to make the PythonPackages flake unbuildable without locking, then separately fix it?
> Except for every language ecosystem I have heard about that applies various fix-ups manually on every large update
Can you provide examples?
Haskell ecosystem has a ton of overrides. Common Lisp ecosystem has a ton of overrides (relative to the number of included packages). Python ecosystem seems to be mainly carefully picking versions to be able to have a single preferred version in most cases (which is not always the latest at PyPI, obviously).
Then there are things (from many ecosystems) that are dependencies of Sage, and not used much for anything else. For these, the updates often wait for upstream Sage support.
|
That is more or less the status quo, is it not? Except we just skip the "run bespoke shell script in nixpkgs" step to generate Nix expressions. I wouldn't suggest committing changes which don't pass tests.
These are fair points. Again, I don't think this is any different from the status quo, except we are reacting to changes more quickly when we can automate updating and reporting failures. |
Worth remembering that everything in flakes is locked, so there is a very clear and expressive way to show which configurations are "tested and known to work", so e.g. Python updates wouldn't make it into a pythonpackages flake until updated. Hence with that and what @tadfisher pointed out, I don't think that's a real issue |
> So now to update Python you need to make the PythonPackages flake unbuildable without locking, then separately fix it?
That is more or less the status quo, is it not?
Except now you have the option to merge both steps as a single merge.
Except we just skip the "run bespoke shell script in nixpkgs" step to generate Nix expressions.
So you just want to move the scripts to updateScript properly? Because that part does not need flakes.
I wouldn't suggest committing changes which don't pass tests.
First, a lot of the time it's actually the correct decision.
Second, you are instead suggesting to move the most natural tests for the Python interpreter, Python packages, away.
> Haskell ecosystem has a ton of overrides. Common Lisp ecosystem has a ton of overrides (relative to the number of included packages). Python ecosystem seems to be mainly carefully picking versions to be able to have a single preferred version in most cases (which is not always the latest at PyPI, obviously).
These are fair points. Again, I don't think this is any different from the status quo, except we are reacting to changes more quickly when we can automate updating and reporting failures.
I believe by now nobody objects to having an auto-merge bot in Nixpkgs that would let people who are publicly in the committers team «merge once checks pass». It is also clearly an improvement, and a foundation for further merge improvements, such as giving maintainers merge access to their packages before full commit access. Many people are in favour of both (me included). So that would be an incremental improvement with wide support and definitely not breaking anything. It seems that merge bots are a pain to configure, because some people have apparently started trying to do something about that, but nobody has ever succeeded or reached a point where the problem would be to attract attention of people with access to apply the proposed configuration to Nixpkgs.
I do not believe that splitting into flakes would make configuring the merge bot so much easier that the issue will disappear. The immediate cost of the change and more complicated CI situation will be there. The better infrastructure will not appear, because many of the ecosystems are too large for a full test run just with GitHub Actions as CI.
|
There is no reason why the staging step cannot happen in the same repository, instead of flake. In fact it already does – peti updates stuff on |
Worth remembering that everything in flakes is locked, so there is a very clear and expressive way to show which configurations are "tested and known to work", so e.g. Python updates wouldn't make it into a pythonpackages flake until updated. Hence with that and what @tadfisher pointed out, I don't think that's a real issue
Locks do not make it any less annoying that you cannot easily use Nixpkgs input with the latest glibc bugfix without fixing unrelated compatibility issues first…
They just put on record what versions are supposed to be used together, sure.
(And if we want to bump the version locks all the time, we could just do what people have been asking for since forever and have more versions of stuff in Nixpkgs and bump the dependencies as we check the bump is safe…)
|
Also the language ecosystems are not completely independent – there are core packages (e.g. libinput) that depend on many python modules so the split would introduce a cycle.
I have no idea how to go about flattening. |
@jtojnar How do you propose flattening the |
I'd highly appreciate if we could get back on topic here; splitting Nixpkgs wouldn't change a thing about the inability to know whether an attr of Not really, these are the organisation of real, working packages in all-packages.nix and the Nixpkgs directory layout, not all-packages.nix is also in dire need of a cleanup but a fix for this issue would only touch it so far as in that non-package declarations may be bundled together and/or moved to other files.
I'm not sure what you mean by that. I'd like to remove the need to keep anything aligned by declaring them in one single point of truth. I.e. the only way to add a subset could be to add an attr in subsets.nix and Nixpkgs automatically combines that attrset with all-packages to make the
I'm not sure what kind of packages you're talking about here (an example would help). I haven't come across anything outrageous yet.
Again, not sure which packages you're talking about but if the information on which language version a package works exists, it could be used to mark those packages as broken or even exclude them from the list entirely.
That'd be ideal.
I don't care how many issues about (somewhat) related problems there have been and how much traction they have gotten in the past, this is an issue which still exists and has now shown first signs of needing to get fixed sooner rather than later.
The problem is: While you may be able to filter it with an external function at some point (I was after deleting a few attrs), it's still not possible to know where you should look for more packages and where not. This is mostly about discoverability To get a list of working packages in Nixpkgs, you currently need to:
This is what's required to get a list in just one dimension however and as we all know: Nixpkgs is a tree, not a list. Even if we could type-check for Obviating the need for:
The subset issue can only be solved with an authoritative list of subsets and where to find them in my eyes.
AFAICT there is no overlap other than that some categories are more specific versions of others. I had assumed that's what you meant by "overlap".
None whatsoever. If you can provide a better one; feel free. As I said, those were just some suggestions off the top of my head. I don't actually care all that much about how they're categorised or what they're called, what I really want is a clear distinction between packages and things that aren't packages so that we can sort |
Like Python packages which only work for PyPy or only for Python3 (note that PyPy for Python2 language is still supported upstream) or something like that. And having multiple subsets with Python packages for different implementations of Python built from the same individual-package expressions (also with an option to pass an expression the Python package set and quickly pick to what Python it corresponds) is a feature, not a bug.
Oh well, then you will a few months later when this still has a lively discussion and goes nowhere in practice.
An issue, singular? I think you combine multiple complaints, and I prefer the current state to the proposed alternative re: subsets.
So, what I see as definite cleanups that could be done in parallel and that I would support:
Then what you want re: package enumeration should become possible to provide as a library function. But then you will probably need to push ahead specific narrow kinds of changes so that you do not get stuck designing a huge treewide overhaul where even the plan will need a table of contents.
Hm, you are right, all overlap cases are inclusion / «more-specific-version».
Is there any evidence there is a classification that is actually useful?
Which is not really well-specified, not useful to most, and definitely not worth giving up on something actually valuable, like subsets. |
This seems to me as though it should have been the case from the start |
> * convert everything to use meta.broken instead of assertions, at least on everything reachable from Nixpkgs
This seems to me as though it should have been the case from the start
At the real start, support for meta.broken was considered too much computation to be done in Nix. And I think some people say there is missing platforms (the package is not _supposed_ to work at some platforms), broken (the package could probably be buildable here but we failed to achieve that), and assertions (it is a user error to request such combination of options — indeed, Nixpkgs should not request such combinations).
And I guess most people believe that this distinction is not worth the effort. It kind of isn't, indeed. Probably a reasonably sized team could just force the issue and clean things up so that everything reachable through attributes in Nixpkgs is marked unusable via meta and not assert/throw. Maybe the process would lead to a cleaner and also documented distinction between the cases (which I would assume that team would consider a plus, and presumably a majority of contributors would consider a thing which is in principle nice to have but not worth the effort but if there are people who volunteered to do the work that's nice)
Note that size of the team is needed not only to do the work, but also to demonstrate it is a niche but existing preference, and also that a significant number of people managed to get together and build a coherent vision of what you want.
|
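To illustrate the difference between the assertion style and the `meta.broken` style mentioned above, a minimal sketch with a hypothetical package function (the names `enableGui` and `gtk` are placeholders, not a real Nixpkgs expression):

```nix
let
  # Assertion style: an unsupported argument combination makes the whole value fail.
  withAssert = { enableGui, gtk ? null }:
    assert enableGui -> gtk != null;
    { name = "tool"; meta.broken = false; };

  # Metadata style: the value always evaluates; the defect is recorded in meta.
  withMeta = { enableGui, gtk ? null }:
    { name = "tool"; meta.broken = enableGui && gtk == null; };
in
{
  # false: touching the value trips the assertion
  assertEvaluates = (builtins.tryEval (withAssert { enableGui = true; })).success;
  # true: the value can be inspected and filtered like any other package
  metaBroken = (withMeta { enableGui = true; }).meta.broken;
}
```

With the metadata style, a consumer can inspect `name` and `meta` without guarding every access, which is the property the cleanup would buy.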
I'm closing this because the issue by itself is not actionable and the conversation has ended. |
The mess
There are a ton of packages which is fine of course (that's what it's for) but there are also a lot of things that aren't packages. Most notably:
Keeping track of package subsets has already caused a few headaches (see #102508) and this also just doesn't feel right. We have the power to be declarative and neat thanks to our functional idioms, we can do better.
Fixing it
I was thinking of pulling subsets out of all-packages, into a new attrset and then "overlaying" it back on top of pkgs to get the same pkgs set we have right now (plus maybe a "subsets" attribute).
Functions should probably get the same treatment or maybe even moved to `lib`.
Some things I'd like
I'd like to get rid of `pkgs.AAAAAASomeThingsFailToEvaluate`. Everything under `pkgs` should at least eval so far to be type-checkable. Completely broken or highly experimental stuff belongs elsewhere IMO.
cc @garbas
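A rough sketch of the "pull out, then overlay back" idea from the "Fixing it" section, assuming hypothetical clean sets with placeholder values instead of real derivations:

```nix
let
  # Hypothetical clean sets; the strings stand in for real derivations.
  packages = { hello = "hello-drv"; curl = "curl-drv"; };
  subsets = { pythonPackages = { requests = "requests-drv"; }; };
  functions = { mkExample = name: "${name}-drv"; };

  # Recombine for compatibility: the familiar flat set, plus a `subsets` attribute.
  pkgs = packages // subsets // functions // { inherit subsets; };
in
builtins.attrNames pkgs
# evaluates to [ "curl" "hello" "mkExample" "pythonPackages" "subsets" ]
```

Because `//` lets the right-hand side win on name collisions, the order of the operands decides which clean set takes precedence; the order shown here is only one possible choice.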