Bounding issues/diagnostics #237
This is an instructive set of plots and a useful set of tests. Quick comments (in order of your points):
Would you envision something like that being run at various checkpoints during the sampling, or just at the end? I'm trying to think of exactly how the results could be communicated to the user in an informative way, especially since the feedback I've gotten is that the current set of outputs for the progress bar (and the saved quantities in the output results) is already pretty large. If the results could inform the bounding distributions too, that would be a dream end goal.

On a side note: there is a bootstrap procedure available to try and enlarge the ellipsoids, but it's disabled by default because I found it to be unstable in many cases. I could easily see…
Yes, exactly. That's at least how most standard implementations work (including the one here), anyways. I'm happy to add larger disclaimers throughout the docs to try and make this point more explicit. Would that help?
Ah yes, gotcha. Maybe this just means the defaults should be even more conservative than what they already are to try and further mitigate this problem.
I think I would do it at the end. Sure, it's nice to report things during sampling, but I don't think it's necessary. (TBH, I don't think I know what every number in dynesty's status bar is; the only numbers I check there are the time, dlogz, and logl.)
Yes, I saw those parameters, but never experimented with them.
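For reference, enabling the bootstrap enlargement would look roughly like this (a sketch only; `loglike`, `prior_transform`, and `ndim` stand in for the user's own problem, and the specific values are illustrative):

```python
import dynesty

# Sketch: turn on the bootstrap-based enlargement of the ellipsoids
# (disabled by default). loglike, prior_transform, and ndim are
# placeholders for the user's own problem.
sampler = dynesty.NestedSampler(loglike, prior_transform, ndim,
                                nlive=2500, bound='multi', sample='unif',
                                bootstrap=20)  # number of bootstrap realizations
sampler.run_nested(dlogz=0.01)
```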
Trying bootstrapping with all the changes from my tynifix branch, I see the following problems.
The results from 1 and 2 here are why bootstrapping ended up being disabled by default, since it requires a huge number of live points to avoid the bounds becoming overly enlarged in higher dimensions. As for 3, that's true in the sense that you're guaranteed to miss parameter space, but not necessarily true in terms of parameter/evidence estimation. As I outline in the Appendix of the …

Anyways, what I ultimately mean to say is that I think these tests are fantastic and really highlight some of the problems with this particular proposal strategy (with uniform sampling), but that it's not guaranteed to be a total loss. It's also one of the main struggles I had when setting the defaults for most users. Do you have additional recommendations for changes outside of the new set of tests you've submitted and the possible utility functions/plots to add in?
On the first point I don't think I quite agree. IMO I'd rather have defaults that are protective, i.e. they may be overkill for simple problems, but are the right thing for difficult ones. Especially since, as opposed to standard MCMC approaches, there is no set of standard convergence tests for nested sampling (à la Gelman-Rubin/Geweke, etc.) that can give you some confidence.

IMO the bootstrap/sampling can be improved. For example, currently the scaling of the region is done equally in all directions, but maybe we could make the scaling different along different directions to avoid scaling beyond the cube. I also noticed that the sampling within the ellipsoid could probably be parallelized as well, which would avoid it being a bottleneck in the case of parallel runs. Alternatively, I was thinking that if the ellipsoid is that much larger than the unit cube, maybe we could just sample in the part of the ellipsoid's bounding box that fits inside the unit cube? I haven't tested whether that would be beneficial. Anyway, I think having a test problem where the bootstrap gets somewhat stuck is good, to try to improve on.

In terms of immediate recommendations: for sure, validating function evaluations/boundaries after the run is important, as is warning about big discrepancies. In my opinion, having the boundary algorithm return the correct boundary with the default settings is important as well (i.e. the tests that I gave above). I'd put those in.
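One rough sketch of that last idea (sampling only within the part of the ellipsoid's axis-aligned bounding box that lies inside the unit cube, then rejecting points outside the ellipsoid; `ctr` and `cov` are assumed names for the ellipsoid's center and shape matrix, not dynesty's API):

```python
import numpy as np

# Sketch: draw a point uniformly from the intersection of an ellipsoid
# {x : (x-ctr)^T cov^{-1} (x-ctr) <= 1} with the unit cube, by rejection
# sampling inside the ellipsoid's bounding box clipped to the cube.
def sample_ellipsoid_in_cube(ctr, cov, rstate=np.random):
    half = np.sqrt(np.diag(cov))        # box half-widths along each axis
    lo = np.clip(ctr - half, 0.0, 1.0)  # clip the box to the unit cube
    hi = np.clip(ctr + half, 0.0, 1.0)
    prec = np.linalg.inv(cov)
    while True:
        x = lo + (hi - lo) * rstate.uniform(size=len(ctr))
        dx = x - ctr
        if dx @ prec @ dx <= 1.0:       # accept only points in the ellipsoid
            return x
```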
Hi @segasai, would it be possible for you to include the code you used to generate all these diagnostic plots? It might be useful for myself and other users with similar issues in the meantime. Thanks!
Hi @a-lhn
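Something like the following minimal sketch should work (assuming the run was made with `save_bounds=True` so the sampler keeps its list of bounding objects, and using the `contains()` method those objects expose):

```python
import numpy as np

# Sketch: membership of every accepted point in every saved bound.
res = sampler.results
u = res.samples_u  # positions of all accepted points in the unit cube
mask = np.array([[bnd.contains(x) for x in u] for bnd in sampler.bound])
# mask[i, j] is True if point j lies inside bound i
```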
That will give you the boolean mask for all boundaries and all the points in the run.
Just a note to myself that it might be cool to try and add in some of these utilities as part of the code for users to check as part of the new release (see #254).
I think a problem that I currently see is that the bound set is not properly tracked, i.e. we don't track which logl level each bound corresponds to: the code just does self.bound.append(bound), but ideally one would want some self-contained object with (logl, bound) pairs. That would somewhat help #232 as well. But otherwise we can already reproduce the plot from the top by something like this:
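(A sketch of that reproduction, under the same `save_bounds=True` assumption as above; the plotting details are illustrative:)

```python
import numpy as np
import matplotlib.pyplot as plt

# Sketch: plot the point-vs-bound membership matrix in the same style
# as the figure at the top of the issue.
res = sampler.results
u = res.samples_u
mask = np.array([[bnd.contains(x) for x in u] for bnd in sampler.bound])
plt.imshow(mask, aspect='auto', origin='lower', cmap='gray_r')
plt.xlabel('point index (in sampling order)')
plt.ylabel('bound index')
plt.show()
```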
Part of the reason I'm interested in quantifying boundaries is that I'm interested in approaches where a first sampling run can dictate some kind of transform that makes the subsequent sampling much easier, i.e. use the first dynesty run as exploration, and then sample with much higher sampling efficiency. (I know some developers of zeus-mcmc were thinking about/working on such approaches using normalizing flows.)
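A minimal sketch of that idea, using a simple affine reparameterization as a stand-in for the normalizing-flow approach (`res` is assumed to be the results object of the exploratory run; note that replacing the prior like this changes the evidence):

```python
import numpy as np
from scipy import stats
from dynesty import utils as dyfunc

# Sketch: estimate the posterior mean/covariance from an exploratory run
# and build a new prior_transform centered on it.
samples = res.samples
weights = np.exp(res.logwt - res.logz[-1])  # normalized posterior weights
mean, cov = dyfunc.mean_and_cov(samples, weights)
L = np.linalg.cholesky(cov)

def prior_transform_v2(u):
    # unit cube -> standard normal -> broadened Gaussian around the fit.
    # NB: this is a different prior, so logz is no longer comparable.
    z = stats.norm.ppf(u)
    return mean + 3.0 * (L @ z)  # 3x inflation for safety (arbitrary)
```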
I think explicitly tracking the (logl -> bound) mapping (separate from the actual samples) would also be potentially useful if one wants to sample the modes separately, or to decide on merging independent runs: if their Boundary(logl) regions don't overlap enough, the runs can't be merged.
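For concreteness, the self-contained object could be as simple as this (hypothetical; not part of dynesty's current API, where bounds are stored as a bare list):

```python
from dataclasses import dataclass
from typing import Any

# Sketch of the proposed (logl, bound) record.
@dataclass
class BoundRecord:
    logl: float  # likelihood threshold the bound was built at
    bound: Any   # the bounding object itself (e.g. a MultiEllipsoid)

# e.g. instead of self.bound.append(bound):
#     self.bound_records.append(BoundRecord(logl=loglstar, bound=bound))
```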
I will link the issue #232 here and close it to avoid having two similar issues. |
Hi,
I am looking at the results of sampling my complicated problem (which differ from what MultiNest gives me), and while diagnosing what's going on I ran a few tests that surprised me (below).
First, my problem: dimensionality 11, 2 periodic parameters, 2500 live points; I run with sample='unif' and dlogz_init=0.01.
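(For reference, that setup corresponds to roughly the following; `loglike` and `ptform` stand in for my actual model, and the indices of the two periodic parameters are placeholders:)

```python
import dynesty

# Sketch of the setup described above.
ndim = 11
dsampler = dynesty.DynamicNestedSampler(loglike, ptform, ndim,
                                        sample='unif',
                                        periodic=[9, 10])  # placeholder indices
dsampler.run_nested(nlive_init=2500, dlogz_init=0.01)
```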
It doesn't mean it's necessarily wrong, but it is certainly worrisome.
Here you can again see oscillations in the volume.
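A sketch of how such a volume curve can be extracted (assuming save_bounds=True, and that the saved multi-ellipsoid bounds expose a total-volume attribute, vol_tot):

```python
import numpy as np
import matplotlib.pyplot as plt

# Sketch: total bounding volume per saved bound.
vols = [getattr(bnd, 'vol_tot', np.nan) for bnd in sampler.bound]
plt.semilogy(vols)
plt.xlabel('bound index')
plt.ylabel('total bounding volume')
plt.show()
```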
Here you can clearly see the lack of monotonicity, and that even some of the latest live points are not inside earlier bounds.
Here again you see two problems: many of the latest points are not inside the first boundaries, and it looks like only those boundaries consisting of a single ellipsoid are good (the vertical stripes).
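A quick way to check that stripe pattern against the ellipsoid counts (a sketch; assumes multi-ellipsoid bounds exposing an ells attribute):

```python
import numpy as np

# Sketch: number of ellipsoids per saved bound.
nells = np.array([len(getattr(bnd, 'ells', [])) for bnd in sampler.bound])
print(np.flatnonzero(nells == 1))  # indices of single-ellipsoid bounds
```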
My thoughts on all this: