
More benchmarks #25

Open
kvelleby opened this issue Oct 13, 2023 · 3 comments

@kvelleby
Contributor

  1. We need to ensure that the benchmarks we have produced are correct and shared (there is a Dropbox shared_competition_data folder). I think simple benchmarks such as all zeroes and last historical value should have the team name "benchmark" to separate them from the ViEWS models. The code for simple benchmarks (i.e., deterministic models that are either pure functions or functions of the dependent variable) should live in this repository as command line tools that can be run on "shared_competition_data/Features/*.parquet" and "shared_competition_data/Actuals" (e.g., you need the structure of the Actuals to produce predictions with all zeroes, and the features to know the last historical value). See the sketch after this list.
  2. Then we need to think about whether there should be more benchmarks. Here are some ideas:
    a) For all units, collect all historical values (or the last n months) as the forecast distribution.
    b) A model that assigns equal probability to each bin (as a rule, I would use the number closest to zero in each bin). With e.g. 11 bins, that is ca. 9% 0s, 9% 1s, 9% 3s, etc. (Please see the discussion on bins first.)
    c) As a), but swap every forecast value with the closest-to-zero possible value given the bin it would fall into under the binning scheme. Motivation: compared to a), does it help to "game" the metrics?
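
A minimal sketch of what such a command line tool could look like, assuming pandas-readable parquet files with illustrative column names (`unit_id`, `month_id`, `ged_sb`); the actual schema of the shared folder may differ:

```python
#!/usr/bin/env python
"""Sketch of the two simplest deterministic benchmarks as a CLI tool.

Assumes the Actuals and Features parquet files carry columns unit_id,
month_id, and ged_sb; these names are illustrative, not the actual
shared_competition_data layout.
"""
import argparse

import pandas as pd


def all_zeroes(actuals: pd.DataFrame) -> pd.DataFrame:
    """Predict zero fatalities for every unit-month in the Actuals structure."""
    preds = actuals[["unit_id", "month_id"]].copy()
    preds["prediction"] = 0
    return preds


def last_historical(actuals: pd.DataFrame, features: pd.DataFrame) -> pd.DataFrame:
    """Carry each unit's last observed value forward over the forecast window."""
    last = features.sort_values("month_id").groupby("unit_id")["ged_sb"].last()
    preds = actuals[["unit_id", "month_id"]].copy()
    preds["prediction"] = preds["unit_id"].map(last).fillna(0)
    return preds


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Simple deterministic benchmarks")
    parser.add_argument("--actuals", required=True, help="path to an Actuals parquet file")
    parser.add_argument("--features", help="path to a Features parquet file")
    parser.add_argument("--model", choices=["all_zeroes", "last_historical"], default="all_zeroes")
    parser.add_argument("--out", required=True, help="where to write the predictions parquet")
    args = parser.parse_args()

    actuals = pd.read_parquet(args.actuals)
    if args.model == "all_zeroes":
        preds = all_zeroes(actuals)
    else:
        preds = last_historical(actuals, pd.read_parquet(args.features))
    preds.to_parquet(args.out)
```
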
@hhegre
Collaborator

hhegre commented Oct 13, 2023

I agree to 2(a) and 2(c). The last 120 months seems like an OK window?
2(b) is also great. But would it be better to use the log mean of each bin rather than the value closest to zero? Or a draw from a Poisson given the log mean of the bin? For the maximum bin, we would have to define something. I realize what I suggest looks more complex than the lowest-value rule Jonas suggests, and simpler is better if we don't have good reasons for complexity.
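
One way to read the log-mean suggestion, sketched under assumed bin edges (the real binning scheme is still under discussion): take the geometric mean of each bin's endpoints as its representative value and, for the stochastic variant, draw from a Poisson with that mean. The open-ended maximum bin still needs a convention; doubling its lower edge is only a placeholder here. Note that a Poisson draw can land outside its bin, which is part of why the closest-to-zero rule is simpler.

```python
import numpy as np

# Hypothetical lower bin edges; the competition's actual binning is undecided.
EDGES = np.array([0, 1, 3, 6, 11, 26, 51, 101, 251, 501, 1001])


def bin_log_mean(b: int, edges: np.ndarray = EDGES) -> float:
    """Geometric mean of bin b's endpoints, one reading of 'log mean'."""
    low = max(edges[b], 1)  # keep log() defined; the zero bin is handled below
    # The top bin is open-ended, so some cap must be defined; doubling the
    # lower edge is an arbitrary placeholder for that convention.
    high = edges[b + 1] - 1 if b + 1 < len(edges) else 2 * edges[b]
    return float(np.exp((np.log(low) + np.log(high)) / 2))


def uniform_bin_forecast(n_draws: int, seed: int = 0) -> np.ndarray:
    """Equal probability per bin; each draw is Poisson around the bin's log mean."""
    rng = np.random.default_rng(seed)
    bins = rng.integers(0, len(EDGES), size=n_draws)  # uniform over bins
    return np.array([0 if b == 0 else rng.poisson(bin_log_mean(b)) for b in bins])


print(uniform_bin_forecast(10))
```
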

@kvelleby
Contributor Author

The closest-to-zero idea was motivated by the fact that finding a suitable value for the edge bins would otherwise be difficult.

@kvelleby
Contributor Author

We should also look into other ways to bin. Bayesian blocks is one quite interesting approach: https://docs.astropy.org/en/stable/api/astropy.stats.bayesian_blocks.html
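
For reference, `astropy.stats.bayesian_blocks` returns adaptive bin edges directly from the data; a quick sketch on synthetic counts (the zero-inflated Poisson mixture below is only a stand-in for real fatality data, and whether this yields sensible bins for such data is exactly what would need testing):

```python
import numpy as np
from astropy.stats import bayesian_blocks

# Synthetic zero-inflated counts standing in for fatality data.
rng = np.random.default_rng(42)
counts = np.where(rng.random(2000) < 0.8, 0, rng.poisson(20, size=2000))

# The 'events' fitness treats the values as point data (ties are aggregated
# internally) and returns change-point edges, i.e. a data-driven binning.
edges = bayesian_blocks(counts, fitness="events")
print(edges)
```
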
