We need to ensure that the benchmarks we have produced are correct and shared (there is a Dropbox shared_competition_data folder). I think simple benchmarks such as all zeroes and the last historical value should have the team name "benchmark" to separate them from the ViEWS models. The code for the simple benchmarks (i.e., deterministic models that are either pure functions or functions of the dependent variable) should live in this repository as command-line tools that can be run on "shared_competition_data/Features/*.parquet" and "shared_competition_data/Actuals" (e.g., you need the structure of the Actuals to produce all-zero predictions, and you need the features to know the last historical value).
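A minimal sketch of what such a command-line tool could look like for the all-zeroes benchmark. The paths, the index structure of the Actuals files, and the "prediction" column name are assumptions about the shared_competition_data layout, not confirmed:

```python
# All-zeroes benchmark sketch (assumed file layout and column names).
import argparse
import pandas as pd

def main():
    parser = argparse.ArgumentParser(description="All-zeroes benchmark")
    parser.add_argument("--actuals", required=True,
                        help="Path to a shared_competition_data/Actuals parquet file")
    parser.add_argument("--out", required=True, help="Output parquet path")
    args = parser.parse_args()

    # Read the actuals only to copy their unit/month index structure.
    actuals = pd.read_parquet(args.actuals)
    predictions = pd.DataFrame(index=actuals.index)
    predictions["prediction"] = 0  # every forecast is zero
    predictions.to_parquet(args.out)

if __name__ == "__main__":
    main()
```

The last-historical-value benchmark would follow the same pattern, but read the Features files and carry each unit's most recent observed value forward instead of writing zeroes.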
Then we need to think whether there should be more benchmarks. Here are some ideas:
a) For all units, just collect all historical values (or the last n months) as the forecast distribution (a rough sketch follows this list).
b) A model whose forecasts assign equal probability to each bin (I think as a rule use the number closest to zero in each bin). With e.g. 11 bins, that is ca. 9% 0s, 9% 1s, 9% 3s, etc. (Please see the discussion on bins first.)
c) As a), but replace every forecast value with the closest-to-zero value possible within the bin it would fall into under the binning scheme. Motivation: compared to a), does it help to "game" the metrics?
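For idea (a), a rough sketch of turning the last n months of the dependent variable into a forecast distribution per unit. The column names ("month_id", "country_id", "ged_sb") are placeholders for whatever the Features parquet files actually contain:

```python
# Benchmark (a) sketch: each recent historical value becomes one draw of the
# forecast distribution for that unit (column names are assumptions).
import pandas as pd

def historical_distribution(features: pd.DataFrame, n_months: int = 120,
                            unit_col: str = "country_id",
                            time_col: str = "month_id",
                            target_col: str = "ged_sb") -> pd.DataFrame:
    """Return one row per (unit, draw), holding a historical value as a sample."""
    last_month = features[time_col].max()
    window = features[features[time_col] > last_month - n_months]
    window = window.sort_values([unit_col, time_col])
    draws = window[[unit_col, target_col]].rename(columns={target_col: "outcome"})
    # Number the samples within each unit so downstream code can treat them
    # as draws from the forecast distribution, reused for every horizon.
    draws["draw"] = draws.groupby(unit_col).cumcount()
    return draws.reset_index(drop=True)
```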
I agree with 2(a) and 2(c). The last 120 months seems like an OK window?
2(b) is also great. But would it be better to use the log mean in each bin rather than the value closest to zero? Or a draw from a Poisson given the log mean of the bin? For the maximum bin, we would have to define something. I see that what I suggest is more complex than the lowest-value rule Jonas suggests, and simpler is better if we don't have good reasons for complexity.
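To make the comparison concrete, a sketch of the two bin-representative choices being discussed. The bin edges below are placeholders, not the actual binning scheme, and the handling of the open-ended top bin is an arbitrary assumption:

```python
# Sketch of possible bin-representative rules (hypothetical bin edges).
import numpy as np

BIN_EDGES = [0, 1, 3, 10, 30, 100, 300, 1000]  # placeholder lower edges

def closest_to_zero(bin_index: int) -> int:
    # Jonas's rule: represent each bin by its smallest (closest-to-zero) value.
    return BIN_EDGES[bin_index]

def log_mean(bin_index: int) -> float:
    # Alternative: geometric ("log") mean of the bin's lower and upper edge.
    lo = max(BIN_EDGES[bin_index], 1)  # avoid log(0) for the zero bin
    # The top bin has no upper edge; 10 * lo is an arbitrary placeholder.
    hi = BIN_EDGES[bin_index + 1] if bin_index + 1 < len(BIN_EDGES) else 10 * lo
    return float(np.exp((np.log(lo) + np.log(hi)) / 2))

def poisson_draw(bin_index: int, rng=None) -> int:
    # Alternative: a Poisson draw around the bin's log mean.
    rng = rng if rng is not None else np.random.default_rng()
    return int(rng.poisson(log_mean(bin_index)))
```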