What to do about samples with negative values? #18

Open
kvelleby opened this issue Oct 12, 2023 · 0 comments

Comments

@kvelleby
Contributor

The competition is about predicting counts of fatalities. Counts are non-negative integers.

Yet many prediction models yield continuous floats, possibly including negative numbers.

None of the metrics intrinsically fails when given negative numbers, but the current binning scheme of the Ignorance Score does not expect them and raises an error.

Alternatives:

  • Truncate all predictions at 0. This will affect the distribution of predictions.
  • Remove predictions below 0 and resample based on the positive predictions. This means we have to infer a new kind of distribution (other than what the model actually output). Our current upsampling approach (scipy.signal.resample) can itself yield negative numbers (see the sketch after this list), so we would also need to find another upsampling approach.
  • Allow negative predictions and add a bin for negative numbers in the Ignorance Score.
  • (Current) Allow negative predictions in CRPS and MIS only, and truncate predictions at 0 for the Ignorance Score.
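To make the resampling caveat concrete, here is a minimal sketch (the sample size of 250 and the Poisson toy distribution are illustrative assumptions, not the competition setup) showing that scipy.signal.resample, being Fourier-based, can produce negative draws even from strictly non-negative input:

```python
import numpy as np
from scipy.signal import resample

rng = np.random.default_rng(0)

# Pretend a contestant submitted only 250 non-negative draws instead of 1000.
# (Poisson(2) is just a toy stand-in for a fatality-count distribution.)
submitted = rng.poisson(lam=2.0, size=250).astype(float)

# Fourier-based resampling does not preserve non-negativity: ringing around
# the many zeros typically pushes some of the upsampled draws below zero.
upsampled = resample(submitted, 1000)
print("min of submitted:", submitted.min())   # 0.0
print("min of upsampled:", upsampled.min())   # usually negative

# The current scheme keeps `upsampled` as-is for CRPS/MIS and clips it
# at 0 before computing the Ignorance Score:
for_ignorance = np.clip(upsampled, 0.0, None)
```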

Happy to get input here. I think the most important thing is to settle on one approach and be clear about it to everyone. Personally, I am sceptical of us doing upsampling with anything other than scipy.signal.resample. Finding a suitable positive-only distribution is, I think, the responsibility of the contestants; if they give us negative values (or fewer than 1000 samples), we have to use our pre-described, simple approach to adjust. I am also leaning towards allowing negative predictions and adding a bin for negative numbers in the Ignorance Score (sketched below), while still truncating at 0 when we have to upsample because we received fewer than 1000 samples.
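For what the negative-bin variant of the Ignorance Score could look like, here is a minimal sketch. The bin edges, epsilon floor, and log base are illustrative assumptions, not the competition's actual implementation:

```python
import numpy as np

def binned_ignorance(samples, observed, bin_edges, eps=1e-6):
    """-log2 of the empirical probability of the bin containing the observation.

    np.digitize maps anything below bin_edges[0] to index 0, so negative
    draws all land in one extra "negative" bin instead of raising an error.
    """
    samples = np.asarray(samples, dtype=float)
    idx = np.digitize(samples, bin_edges)                      # 0 .. len(bin_edges)
    probs = np.bincount(idx, minlength=len(bin_edges) + 1) / len(samples)
    obs_idx = np.digitize(observed, bin_edges)
    return -np.log2(max(probs[obs_idx], eps))                  # eps avoids log(0)

# Hypothetical count bins (edges are illustrative only):
edges = np.array([0, 1, 2, 5, 10, 25, 100, 1000])
draws = np.array([-0.4, -0.1, 0.3, 0.9, 1.2, 2.7, 3.1, 6.0, 8.5, 40.0])
print(binned_ignorance(draws, observed=3, bin_edges=edges))
```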
