Stat tests #5
base: master
Conversation
Calculates number of time steps spent in poverty, max. consecutive time steps spent in poverty, and number of technological switches from agent data output.
sim_total_steps_in_poverty = sim_data.groupby("AgentID").sum("InPoverty")[["InPoverty"]]
sim_max_consec_steps = sim_data.groupby("AgentID").apply(stu.max_consec)
Does the groupby command pass the income values for a single agent to max_consec and iteratively work through all the agents?
I ask because in max_consec, values is a 1D variable for "in_poverty" key.
The groupby command instantiates a series of views on the DataFrame object, one for each value of the parameter being grouped by, and passes these on individually. For the total steps, the code then sums only the "InPoverty" field for each grouped view (i.e. agent) and returns only the summed "InPoverty" column.
For max_consec, the input to the function is this series of grouped views, one by one. The code then sets Step, instead of AgentID, as the index value, enabling a loop over all the steps an agent has performed (the iloc command executes on the index). Because this is a single element (when selecting the "InPoverty" column), the boolean evaluation works.
Does this answer your question @cpranav93 ?
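For reference, here is a minimal self-contained sketch of how this pipeline could look, following the explanation above; the toy data and the function body are assumptions, not the exact code in this PR:

```python
import pandas as pd

# stand-in agent data with the columns assumed above
sim_data = pd.DataFrame({
    "AgentID":   [1, 1, 1, 2, 2, 2],
    "Step":      [0, 1, 2, 0, 1, 2],
    "InPoverty": [True, True, False, False, True, True],
})

def max_consec(agent_view: pd.DataFrame) -> int:
    """Longest run of consecutive steps this agent spent in poverty."""
    steps = agent_view.set_index("Step").sort_index()  # index by Step, not AgentID
    tally = maximum = 0
    for i in range(len(steps)):
        # iloc selects a single row; its "InPoverty" entry is one boolean
        if steps.iloc[i]["InPoverty"]:
            tally += 1
        else:
            maximum = max(maximum, tally)
            tally = 0
    # in case an agent was in poverty until the final step (niche case)
    if tally > maximum:
        maximum = tally
    return maximum

# groupby yields one view per AgentID and applies the function to each in turn
sim_max_consec = sim_data.groupby("AgentID").apply(max_consec)
print(sim_max_consec)  # agent 1 -> 2, agent 2 -> 2
```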
parser.add_argument("--simulation", "-s", help="path to simulation output to be evaluated for similarity", type=str, required=True)
parser.add_argument("--baseline", "-b", help="Optional. Path to simulation output to be used as baseline", type=str, default=defaultbaseline)
parser.add_argument("--povertythreshold", "-pt", help="Value for poverty threshold used", type=float, default=defaultpovertythreshold)
The help text is not very descriptive about what the poverty threshold is. Is this a well-known concept for the users/academic peers?
You're right. @vmgaribay, shall we elaborate on this? I recall that the way you suggested was some percentage (~10%) of the mean income. The problem with setting it that way for this test is that a change in the implementation might shift the income distribution. To (also) be sensitive to this, the idea was to use a fixed value (TBD) defined on the baseline simulation in a separate step. To make it possible to change this without hacking the code, I added this CL argument, but I agree it is not very descriptive. Ideas, suggestions?
Yes, the threshold of 1 was just serving as a static placeholder; originally, it was to be the bottom 10th percentile, but we were trying to reduce variations. I have also seen a percentage of the median population income used.
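For illustration, a small sketch of the threshold variants mentioned in this thread; the income array is a stand-in and every variable name here is an assumption:

```python
import numpy as np

rng = np.random.default_rng(0)
incomes = rng.lognormal(mean=0.0, sigma=1.0, size=1000)  # stand-in agent incomes

threshold_placeholder = 1.0                    # current static placeholder
threshold_decile = np.percentile(incomes, 10)  # bottom 10th percentile cutoff
threshold_mean = 0.1 * np.mean(incomes)        # ~10% of mean income
threshold_median = 0.1 * np.median(incomes)    # percentage of median income
```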
Would it not be a good idea to set the poverty threshold either very high or very low so that everyone is above or under it, for testing purposes? This should ensure that any randomness introduced in the simulation doesn't throw the tests out of whack as well.
@cpranav93 I don't think I follow
@vmgaribay, let's say that we set poverty_threshold to 0; then every agent would always be above the poverty line, or vice versa if we set it to a very high value. This way we can deterministically ensure that the poverty line calculation is tested and working...
but as I say this, I realise that this may not be the point of these tests (since they are not unit tests meant to check each and every functionality!). So feel free to follow along on my crazy journey, or just deboard, tuck, and roll off!
Ah, I see where you were going with it. Yes, this is important for testing the test but not the test itself.
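Something like this hypothetical sanity check, assuming non-negative incomes and that the InPoverty flag is simply income below the threshold (column name and data are stand-ins):

```python
import pandas as pd

sim_data = pd.DataFrame({"Income": [0.5, 2.0, 1.3]})  # stand-in agent incomes

# threshold 0: no agent can be below it; huge threshold: every agent is
assert not (sim_data["Income"] < 0.0).any()
assert (sim_data["Income"] < 1e12).all()
```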
    print('mismatch in properties and p values')
else:
    for pv, prop in zip(pvalues, properties):
        if pv > 0.05:
Should this p value be a non-required parameter of the program?
I assume this value is taken from the scipy library example? Or is there a relation to our specific problem?
We could add a p-value argument. p = 0.05 is the standard for 2-sigma significance on a test. No specific example, just probability theory; it's vanilla hypothesis testing.
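A sketch of what that optional argument could look like; the flag name and default here are assumptions, not part of this PR:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--pvalue", "-p",
    help="Significance threshold for the hypothesis tests (default: 0.05)",
    type=float,
    default=0.05,
)
args = parser.parse_args()
```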
if pv > 0.05:
    print(f"Null hypothesis of same parent distribution accepted for {prop} at p = {pv} \n")
else:
    print(f"Null hypothesis of same parent distribution rejected for {prop} at p = {pv} \n")
Should there be a global acceptance output as well? As in, the complete test is only accepted when all sub-tests are accepted.
I guess so. One point to consider here (also pinging @vmgaribay) is the possibility of false negatives and/or different tests weighted towards different parts of the distribution (center vs. tail), which may impact sensitivity.
Also: this test is solely for the purpose of testing outcomes while refactoring a model with no other changes. As soon as one changes inputs and/or algorithmic principles, differences may occur.
It's easy to do an aggregation step now, so I'll implement it.
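A rough sketch of that aggregation step; the pvalues/properties lists are stand-ins, and the 0.05 cutoff follows the snippet above:

```python
pvalues = [0.23, 0.61, 0.02]                                  # stand-in results
properties = ["income", "steps_in_poverty", "tech_switches"]  # stand-in names

accepted = [pv > 0.05 for pv in pvalues]
for ok, pv, prop in zip(accepted, pvalues, properties):
    verdict = "accepted" if ok else "rejected"
    print(f"Null hypothesis of same parent distribution {verdict} for {prop} at p = {pv}\n")

print("Overall:", "PASSED" if all(accepted) else "FAILED (at least one sub-test rejected)")
```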
Yes, I see your point.

Co-authored-by: Pranav Chandramouli <[email protected]>
stat_tests/stat_test_utils.py
Outdated
# In case someone was in poverty till the end, do a check at the end for maximum days (niche case)
if tally > maximum:
Needs to be indented in line with the rest of the code.
Sorry, forgot about it in my suggestion.
Fixed indentation.
This branch provides the ability to perform statistical tests on the similarity of key distributions resulting from the PovertyTrap simulations.
stat_test.py provides a script as well as an importable function to perform these tests.
Currently, the output of the tests is ONLY printed to screen. This should be changed and versioned when incorporating into full CI.
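As a hypothetical sketch of where that CI integration could go, assuming the tests eventually return their p-values instead of only printing them (the module path, function name, and signature below are assumptions, not the PR's actual API):

```python
import sys

from stat_tests.stat_test import run_stat_tests  # assumed entry point

pvalues = run_stat_tests(
    simulation="output/sim_run.csv",   # --simulation
    baseline="output/baseline.csv",    # --baseline
    povertythreshold=1.0,              # --povertythreshold
)
sys.exit(0 if all(pv > 0.05 for pv in pvalues) else 1)  # non-zero exit fails CI
```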