
Stat tests #5

Open · wants to merge 6 commits into master
Conversation

meiertgrootes (Member):

This branch provides the ability to perform statistical tests on the similarity of key distributions resulting from the PovertyTrap simulations.

stat_test.py provides a script as well as importable functions to perform these tests.

Currently, the output of the tests is ONLY printed to screen. This should be changed and versioned when incorporating into the full CI.

meiertgrootes and others added 4 commits June 20, 2023 14:06
Calculates number of time steps spent in poverty, max. consecutive time steps spent in poverty, and number of technological switches from agent data output.
@meiertgrootes meiertgrootes requested a review from debraj001 June 26, 2023 09:41


sim_total_steps_in_poverty = sim_data.groupby("AgentID").sum("InPoverty")[["InPoverty"]]
sim_max_consec_steps = sim_data.groupby("AgentID").apply(stu.max_consec)
Contributor:

Does the groupby command pass the income values for a single agent to max_consec and iteratively work through all the agents?

I ask because in max_consec, values is a 1D variable for "in_poverty" key.

Member Author:

The groupby command instantiates a series of views on the DataFrame object, one for each value of the parameter being grouped by, and passes these on individually. For the total steps, the code then sums only the "InPoverty" field for each grouped view (i.e. agent) and returns only the summed "InPoverty" column.

For max_consec, the input to the function is this series of grouped views, one by one. The code then sets Step, instead of AgentID, as the index value, enabling a loop over all the steps an agent has performed (the iloc command executes on the index). Because this is a single element (when selecting the "InPoverty" column), the boolean evaluation works.

Does this answer your question @cpranav93 ?
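A minimal, self-contained sketch of the pattern described above (the column names follow the snippet under review; `max_consec` here is a stand-in for `stu.max_consec`, and the data values are made up for illustration):

```python
import pandas as pd

# Toy agent data: two agents, five steps each (illustrative values only)
sim_data = pd.DataFrame({
    "AgentID":   [1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
    "Step":      [0, 1, 2, 3, 4, 0, 1, 2, 3, 4],
    "InPoverty": [1, 1, 0, 1, 1, 0, 1, 1, 1, 0],
})

def max_consec(group):
    """Longest run of consecutive in-poverty steps for one agent's grouped view."""
    values = group.set_index("Step")["InPoverty"]
    tally = maximum = 0
    for i in range(len(values)):
        if values.iloc[i]:          # single element, so boolean evaluation works
            tally += 1
        else:
            maximum = max(maximum, tally)
            tally = 0
    # In case an agent was in poverty until the final step
    return max(maximum, tally)

# groupby passes one grouped view per AgentID to sum()/apply() in turn
total = sim_data.groupby("AgentID")[["InPoverty"]].sum()
max_runs = sim_data.groupby("AgentID").apply(max_consec)
print(total["InPoverty"].tolist())  # [4, 3]
print(max_runs.tolist())            # [2, 3]
```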


parser.add_argument("--simulation", "-s", help="Path to simulation output to be evaluated for similarity", type=str, required=True)
parser.add_argument("--baseline", "-b", help="Optional. Path to simulation output to be used as baseline", type=str, default=defaultbaseline)
parser.add_argument("--povertythreshold", "-pt", help="Value for poverty threshold used", type=float, default=defaultpovertythreshold)
Contributor:

The help text is not very descriptive about what the poverty threshold is. Is this a well-known concept for the users/academic peers?

Member Author:

You're right. @vmgaribay shall we elaborate on this? I recall that the way you suggested is some percentage (~10%) of the mean income. The problem with setting it in that manner for this test is that a change in the implementation might shift the income distribution. To (also) be sensitive to this, the idea was to use a fixed value (TBD) defined on the baseline simulation in a separate step. To make it possible to change this without hacking the code, I added this CL argument, but I agree it is not very descriptive. Ideas, suggestions?

Contributor:

Yes, the threshold of 1 was just serving as a static placeholder; originally, it was to be the bottom 10th percentile, but we were trying to reduce variations. I have also seen a percentage of the median population income used.
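For illustration, the candidate definitions mentioned in this thread could be computed along these lines (the income values are made up, and deriving them from the actual output columns is left aside):

```python
import numpy as np

incomes = np.array([0.5, 0.8, 1.2, 1.9, 2.4, 3.1, 4.0, 5.5])  # toy income sample

# Option 1: bottom 10th percentile of the income distribution
threshold_percentile = np.percentile(incomes, 10)

# Option 2: a fraction (~10%) of the mean income
threshold_mean = 0.1 * incomes.mean()

# Option 3: a fraction of the median population income
threshold_median = 0.1 * np.median(incomes)

print(threshold_percentile, threshold_mean, threshold_median)
```

A fixed value defined once on the baseline simulation, as suggested above, would simply freeze one of these numbers rather than recompute it per run.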

Contributor:

Would it not be a good idea to set the poverty threshold either very high or very low, so that everyone is above or below it, for testing purposes? This should ensure that any randomness introduced in the simulation doesn't throw the tests out of whack as well.

Contributor:

@cpranav93 I don't think I follow

Contributor:

@vmgaribay, let's say we set poverty_threshold to 0; then every agent would always be above the poverty line, or vice versa if we set it to a very high value. This way we can deterministically ensure that the poverty line calculation is tested and working...

but as I say this, I realise that this may not be the point of these tests (since they are not unit tests meant to check each and every functionality!). So feel free to follow along on my crazy journey, or just deboard, tuck and roll off!

Contributor:

Ah, I see where you were going with it. Yes, this is important for testing the test but not the test itself.

    print('mismatch in properties and p values')
else:
    for pv, prop in zip(pvalues, properties):
        if pv > 0.05:
Contributor:

Should this p value be a non-required parameter of the program?

Contributor:

I assume this value is taken from the scipy library example? Or is there a relation to our specific problem?

Member Author:

We could add a p-value argument. p = 0.05 is the standard for 2-sigma significance on a test. No specific example, just probability theory; it's vanilla hypothesis testing.
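A sketch of such a test with a configurable significance level (the two-sample Kolmogorov–Smirnov test from scipy is assumed here for concreteness; the actual script may use a different test, and the samples below are synthetic stand-ins for the agent-level properties):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
baseline = rng.normal(loc=0.0, scale=1.0, size=500)    # stand-in baseline property
simulation = rng.normal(loc=0.0, scale=1.0, size=500)  # stand-in simulation property

alpha = 0.05  # could be exposed as an optional CLI argument instead of hard-coding
statistic, pvalue = stats.ks_2samp(baseline, simulation)
if pvalue > alpha:
    print(f"Null hypothesis of same parent distribution accepted at p = {pvalue:.3f}")
else:
    print(f"Null hypothesis of same parent distribution rejected at p = {pvalue:.3f}")
```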

if pv > 0.05:
    print(f"Null hypothesis of same parent distribution accepted for {prop} at p = {pv} \n")
else:
    print(f"Null hypothesis of same parent distribution rejected for {prop} at p = {pv} \n")
Contributor:

Should there be a global acceptance output as well? As in, the complete test is only accepted when all sub-tests are accepted.

Member Author:

I guess so. One point to consider here (also pinging @vmgaribay) is the possibility of false negatives, and/or of different tests being weighted towards different parts of the distribution (center vs. tail), which may impact sensitivity.

Also: this test is solely for the purpose of testing outcomes while refactoring the model with no other changes. As soon as one changes inputs and/or the algorithm, differences may occur in principle.

Easy to do an aggregation step now, so I'll implement it.
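A possible shape for that aggregation step (the property names and p-values below are illustrative, not taken from the script):

```python
alpha = 0.05
properties = ["total_steps_in_poverty", "max_consec_steps", "n_tech_switches"]
pvalues = [0.31, 0.08, 0.62]  # illustrative values

# A sub-test passes when its null hypothesis is not rejected
per_test = {prop: pv > alpha for prop, pv in zip(properties, pvalues)}
all_passed = all(per_test.values())
print("Overall similarity test", "PASSED" if all_passed else "FAILED")
```

Per the caveat above, with several sub-tests the chance of at least one false negative grows, so a stricter global criterion or a corrected per-test alpha may be warranted.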

Yes, I see your point.

Co-authored-by: Pranav Chandramouli <[email protected]>
Comment on lines 22 to 23
# In case someone was in poverty till the end, do a check at the end for maximum days (niche case)
if tally > maximum:
cpranav93 (Contributor) · Jun 27, 2023:

Needs to be indented in line with the rest of the code.

Sorry, forgot about it in my suggestion.

Fixed indentation.
cpranav93 pushed a commit that referenced this pull request Sep 5, 2023
4 participants