DPAD for regions #52187

Merged
merged 1 commit into from
May 5, 2021

Conversation

Maoni0
Member

@Maoni0 Maoni0 commented May 3, 2021

Introduce the DPAD (Dynamic Promotion And Demotion) feature for regions

This feature consists of the following -

  1. Record promoted bytes per region so we can decide which regions to promote

I'm reusing g_mark_list_piece to record the survived and old card survived per region since it's not used during the
mark phase till sort_mark_list is called. So this is now defined for both wks and svr GC.

old card survived is only used if the plan_gen_num for that GC isn't max_generation already, otherwise we'd be
promoting into max_generation anyway.

Note that the way I calculate the survived bytes for UOH regions isn't correct (I'd need to merge them from each basic region), but I also don't need the survived bytes for UOH.

For dbg builds I also record the survived bytes with the g_promoted to validate that survived_per_region is correct.

REGIONS TODO: we should treat g_mark_list_piece as part of the GC bookkeeping data structures and allocate it as such.

REGIONS TODO: currently we only make use of SOH's promoted bytes to make decisions whether we want to compact or sweep
a region. We should also enable this for LOH compaction.
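
To make the bookkeeping in this section concrete, here is a minimal sketch of per-region survival counters. Everything in it is hypothetical: the struct name `region_survival_table`, its fields, and the way the old-card portion is attributed are illustrative assumptions, not the runtime's actual layout or the way g_mark_list_piece is reused.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the per-region counters described above.
// Names and layout are illustrative, not the runtime's actual ones.
struct region_survival_table
{
    uintptr_t base;        // lowest address covered by regions
    size_t region_shift;   // log2 of the basic region size
    std::vector<size_t> survived;           // bytes marked live, per region
    std::vector<size_t> old_card_survived;  // portion reached via cards from older gens

    region_survival_table(uintptr_t base_, size_t shift, size_t region_count)
        : base(base_), region_shift(shift),
          survived(region_count, 0), old_card_survived(region_count, 0)
    {
    }

    // Maps an object address to the index of the region that contains it.
    size_t region_index(uintptr_t addr) const
    {
        assert(addr >= base);
        size_t index = (addr - base) >> region_shift;
        assert(index < survived.size());
        return index;
    }

    // Assumed hook: called once per object when it is marked.
    // from_old_card is true when the mark came from an older generation's card.
    void record_marked(uintptr_t obj, size_t size, bool from_old_card)
    {
        size_t index = region_index(obj);
        survived[index] += size;
        if (from_old_card)
            old_card_survived[index] += size;
    }
};
```

With counters like these, the survival rate of a region is available as soon as marking finishes, before the plan phase starts.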

  2. SIP (Swept In Plan) regions

This feature introduces SIP regions. Previously, the compact or sweep decision was only made after the plan phase was complete.
But we know how much of each region survives before the plan phase, and if a region's survival rate is very high, we know
it's not productive to compact it. Therefore we do not need to plan it; we can just sweep it when we encounter it during plan.
And if most of that survival is due to card marking from gen2, we also know that it would very likely be productive to promote it
into gen2 right away, instead of going through gen1 then gen2. So we do that too.
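
The decision described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the function `decide_sip`, the struct `sip_decision`, both threshold values, and the choice to measure the survival rate against the region size are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative thresholds only; the values the GC actually uses are not stated here.
constexpr size_t kSipSurvRatioThreshold = 90;      // percent, assumed
constexpr size_t kOldCardSurvRatioThreshold = 90;  // percent, assumed
constexpr int kMaxGeneration = 2;

struct sip_decision
{
    bool swept_in_plan;  // sweep the region during plan instead of planning it
    int plan_gen_num;    // generation the region is planned into
};

// Decides whether a region should be SIP and which generation it goes to,
// using the survived bytes recorded before the plan phase.
sip_decision decide_sip(size_t region_size, size_t survived,
                        size_t old_card_survived, int plan_gen_num)
{
    assert(region_size > 0);
    assert(survived <= region_size);
    assert(old_card_survived <= survived);

    sip_decision decision{false, plan_gen_num};

    size_t surv_ratio = survived * 100 / region_size;
    if (surv_ratio < kSipSurvRatioThreshold)
        return decision;  // enough garbage that planning/compacting may pay off

    decision.swept_in_plan = true;

    // Old card survival only matters if we are not already going to max_generation.
    if ((plan_gen_num < kMaxGeneration) && (survived > 0))
    {
        size_t old_card_ratio = old_card_survived * 100 / survived;
        if (old_card_ratio >= kOldCardSurvRatioThreshold)
            decision.plan_gen_num = kMaxGeneration;
    }
    return decision;
}
```

For example, under these assumed thresholds a region that is 95% live, with nearly all of that kept alive through gen2 cards, would be swept in plan and planned straight into gen2.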

This means SIP regions cannot be used as generation_allocation_segment for allocate_in_condemned_generations, and the later phases need to skip them since we are already done with them during plan (which means their plan_gen_num is preserved since we already decided it in plan). In process_last_np_surv_region/process_remaining_regions, when we are going through regions, we also need to make sure we do not change the plan_gen_num for SIP regions.

SIP regions' bricks are not built the same way we build bricks in plan as there's no need - we will not need to go through them
via the tree. So we build the bricks accurately with objects since we know exactly which objects will be in this region. This
does mean that in relocate_address we must not look at the brick table; if we are asked to relocate an address in an SIP region, we return the same address. For relocate_survivors we of course do need to relocate refs in these regions, but in a very
different fashion - we do this in relocate_advance_to_non_sip, which will only return non SIP regions to relocate_survivors.
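
A simplified sketch of the two behaviors just described. The `region_info` struct and the single per-region `reloc_distance` are hypothetical stand-ins for the brick table and plug tree lookup, and the skipping helper omits the reference fix-up that the real relocate_advance_to_non_sip performs on the SIP regions it passes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical region descriptor; reloc_distance stands in for the
// brick table / plug tree lookup used for regions that were planned.
struct region_info
{
    uintptr_t start;
    uintptr_t end;             // exclusive
    bool swept_in_plan;
    ptrdiff_t reloc_distance;  // only meaningful for non SIP regions
};

// Objects in SIP regions never move, so their addresses come back unchanged
// without consulting any relocation data.
uintptr_t relocate_address(const std::vector<region_info>& regions, uintptr_t old_address)
{
    for (const region_info& r : regions)
    {
        if ((old_address >= r.start) && (old_address < r.end))
        {
            if (r.swept_in_plan)
                return old_address;
            return old_address + static_cast<uintptr_t>(r.reloc_distance);
        }
    }
    return old_address;  // not in a condemned region
}

// Returns the index of the first non SIP region at or after 'index'.
// (The real helper would also relocate refs inside each SIP region it skips.)
size_t advance_to_non_sip(const std::vector<region_info>& regions, size_t index)
{
    assert(index <= regions.size());
    while ((index < regions.size()) && regions[index].swept_in_plan)
        ++index;
    return index;
}
```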

Complications these SIP regions introduce -

  • When we decide if we should make a region SIP, if the plan_gen_num of a generation is not maxgen, and if we want to make every region in that generation maxgen, we need to make sure we can get a new region for this generation so we can guarantee each generation has at least one region. If we can't get a new region, we need to make sure we leave at least one region in that gen to guarantee our invariant.

This new region we get needs to be temporarily recorded instead of being on the free_regions list because we can't use it for other purposes.

  • In sweep_region_in_plan, we temporarily thread the large enough free objects onto the region's own FL which will then be threaded onto the appropriate generation's FL. We don't thread this right away because we might need to restore gen2's FL if we decide to sweep.

  • In allocate_in_condemned_generations we need to be careful - if we are skipping regions when we go to the next region, we need to adjust the alloc ptr accordingly otherwise it'd be out of sync with alloc limit which will cause problems for detecting pinned plugs.

  • In the relocate phase, when we check for demotion, we have a new scenario with SIP -

A points to B
A starts out as a gen0 obj
B starts out as a gen1 obj
no cards are needed in this case
but A is now a gen2 obj, so a card is needed.

In the current check_for_demotion_helper we only need to check if the child object is in a region that's demoted, but for this new scenario we need to check the child obj's plan_gen_num against the parent's plan_gen_num, which is handled by the new check_demotion_helper_sip.

  • For sweep, we do need to promote all regions as this is currently a contract with the handle table. So I promote them as sweep normally does. Changing this requires more work, and I will not include it in this PR.

REGIONS TODO: make SIP regions keep their plan_gen_num during sweep.
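
The A/B demotion scenario in the list above can be illustrated with a small sketch. The struct `region_plan` and both helper functions are hypothetical simplifications of check_for_demotion_helper and check_demotion_helper_sip, and the sketch assumes B ends up planned into gen1 while A's SIP region is planned into gen2.

```cpp
#include <cassert>

// Hypothetical per-region planning state.
struct region_plan
{
    int gen_num;       // generation before this GC
    int plan_gen_num;  // generation planned for after this GC
    bool demoted;      // region was demoted during plan
};

// Pre-SIP style check: a card is only needed if the child's region was demoted.
bool needs_card_demotion_only(const region_plan& child)
{
    return child.demoted;
}

// SIP-aware check: a card is needed whenever the parent will end up in an
// older generation than the child, whether or not any region was demoted.
bool needs_card_sip(const region_plan& parent, const region_plan& child)
{
    assert(parent.plan_gen_num >= 0 && child.plan_gen_num >= 0);
    return child.plan_gen_num < parent.plan_gen_num;
}
```

In the scenario above, A's region is `{0, 2, false}` and B's is assumed to be `{1, 1, false}`: the demotion-only check reports no card, while the SIP-aware check reports that one is needed.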

  3. Rewrote the final region threading

A region in a condemned generation can end up in any generation.

Got rid of generation_plan_start_segment. I found keeping it to be counterproductive, especially with the introduction of SIP
regions, which made going through regions more complicated. Removing it also means we can return any empty region back to the free region pool.

In find_first_valid_region, we thread the FL for SIP regions onto their corresponding generation's FL.
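
As a rough illustration of threading a region's own FL onto a generation's FL, here is a sketch using a plain singly linked list. The types `free_item` and `free_list` are hypothetical; the GC's real allocator free lists are bucketed and considerably more involved.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical free list node and list; not the GC's allocator types.
struct free_item
{
    size_t size;
    free_item* next;
};

struct free_list
{
    free_item* head = nullptr;
    free_item* tail = nullptr;

    void push_back(free_item* item)
    {
        item->next = nullptr;
        if (tail)
            tail->next = item;
        else
            head = item;
        tail = item;
    }

    // Moves every item from 'other' onto the end of this list in O(1),
    // leaving 'other' empty. This mirrors deferring the threading of a
    // SIP region's free objects until the generation's FL is final.
    void append(free_list& other)
    {
        assert(&other != this);
        if (!other.head)
            return;
        if (tail)
            tail->next = other.head;
        else
            head = other.head;
        tail = other.tail;
        other.head = nullptr;
        other.tail = nullptr;
    }

    size_t total_size() const
    {
        size_t total = 0;
        for (const free_item* p = head; p; p = p->next)
            total += p->size;
        return total;
    }
};
```

Keeping the free objects on the region's own list until this point means nothing has to be undone on the generation's list if the GC later decides to sweep.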

--

Misc

  • Added some code to stress SIP but for the checkin I disabled it as making a lot of regions SIP would just make us run out of
    memory quickly.

REGIONS TODO: this can be improved, eg, don't make a region SIP unless it fits the SIP criteria during a full blocking GC so
we don't get premature OOM.

  • In process_remaining_regions, if a gen0 region that only contains pinned plugs has too much survived, we don't demote it.

  • Fixed a bug when setting the internal basic region's gen_num on UOH regions; this should be the logical gen num 2 instead of
    the physical gen num for these generations.

  • I've made gen_num in heap_segment a byte instead of an int.

REGIONS TODO: Should do the same with plan_gen_num. There are more optimizations we can do to shrink this data structure.

  • We always use MARK_LIST so got rid of the MARK_LIST define.

--

More REGIONS TODO

  • dd_survived_size includes padding but the promoted bytes we record don't. Should unify this.

  • I don't actually need survived_per_region for WKS; I can get rid of it for WKS so that the data structure is only allocated for SVR. I could actually make the decision for SIP regions as soon as we are done with marking, so I don't need to store survived and old_card_survived on heap_segment (it could, however, be nice for instrumentation purposes). But I'll leave this for another PR.

  • possible optimization for get_promoted_bytes (note that when we switch to only use the used part of our reserved range it means most regions we look at will be used).

===============

What I tested -

WKS:
corerun gcperfsim.dll -tc 2 -tagb 64 -tlgb 0.05 -lohar 0 -sohsi 10 -lohsi 0 -pohsi 0 -sohpi 0 -lohpi 0 -sohfi 0 -lohfi 0 -pohfi 0 -allocType reference -testKind time -printEveryNthIter 60000
corerun gcperfsim.dll -tc 2 -tagb 64 -tlgb 0.05 -lohar 100 -sohsi 10 -lohsi 100 -pohsi 0 -sohpi 0 -lohpi 0 -sohfi 0 -lohfi 0 -pohfi 0 -allocType reference -testKind time -printEveryNthIter 60000

SVR: same cmdline except -tc 4 -tagb 128
with
set complus_GCGen0MaxBudget=1000000

@ghost

ghost commented May 3, 2021

Tagging subscribers to this area: @dotnet/gc
See info in area-owners.md if you want to be subscribed.

Author: Maoni0
Assignees: -
Labels:

area-GC-coreclr

Milestone: -

@Maoni0
Member Author

Maoni0 commented May 4, 2021

the arm64 build error is due to known issue #48070

@Maoni0 Maoni0 merged commit 648407d into dotnet:main May 5, 2021
@karelz karelz added this to the 6.0.0 milestone May 20, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Jun 19, 2021