Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feat] Enable dynamic filter #879

Draft
wants to merge 45 commits into
base: main
Choose a base branch
from
Draft

[Feat] Enable dynamic filter #879

wants to merge 45 commits into from

Conversation

petar-qb
Copy link
Contributor

@petar-qb petar-qb commented Nov 15, 2024

Description

This PR introduces dynamic filters for the following selectors (so, all Vizro selectors except vm.DatePicker):
vm.Dropdown, vm.RadioItems, vm.Checklist, vm.Slider and vm.RangeSlider

You can test the feature by altering values in the scratch_dev/data.yaml (in the way that's described in the comment) and running the scratch_dev/app.py.

TODOs:

This PR TODOs:

This or next PR TODOs (we should decide it):

  • - Support dynamic for the vm.DatePicker. (Maybe it's better to wait for the dash persistence bugfix as persistence doesn't work for the vm.DatePicker even on the main branch)

Next PRs:

  • - Enhance the default selector.value handling for new users. There are a few cases marked with 🟠 in this Issue -> https://github.com/McK-Internal/vizro-internal/issues/1356. There are two inconsistencies listed at the bottom of the Issue description. This TODO points to the second inconsistency.
  • - Introduce "Universal Vizro placeholder component" -> https://github.com/McK-Internal/vizro-internal/issues/1307
  • - Fix the multi=False vm.Dropdown dynamic selector bug when the value is cleared. This will be solved when Vizro universal placeholder component become introduced.
  • - Enable dynamic filter to work with the empty data_frame.
  • - Propagate data_frame Parameter default values from the model_manager into the DM._multi_load() that's called from the vm.Filter.pre_build(). -> PoC can be found in the comment.
  • - Implement "Select ALL" for the multi=True categorical selectors.
  • - Implement "Select the entire range" for the multi=True numerical selectors.

References:

Open questions:

  1. How to enable dynamic Parameters? Should we allow, for example, the "options" property of a categorical parameter selector to be a function that dynamically calculates and returns new options?
  2. How to enable a dynamic filter to be targeted by any predefined action? For example the parameter_action target. Or like this: update_figures(targets=[”filter_1_id”])
  3. What exactly is a new way of dash persistence handling we expect (this fix will be implemented by us)?
  4. If we introduce the data_frame property for the vm.Filter component, and if the data_frame Parameters change the targeting form to data_manager_key.function_argument, does is mean that our dynamic Filters could be handled in a same way as any other dynamic figure components?

Notice

  • I acknowledge and agree that, by checking this box and clicking "Submit Pull Request":

    • I submit this contribution under the Apache 2.0 license and represent that I am entitled to do so on behalf of myself, my employer, or relevant third parties, as applicable.
    • I certify that (a) this contribution is my original creation and / or (b) to the extent it is not my original creation, I am authorized to submit this contribution on behalf of the original creator(s) or their licensees.
    • I certify that the use of this contribution as authorized by the Apache 2.0 license does not violate the intellectual property rights of anyone else.
    • I have not referenced individuals, products or companies in any commits, directly or indirectly.
    • I have not added data or restricted code in any commits, directly or indirectly.

antonymilne and others added 29 commits November 4, 2024 10:52
# Conflicts:
#	vizro-core/src/vizro/models/_controls/filter.py
…_manager to the Filter.pre_build data_manager _multi_load
@github-actions github-actions bot added the Vizro-AI 🤖 Issue/PR that addresses Vizro-AI package label Nov 15, 2024
Comment on lines 1 to 13
# Choose from 0-50
setosa: 5
versicolor: 10
virginica: 15

# Choose from: 4.8 to 7.4
min: 5
max: 7

# Choose from:
# 2020-01-01 to 2020-05-29
date_min: 2024-01-01
date_max: 2024-05-29
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adjusting these values, the input data is changed for the scratch/app.py example.

Copy link
Contributor Author

@petar-qb petar-qb Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By adjusting setosa, versicolor and virginica (from 0 to 50), example graphs that use the load_from_file_species will be affected (page_1, page_2, page_5, page_6).

By adjusting min, max (from 4.8 to 7.4), example graphs that use the load_from_file_sepal_length will be affected (page_3).

TODO: The following still doesn't work so enable page_4 and vm.DatePicker to support dynamic mode.
By adjusting date_min, date_max (from 2020-01-01 to 2020-05-29), example graphs that use the load_from_file_date_column will be affected (page_4).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love this method of testing, it's much better than just selecting random points and refreshing the page lots of times 💯

@@ -54,7 +54,7 @@ def validate_value(cls, value, values):
[entry["value"] for entry in values["options"]] if isinstance(values["options"][0], dict) else values["options"]
)

if value and not is_value_contained(value, possible_values):
if value and ALL_OPTION not in value and not is_value_contained(value, possible_values):
Copy link
Contributor Author

@petar-qb petar-qb Nov 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The possible_values here is always a self.options prop that does not contain "ALL", and this validation function is invoked from the _build_dynamic_placeholder method. Previously self.value was not set anywhere from code, so this didn't fail.

Comment on lines 221 to 223
# TODO: Align inner and outer ids to be handled in the same way as for other figure components.
selector_build_obj = self.selector.build()
return dcc.Loading(id=self.id, children=selector_build_obj) if self._dynamic else selector_build_obj
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the inner/outer id mechanism here aligned with how it's done for other Vizro figure components?

Comment on lines 221 to 223
# TODO: Align inner and outer ids to be handled in the same way as for other figure components.
selector_build_obj = self.selector.build()
return dcc.Loading(id=self.id, children=selector_build_obj) if self._dynamic else selector_build_obj
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any UI improvements we should introduce until the "Universal Vizro placeholder component" comes?

@petar-qb petar-qb changed the title Feat/dynamic filter [Feat] Enable dynamic filter Nov 18, 2024
Comment on lines 155 to 157
multi_data_source_name_load_kwargs: list[tuple[DataSourceName, dict[str, Any]]] = [
(model_manager[target]["data_frame"], {}) for target in proposed_targets
]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible to send default DFP values from the model_manager into the data_manager._multi_load() with the following code:
(Disclaimer: The code is just a PoC and could be optimised a lot)

        from vizro.models._controls import Parameter

        multi_data_source_name_load_kwargs: list[tuple[DataSourceName, dict[str, Any]]] = []

        # One tuple per filter.target
        # [
        #     ('data_1', {'arg_1': 1, 'arg_2': 2,}),
        #     ('data_2', {'X': "ASD"}),
        #     ('data_2', {'X': "qwe"}),
        # ]

        # TODO-NEXT: The code below is just a PoC and could be improved a lot.
        page_obj = model_manager[model_manager._get_model_page_id(model_id=ModelID(str(self.id)))]
        for target in proposed_targets:
            data_source_name = model_manager[target]["data_frame"]
            load_kwargs = {}

            for page_parameter in page_obj.controls:
                if isinstance(page_parameter, Parameter):
                    for parameter_targets in page_parameter.targets:
                        if parameter_targets.startswith(f'{target}.data_frame'):
                            argument = parameter_targets.split('.')[2]
                            # argument is explicitly defined
                            if parameter_value := getattr(page_parameter.selector, 'value', None):
                                load_kwargs[argument] = parameter_value
                            # find default value
                            else:
                                parameter_selector = page_parameter.selector
                                default_parameter_value = None
                                if isinstance(parameter_selector, Dropdown):
                                    default_parameter_value = parameter_selector.options if parameter_selector.multi else parameter_selector.options[0]
                                elif isinstance(parameter_selector, Checklist):
                                    default_parameter_value = parameter_selector.options
                                elif isinstance(parameter_selector, RadioItems):
                                    default_parameter_value = parameter_selector.options[0]
                                elif isinstance(parameter_selector, Slider):
                                    default_parameter_value = parameter_selector.min
                                elif isinstance(parameter_selector, RangeSlider):
                                    default_parameter_value = [parameter_selector.min, parameter_selector.max]

                                load_kwargs[argument] = default_parameter_value

            multi_data_source_name_load_kwargs.append((data_source_name, load_kwargs))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for posting this here! Just to add my latest thoughts and FYI @maxschulz-COL.

Currently dynamic data functions require a default value for every argument. Even when there is a dataframe parameter, the default value is used when pre-build the filter e.g. to find the targets, column type (and hence selector) and initial values.

There are three possible solutions here:

  1. leave this "default value must be supplied" restriction so arguments passed to dynamic data function (through _multi_load) are just {}
  2. lift the restriction and take values used from the model_manager to pass through to _multi_load. This should always work because a parameter always has a default value even if it's not explicitly specified by the user. It introduces more coupling with the model manager though.
  3. lift the restriction and just don't prebuild anything much for a dynamic filter. This is simplest/best but probably doesn't work at the moment because you need to know targets upfront to build actions etc. And you can't obtain targets without loading data, which requires some values for the parametrised dynamic data loading functions.

For now we're going for option 1 but may be revisited in future, especially after we've changed how model_manager works.

@petar-qb please could you put a TODO (no NEXT needed) in here just summarising and referring to this github comment? It can replace the existing TODO NEXT comment. Make sure that the reminder about static data needing {} remains though 🙏

Copy link
Contributor

@antonymilne antonymilne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome work! 💯 As I said yesterday, the code changes here don't do credit to the many hours of understanding required to get this all working.

I've left quite a few initial comments and will then do another review. I think all is on track for approval though 👍 Let me know if you want to discuss any of the comments in a call or over slack and we can do so before Thursday.

Let me look through your questions from the PR description and answer them separately.

vizro-core/examples/scratch_dev/app.py Outdated Show resolved Hide resolved
vizro-core/examples/scratch_dev/app.py Outdated Show resolved Hide resolved
vizro-core/examples/scratch_dev/app.py Outdated Show resolved Hide resolved
vizro-core/examples/scratch_dev/app.py Show resolved Hide resolved
Comment on lines 1 to 13
# Choose from 0-50
setosa: 5
versicolor: 10
virginica: 15

# Choose from: 4.8 to 7.4
min: 5
max: 7

# Choose from:
# 2020-01-01 to 2020-05-29
date_min: 2024-01-01
date_max: 2024-05-29
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love this method of testing, it's much better than just selecting random points and refreshing the page lots of times 💯

Comment on lines 54 to 55
def _build_static(self, new_options=None, **kwargs):
options = new_options if new_options else self.options
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless there's good reason to not do it this way let's make these arguments non-optional and just supply them all the time. Hiding the defaulting behaviour inside the _build_static function is more confusing I think.

Suggested change
def _build_static(self, new_options=None, **kwargs):
options = new_options if new_options else self.options
def _build_static(self, options):

Similar comment across all selectors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is definitely a better approach. I did it for the options, min and max.

However, I'm still propagating the current_value to the numerical selectors only to support dcc.Input and dcc.Store components to work properly with the persistence in numerical selectors when they work in the dynamic mode. The current_value propagation can be removed when the dash persistence bug is fixed. (I added this as a comment in the filter.py where the current_value is propagated to the selector.__call__())

vizro-core/src/vizro/models/_components/form/checklist.py Outdated Show resolved Hide resolved

def _build_static(self, new_options=None, **kwargs):
options = new_options if new_options else self.options
full_options, default_value = get_options_and_default(options=options, multi=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it right that we might change the value supplied here? Is this the case where it only works because we're lucky and we have the ALL option?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, but I don't get you..
The get_options_and_default retrieves options without the "ALL" and returns the options with the "ALL" option (if the selector is multi=True). It also returns "ALL" as the default_value for the multi=True, and options[0] for the multi=False selectors.

@@ -62,3 +67,15 @@ def build(self):
),
]
)

def _build_dynamic_placeholder(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to myself to read through this again to understand it.

Comment on lines 66 to 69
def _build_static(self, current_value=None, new_min=None, new_max=None, **kwargs):
_min = new_min if new_min else self.min
_max = new_max if new_max else self.max
init_value = current_value or self.value or _min
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remind me what's happening here? Why do we need current_value here and not just the min and max?

Also, following comment elsewhere let's do something like this unless there's good reason to not do so:

Suggested change
def _build_static(self, current_value=None, new_min=None, new_max=None, **kwargs):
_min = new_min if new_min else self.min
_max = new_max if new_max else self.max
init_value = current_value or self.value or _min
def _build_static(self, value, min, max, **kwargs):
init_value = value or min

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it's described in this thread, the current_value is here to enable persistence to work properly with the slider and range_slider.

These two selectors have a specific way of handling the persistence due to "two values bound problem". The current_value propagation can be removed when the dash bug is fixed.
I'll also write down all the potential improvements that will be possible after the dash persistence bugfix 😃

@antonymilne
Copy link
Contributor

antonymilne commented Nov 19, 2024

This or next PR TODOs (we should decide it):

  • - Support dynamic for the vm.DatePicker. (Maybe it's better to wait for the dash persistence bugfix as persistence doesn't work for the vm.DatePicker even on the main branch)

Definitely not urgent. Let's wait for this and fix plotly/dash#2678 first.

Next PRs:

How many of these will still be problems once we fix plotly/dash#2678?

  • - Enable dynamic filter to work with the empty data_frame.

Good idea - let's discuss how this works now and how it should work.

  • - Propagate data_frame Parameter default values from the model_manager into the DM._multi_load() that's called from the vm.Filter.pre_build(). -> PoC can be found in the comment.

Added to the comment there but basically let's leave as a TODO and open a ticket. Not urgent.

  • - Implement "Select ALL" for the multi=True categorical selectors.

Would love to do this soon (like by end of year if possible).

  • - Implement "Select the entire range" for the multi=True numerical selectors.

Not as high priority - let's make a ticket and forget for now.

Open questions:

  1. How to enable dynamic Parameters? Should we allow, for example, the "options" property of a categorical parameter selector to be a function that dynamically calculates and returns new options?

Good question but I think not super high priority unless we have an immediate use for it.

  1. How to enable a dynamic filter to be targeted by any predefined action? For example the parameter_action target. Or like this: update_figures(targets=[”filter_1_id”])

Very good question and presumably required for us to get DFPs to update filters without reloading the page? So this one needs to be worked out soon.

  1. What exactly is a new way of dash persistence handling we expect (this fix will be implemented by us)?

Let's discuss but I think basically the "common sense" way that we initially expected things to work.

  1. If we introduce the data_frame property for the vm.Filter component, and if the data_frame Parameters change the targeting form to data_manager_key.function_argument, does is mean that our dynamic Filters could be handled in a same way as any other dynamic figure components?

Not sure I understand this, please could you explain?


Basically I think roughly ordered priorities are:

  1. Finish this PR
  2. Fix Component properties set through dynamic callbacks cannot be persisted plotly/dash#2678
  3. Tidy our code, see what remaining cases remain to fix
  4. DFP to update filters without reloading page
  5. @antonymilne completes some current work coming from [Tidy] Convert actions to classes #363 on class-based action and refactoring
  6. Implement "Select ALL" for the multi=True categorical selectors. https://github.com/McK-Internal/vizro-internal/issues/1342#issue-2618580976

wdyt?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Vizro-AI 🤖 Issue/PR that addresses Vizro-AI package
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants