diff --git a/vizro-core/changelog.d/20241120_154345_antony.milne_dynamic_filter.md b/vizro-core/changelog.d/20241120_154345_antony.milne_dynamic_filter.md new file mode 100644 index 000000000..f7981a8b4 --- /dev/null +++ b/vizro-core/changelog.d/20241120_154345_antony.milne_dynamic_filter.md @@ -0,0 +1,46 @@ + + +### Highlights ✨ + +- Filters update automatically when underlying dynamic data changes. See the [user guide on dynamic filters](https://vizro.readthedocs.io/en/stable/pages/user-guides/data/#filters) for more information. ([#879](https://github.com/mckinsey/vizro/pull/879)) + + + + + + + diff --git a/vizro-core/docs/pages/user-guides/data.md b/vizro-core/docs/pages/user-guides/data.md index b4fdfeda1..ffd53f2fb 100644 --- a/vizro-core/docs/pages/user-guides/data.md +++ b/vizro-core/docs/pages/user-guides/data.md @@ -179,7 +179,7 @@ Since dynamic data sources must always be added to the data manager and referenc ### Configure cache -By default, each time the dashboard is refreshed a dynamic data function executes again. In fact, if there are multiple graphs on the same page using the same dynamic data source then the loading function executes _multiple_ times, once for each graph on the page. Hence, if loading your data is a slow operation, your dashboard performance may suffer. +By default, a dynamic data function executes every time the dashboard is refreshed. Data loading is batched so that a dynamic data function that supplies multiple graphs on the same page only executes _once_ on page refresh. Even with this batching, if loading your data is still a slow operation, your dashboard performance may suffer. The Vizro data manager has a server-side caching mechanism to help solve this. Vizro's cache uses [Flask-Caching](https://flask-caching.readthedocs.io/en/latest/), which supports a number of possible cache backends and [configuration options](https://flask-caching.readthedocs.io/en/latest/#configuring-flask-caching). 
By default, the cache is turned off. @@ -220,7 +220,7 @@ By default, when caching is turned on, dynamic data is cached in the data manage If you would like to alter some options, such as the default cache timeout, then you can specify a different cache configuration: -```py title="Simple cache with timeout set to 10 minutes" +```python title="Simple cache with timeout set to 10 minutes" data_manager.cache = Cache(config={"CACHE_TYPE": "SimpleCache", "CACHE_DEFAULT_TIMEOUT": 600}) ``` @@ -268,8 +268,12 @@ data_manager["no_expire_data"].timeout = 0 ### Parametrize data loading -You can supply arguments to your dynamic data loading function that can be modified from the dashboard. -For example, if you are handling big data then you can use an argument to specify the number of entries or size of chunk of data. +You can give arguments to your dynamic data loading function that can be modified from the dashboard. For example: + +- To load different versions of the same data. +- To handle big data you can use an argument that controls the amount of data that is loaded. This effectively pre-filters data before it reaches the Vizro dashboard. + +In general, a parametrized dynamic data source should always return a pandas DataFrame with a fixed schema (column names and types). This ensures that page components and controls continue to work as expected when the parameter is changed on screen. To add a parameter to control a dynamic data source, do the following: @@ -277,7 +281,7 @@ To add a parameter to control a dynamic data source, do the following: 2. give an `id` to all components that have the data source you wish to alter through a parameter. 3. [add a parameter](parameters.md) with `targets` of the form `.data_frame.` and a suitable [selector](selectors.md). 
-For example, let us extend the [dynamic data example](#dynamic-data) above to show how the `load_iris_data` can take an argument `number_of_points` controlled from the dashboard with a [`Slider`][vizro.models.Slider]. +For example, let us extend the [dynamic data example](#dynamic-data) above into a toy example of how parametrized dynamic data works. The `load_iris_data` function takes an argument `number_of_points` controlled from the dashboard with a [`Slider`][vizro.models.Slider]. !!! example "Parametrized dynamic data" === "app.py" @@ -333,14 +337,78 @@ Parametrized data loading is compatible with [caching](#configure-cache). The ca You cannot pass [nested parameters](parameters.md#nested-parameters) to dynamic data. You can only target the top-level arguments of the data loading function, not the nested keys in a dictionary. -### Filter update limitation +### Filters + +When a [filter](filters.md) depends on dynamic data and the available values of its selector (`options` for a categorical selector, `min` and `max` for a numerical selector) are not explicitly specified in the `vm.Filter` model, the available selector values update on page refresh to reflect the latest dynamic data. This is called a _dynamic filter_. + +The mechanism for dynamic filters, including caching, works exactly as it does for other non-control components such as `vm.Graph`. However, unlike such components, a filter can depend on multiple data sources. If at least one data source of the components in the filter's `targets` is dynamic then the filter is dynamic. Remember that when `targets` is not explicitly specified, a filter applies to all the components on a page that use a DataFrame including `column`. + +When the page is refreshed, the behaviour of a dynamic filter is as follows: + +- The filter's selector updates its available values: + - For [categorical selectors](selectors.md#categorical-selectors), `options` updates to give all unique values found in `column` across all the data sources of components in `targets`.
+ - For [numerical selectors](selectors.md#numerical-selectors), `min` and `max` update to give the overall minimum and maximum values found in `column` across all the data sources of components in `targets`. +- The value selected on screen by a dashboard user _does not_ change. If the selected value is not present in the new set of available values then it is still selected, but the filtering operation might result in an empty DataFrame. +- Even though the values present in a data source can change, the schema should not: `column` should remain present and of the same type in the data sources. The `targets` of the filter and selector type cannot change while the dashboard is running. For example, a `vm.Dropdown` selector cannot turn into `vm.RadioItems`. + +For example, let us add two filters to the [dynamic data example](#dynamic-data) above: + +!!! example "Dynamic filters" + + ```py hl_lines="10 20 21" + from vizro import Vizro + import pandas as pd + import vizro.plotly.express as px + import vizro.models as vm + + from vizro.managers import data_manager + + def load_iris_data(): + iris = pd.read_csv("iris.csv") + return iris.sample(5) # (1)! + + data_manager["iris"] = load_iris_data -If your dashboard includes a [filter](filters.md) then the values shown on a filter's [selector](selectors.md) _do not_ update while the dashboard is running. This is a known limitation that will be lifted in future releases, but if is problematic for you already then [raise an issue on our GitHub repo](https://github.com/mckinsey/vizro/issues/). + page = vm.Page( + title="Update the chart and filters on page refresh", + components=[ + vm.Graph(figure=px.box("iris", x="species", y="petal_width", color="species")) + ], + controls=[ + vm.Filter(column="species"), # (2)! + vm.Filter(column="sepal_length"), # (3)! + ], + ) -This limitation is why all arguments of your dynamic data loading function must have a default value. 
Regardless of the value of the `vm.Parameter` selected in the dashboard, these default parameter values are used when the `vm.Filter` is built. This determines the type of selector used in a filter and the options shown, which cannot currently be changed while the dashboard is running. + dashboard = vm.Dashboard(pages=[page]) -Although a selector is automatically chosen for you in a filter when your dashboard is built, remember that [you can change this choice](filters.md#changing-selectors). For example, we could ensure that a dropdown always contains the options "setosa", "versicolor" and "virginica" by explicitly specifying your filter as follows. + Vizro().build(dashboard).run() + ``` -```py -vm.Filter(column="species", selector=vm.Dropdown(options=["setosa", "versicolor", "virginica"]) + 1. We sample only 5 rather than 50 points so that changes to the available values in the filtered columns are more apparent when the page is refreshed. + 2. This filter implicitly controls the dynamic data source `"iris"`, which supplies the `data_frame` to the targeted `vm.Graph`. On page refresh, Vizro reloads this data, finds all the unique values in the `"species"` column and sets the categorical selector's `options` accordingly. + 3. Similarly, on page refresh, Vizro finds the minimum and maximum values of the `"sepal_length"` column in the reloaded data and sets new `min` and `max` values for the numerical selector accordingly. + +If you have a filter that depends on dynamic data but do not want the available values to change when the dynamic data changes then you should manually specify the `selector`'s `options` field (categorical selector) or `min` and `max` fields (numerical selector). 
In the above example, this could be achieved as follows: + +```python title="Override selector options to make a dynamic filter static" +controls = [ + vm.Filter(column="species", selector=vm.Dropdown(options=["setosa", "versicolor", "virginica"])), + vm.Filter(column="sepal_length", selector=vm.RangeSlider(min=4.3, max=7.9)), +] +``` + +If you [use a specific selector](filters.md#change-selector) for a dynamic filter without manually specifying `options` (categorical selector) or `min` and `max` (numerical selector) then the selector remains dynamic. For example: + +```python title="Dynamic filter with specific selector is still dynamic" +controls = [ + vm.Filter(column="species", selector=vm.Checklist()), + vm.Filter(column="sepal_length", selector=vm.Slider()), +] ``` + +When Vizro initially builds a filter that depends on parametrized dynamic data loading, data is loaded using the default argument values. This data is used to perform initial validation, check which data sources contain the specified `column` (unless `targets` is explicitly specified) and determine the type of selector to use (unless `selector` is explicitly specified). + +!!! note + + When a dynamic data parameter is changed on screen, the data underlying a dynamic filter can change. Currently this change affects page components such as `vm.Graph` but does not affect the available values shown in a dynamic filter, which only update on page refresh. This functionality will be coming soon! diff --git a/vizro-core/docs/pages/user-guides/filters.md b/vizro-core/docs/pages/user-guides/filters.md index bad355fab..0680c0bca 100644 --- a/vizro-core/docs/pages/user-guides/filters.md +++ b/vizro-core/docs/pages/user-guides/filters.md @@ -3,8 +3,9 @@ This guide shows you how to add filters to your dashboard. One main way to interact with the charts/components on your page is by filtering the underlying data. 
A filter selects a subset of rows of a component's underlying DataFrame which alters the appearance of that component on the page. The [`Page`][vizro.models.Page] model accepts the `controls` argument, where you can enter a [`Filter`][vizro.models.Filter] model. -This model enables the automatic creation of [selectors](../user-guides/selectors.md) (such as Dropdown, RadioItems, Slider, ...) that operate upon the charts/components on the screen. +This model enables the automatic creation of [selectors](selectors.md) (such as `Dropdown`, `RadioItems`, `Slider`, ...) that operate on the charts/components on the screen. +By default, filters that control components with [dynamic data](data.md#dynamic-data) are [dynamically updated](data.md#filters) when the underlying data changes while the dashboard is running. ## Basic filters @@ -13,8 +14,7 @@ To add a filter to your page, do the following: 1. add the [`Filter`][vizro.models.Filter] model into the `controls` argument of the [`Page`][vizro.models.Page] model 2. configure the `column` argument, which denotes the target column to be filtered -By default, all components on a page with such a `column` present will be filtered. The selector type will be chosen -automatically based on the target column, for example, a dropdown for categorical data, a range slider for numerical data, or a date picker for temporal data. +You can also set `targets` to specify which components on the page should be affected by the filter. If this is not explicitly set then `targets` defaults to all components on the page whose data source includes `column`. !!! 
example "Basic Filter" === "app.py" @@ -63,12 +63,83 @@ automatically based on the target column, for example, a dropdown for categorica [Filter]: ../../assets/user_guides/control/control1.png -## Changing selectors +The selector is configured automatically based on the data type of the target column as follows: + + - Categorical data uses [`vm.Dropdown(multi=True)`][vizro.models.Dropdown] where `options` is the set of unique values found in `column` across all the data sources of components in `targets`. + - [Numerical data](https://pandas.pydata.org/docs/reference/api/pandas.api.types.is_numeric_dtype.html) uses [`vm.RangeSlider`][vizro.models.RangeSlider] where `min` and `max` are the overall minimum and maximum values found in `column` across all the data sources of components in `targets`. + - [Temporal data](https://pandas.pydata.org/docs/reference/api/pandas.api.types.is_datetime64_any_dtype.html) uses [`vm.DatePicker(range=True)`][vizro.models.DatePicker] where `min` and `max` are the overall minimum and maximum values found in `column` across all the data sources of components in `targets`. A column can be converted to this type with [pandas.to_datetime](https://pandas.pydata.org/docs/reference/api/pandas.to_datetime.html). + +Below is an example demonstrating these default selector types. + +!!!
example "Default Filter selectors" + === "app.py" + ```{.python pycafe-link} + import pandas as pd + from vizro import Vizro + import vizro.plotly.express as px + import vizro.models as vm + + df_stocks = px.data.stocks(datetimes=True) + + df_stocks_long = pd.melt( + df_stocks, + id_vars='date', + value_vars=['GOOG', 'AAPL', 'AMZN', 'FB', 'NFLX', 'MSFT'], + var_name='stocks', + value_name='value' + ) + + df_stocks_long['value'] = df_stocks_long['value'].round(3) + + page = vm.Page( + title="My first page", + components=[ + vm.Graph(figure=px.line(df_stocks_long, x="date", y="value", color="stocks")), + ], + controls=[ + vm.Filter(column="stocks"), + vm.Filter(column="value"), + vm.Filter(column="date"), + ], + ) + + dashboard = vm.Dashboard(pages=[page]) + + Vizro().build(dashboard).run() + ``` + === "app.yaml" + ```yaml + # Still requires a .py to add data to the data manager and parse YAML configuration + # See yaml_version example + pages: + - components: + - figure: + _target_: line + data_frame: df_stocks_long + x: date + y: value + color: stocks + type: graph + controls: + - column: stocks + type: filter + - column: value + type: filter + - column: date + type: filter + title: My first page + ``` + === "Result" + [![Filter]][Filter] + + [Filter]: ../../assets/user_guides/selectors/default_filter_selectors.png + +## Change selector If you want to have a different selector for your filter, you can give the `selector` argument of the [`Filter`][vizro.models.Filter] a different selector model. Currently available selectors are [`Checklist`][vizro.models.Checklist], [`Dropdown`][vizro.models.Dropdown], [`RadioItems`][vizro.models.RadioItems], [`RangeSlider`][vizro.models.RangeSlider], [`Slider`][vizro.models.Slider], and [`DatePicker`][vizro.models.DatePicker]. -!!! example "Filter with custom Selector" +!!! 
example "Filter with different selector" === "app.py" ```{.python pycafe-link} from vizro import Vizro @@ -118,11 +189,10 @@ Currently available selectors are [`Checklist`][vizro.models.Checklist], [`Dropd ## Further customization -For further customizations, you can always refer to the [`Filter`][vizro.models.Filter] reference. Some popular choices are: +For further customizations, you can always refer to the [`Filter` model][vizro.models.Filter] reference and the [guide to selectors](selectors.md). Some popular choices are: - select which component the filter will apply to by using `targets` -- select what the target column type is, hence choosing the default selector by using `column_type` -- choose options of lower level components, such as the `selector` models +- specify the configuration of the `selector`, for example `multi` to switch between a multi-option and single-option selector, `options` for a categorical filter or `min` and `max` for a numerical filter Below is an advanced example where we only target one page component, and where we further customize the chosen `selector`. @@ -142,7 +212,7 @@ Below is an advanced example where we only target one page component, and where vm.Graph(figure=px.scatter(iris, x="petal_length", y="sepal_width", color="species")), ], controls=[ - vm.Filter(column="petal_length",targets=["scatter_chart"],selector=vm.RangeSlider(step=1)), + vm.Filter(column="petal_length", targets=["scatter_chart"], selector=vm.RangeSlider(step=1)), ], ) @@ -186,3 +256,5 @@ Below is an advanced example where we only target one page component, and where [![Advanced]][Advanced] [Advanced]: ../../assets/user_guides/control/control3.png + +To further customize selectors, see our [how-to guide on creating custom components](custom-components.md).
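The default selector rules added to `filters.md` above can be sketched in plain pandas. This is only an illustration of the documented mapping from column type to selector, not Vizro's implementation: `default_selector_name` is a hypothetical helper, and the order of the checks is an assumption.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype, is_numeric_dtype


def default_selector_name(column: pd.Series) -> str:
    """Name the default selector for a column (hypothetical helper, not Vizro API)."""
    # Assumption: the temporal check runs first. For plain pandas dtypes the
    # order does not matter, since datetime64 columns are not numeric.
    if is_datetime64_any_dtype(column):
        return "DatePicker"
    if is_numeric_dtype(column):
        return "RangeSlider"
    # Everything else is treated as categorical.
    return "Dropdown"


# Small stand-in for the long-format stocks data used in the docs example.
df = pd.DataFrame(
    {
        "stocks": ["GOOG", "AAPL"],
        "value": [1.0, 1.018],
        "date": pd.to_datetime(["2018-01-01", "2018-01-08"]),
    }
)
selectors = {name: default_selector_name(df[name]) for name in df.columns}
```

With this toy frame, `stocks` maps to a dropdown, `value` to a range slider and `date` to a date picker, matching the three `vm.Filter` controls in the "Default Filter selectors" example.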
diff --git a/vizro-core/docs/pages/user-guides/selectors.md b/vizro-core/docs/pages/user-guides/selectors.md index 944515ab6..10481d2ba 100644 --- a/vizro-core/docs/pages/user-guides/selectors.md +++ b/vizro-core/docs/pages/user-guides/selectors.md @@ -53,92 +53,3 @@ For more information, refer to the API reference of the selector, or the documen When the [`DatePicker`][vizro.models.DatePicker] is configured with `range=True` (the default), the underlying component is `dmc.DateRangePicker`. When `range=False` the underlying component is `dmc.DatePicker`. When configuring the [`DatePicker`][vizro.models.DatePicker] make sure to provide your dates for `min`, `max` and `value` arguments in `"yyyy-mm-dd"` format or as `datetime` type (for example, `datetime.datetime(2024, 01, 01)`). - -## Default selectors - -If you don't specify a selector, a default selector is applied based on the data type of the provided column. - -Default selectors for: - - - categorical data: [`Dropdown`][vizro.models.Dropdown] - - numerical data: [`RangeSlider`][vizro.models.RangeSlider] - - temporal data: [`DatePicker(range=True)`][vizro.models.DatePicker] - -Categorical selectors can be used independently of the data type of the column being filtered. - -To use numerical [`Filter`][vizro.models.Filter] selectors, the filtered column must be of `numeric` format, -indicating that [pandas.api.types.is_numeric_dtype()](https://pandas.pydata.org/docs/reference/api/pandas.api.types.is_numeric_dtype.html) must return `True` for the filtered column. - -To use temporal [`Filter`][vizro.models.Filter] selectors, the filtered column must be of `datetime` format, -indicating that [pandas.api.types.is_datetime64_any_dtype()](https://pandas.pydata.org/docs/reference/api/pandas.api.types.is_datetime64_any_dtype.html) must return `True` for the filtered column. 
- -`pd.DataFrame` column types can be changed to `datetime` using [pandas.to_datetime()](https://pandas.pydata.org/docs/reference/api/pandas.to_datetime.html) or - - -### Example of default Filter selectors - -!!! example "Default Filter selectors" - === "app.py" - ```{.python pycafe-link} - import pandas as pd - from vizro import Vizro - import vizro.plotly.express as px - import vizro.models as vm - - df_stocks = px.data.stocks(datetimes=True) - - df_stocks_long = pd.melt( - df_stocks, - id_vars='date', - value_vars=['GOOG', 'AAPL', 'AMZN', 'FB', 'NFLX', 'MSFT'], - var_name='stocks', - value_name='value' - ) - - df_stocks_long['value'] = df_stocks_long['value'].round(3) - - page = vm.Page( - title="My first page", - components=[ - vm.Graph(figure=px.line(df_stocks_long, x="date", y="value", color="stocks")), - ], - controls=[ - vm.Filter(column="stocks"), - vm.Filter(column="value"), - vm.Filter(column="date"), - ], - ) - - dashboard = vm.Dashboard(pages=[page]) - - Vizro().build(dashboard).run() - ``` - === "app.yaml" - ```yaml - # Still requires a .py to add data to the data manager and parse YAML configuration - # See yaml_version example - pages: - - components: - - figure: - _target_: line - data_frame: df_stocks_long - x: date - y: value - color: stocks - type: graph - controls: - - column: stocks - type: filter - - column: value - type: filter - - column: date - type: filter - title: My first page - ``` - === "Result" - [![Filter]][Filter] - - [Filter]: ../../assets/user_guides/selectors/default_filter_selectors.png - - -To enhance existing selectors, see our [how-to-guide on creating custom components](custom-components.md).
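The dynamic filter behaviour described in the `data.md` changes can be sketched in plain pandas. This is a rough illustration under stated assumptions, not Vizro's code: `categorical_options` and `numerical_bounds` are hypothetical helpers, and sorting the options is a choice made here for stable output.

```python
import pandas as pd


def categorical_options(frames: list[pd.DataFrame], column: str) -> list:
    """Unique values of `column` across all target data sources (hypothetical helper)."""
    values = pd.concat([frame[column] for frame in frames], ignore_index=True)
    # Sorting is an assumption made for readable, deterministic output.
    return sorted(values.unique().tolist())


def numerical_bounds(frames: list[pd.DataFrame], column: str) -> tuple[float, float]:
    """Overall minimum and maximum of `column` across all target data sources."""
    values = pd.concat([frame[column] for frame in frames], ignore_index=True)
    return float(values.min()), float(values.max())


# Two successive loads of a dynamic data source, standing in for page refreshes.
first_load = pd.DataFrame({"species": ["setosa", "setosa"], "sepal_length": [5.1, 4.9]})
second_load = pd.DataFrame({"species": ["virginica", "versicolor"], "sepal_length": [6.3, 7.0]})

options_before = categorical_options([first_load], "species")
options_after = categorical_options([second_load], "species")

# A filter whose targets use several data sources combines all of them.
combined_bounds = numerical_bounds([first_load, second_load], "sepal_length")
```

The available options change from one load to the next, while a value already selected on screen would stay selected, as the docs note. Pinning `options` or `min` and `max` on the selector skips this recomputation altogether.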