Releases: Substra/substrafl
0.37.0rc1
0.37.0rc1 - 2023-06-12
Added
- ComputePlanBuilder base class to define which methods need to be implemented for a custom strategy in SubstraFL. These methods are build_compute_plan, load_local_states and save_local_states. (#120)
- Check and test on string used as metric name in test data nodes (#122).
- Add default exclusion patterns when copying files to avoid creating large Docker images (#118)
- Add the possibility to force the Dependency editable_mode through the environment variable SUBSTRA_FORCE_EDITABLE_MODE (#131)
- Check on the Python version used before generating the Dockerfile (#123).
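The exclusion patterns mentioned above can be illustrated with the standard library alone. The sketch below is not SubstraFL's implementation: the pattern list and the helper names copy_without_excluded and demo are hypothetical, chosen only to show how skipping cache and VCS folders keeps a copied tree small.

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical default patterns; the list SubstraFL really uses may differ.
DEFAULT_EXCLUDED = ("__pycache__", "*.pyc", ".git", ".venv")


def copy_without_excluded(src: Path, dest: Path, excluded=DEFAULT_EXCLUDED) -> Path:
    """Copy src into dest, skipping every entry whose name matches a pattern."""
    shutil.copytree(src, dest, ignore=shutil.ignore_patterns(*excluded))
    return dest


def demo() -> list:
    """Build a small tree, copy it, and return the relative paths that were copied."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "src"
        (src / "__pycache__").mkdir(parents=True)
        (src / "__pycache__" / "algo.cpython-310.pyc").write_bytes(b"")
        (src / "algo.py").write_text("print('hello')\n")
        dest = copy_without_excluded(src, Path(tmp) / "dest")
        return sorted(p.relative_to(dest).as_posix() for p in dest.rglob("*"))
```

Calling demo() copies only algo.py; the __pycache__ folder and its contents are left behind.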
Changed
- BREAKING: deprecate the model_loading.download_algo_files and model_loading.load_algo functions. New utility functions are now available (#125):
  - model_loading.download_algo_state to download a SubstraFL algo of a given round or rank.
  - model_loading.download_shared_state to download a SubstraFL shared object of a given round or rank.
  - model_loading.download_aggregated_state to download a SubstraFL aggregated state of a given round or rank.

The API changes from:

    algo_files_folder = str(pathlib.Path.cwd() / "tmp" / "algo_files")

    download_algo_files(
        client=client_to_download_from,
        compute_plan_key=compute_plan.key,
        round_idx=round_idx,
        dest_folder=algo_files_folder,
    )

    model = load_algo(input_folder=algo_files_folder).model

to:

    algo = download_algo_state(
        client=client_to_download_from,
        compute_plan_key=compute_plan.key,
        round_idx=round_idx,
    )

    model = algo.model

- BREAKING: rename build_graph to build_compute_plan. (#120)
- BREAKING: move schemas.py into the strategies module. (#120)

    from substrafl.schemas import FedAvgSharedState
    # Becomes
    from substrafl.strategies.schemas import FedAvgSharedState

- Change the way function files are copied (#118)
- download_train_task_models_by_rank uses the new function list_task_output_assets instead of value, which has been removed (#129)
- Python dependencies are resolved using pip compile during function registration (#123).
- BREAKING: local_dependencies is renamed local_installable_dependencies (#123).
- BREAKING: local_installable_dependencies are now limited to local modules or Python wheels (no support for bdist, sdist...) (#123).
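To make the restriction on local_installable_dependencies concrete, here is a minimal sketch of the kind of check it implies. It is an illustration only: check_local_dependency is a hypothetical helper, and the assumption that a local module is a directory and a wheel is a file ending in .whl is ours, not taken from SubstraFL's code.

```python
from pathlib import Path


def check_local_dependency(path: Path) -> str:
    """Classify a dependency path as 'module' or 'wheel', else raise ValueError.

    Hypothetical rule: a directory counts as a local module, a file name
    ending in .whl counts as a wheel, and anything else (sdist archives,
    eggs, ...) is rejected.
    """
    if path.is_dir():
        return "module"
    if path.suffix == ".whl":
        return "wheel"
    raise ValueError(f"Unsupported local dependency: {path}")
```

With this rule, Path("dist/mylib-1.0-py3-none-any.whl") is accepted as a wheel, while a source archive such as Path("dist/mylib-1.0.tar.gz") raises ValueError.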
Fixed
- New dependencies copy method in Docker mode.(#130)
0.36.0
Fixed
- Close issue #114. Batch sizes larger than the number of samples are now set to the number of samples in predict for NR and FedPCA. (#115)
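The fix above amounts to clamping the batch size before iterating over the data. A minimal sketch of that idea, independent of SubstraFL (predict_batches is a hypothetical helper name, and the real predict methods may organise batching differently):

```python
def predict_batches(n_samples: int, batch_size: int) -> list:
    """Return (start, stop) index pairs covering n_samples.

    A batch size larger than the dataset is clamped to n_samples,
    so a single batch covers everything.
    """
    if n_samples == 0:
        return []
    batch_size = min(batch_size, n_samples)
    return [
        (start, min(start + batch_size, n_samples))
        for start in range(0, n_samples, batch_size)
    ]
```

For 10 samples, a requested batch size of 32 yields the single batch (0, 10), while a batch size of 4 yields (0, 4), (4, 8) and (8, 10).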
Changed
- BREAKING: Metrics are now given as metric_functions and not as metric_keys. The functions given as metric functions to test data nodes are automatically registered in a new Substra function by SubstraFL. (#117)
The new metric_functions argument of the TestDataNode class replaces the metric_keys one and accepts a dictionary (using the key as the identifier of the function given as value), a list of functions, or directly a function if there is only one metric to compute (function.__name__ is then used as identifier).
Installed dependencies are the algo_dependencies passed to execute_experiment, and permissions are the same as the predict function.

From a user point of view, the metric registration changes from:

    def accuracy(datasamples, predictions_path):
        y_true = datasamples["labels"]
        y_pred = np.load(predictions_path)

        return accuracy_score(y_true, np.argmax(y_pred, axis=1))

    metric_deps = Dependency(pypi_dependencies=["numpy==1.23.1", "scikit-learn==1.1.1"])

    permissions_metric = Permissions(public=False, authorized_ids=DATA_PROVIDER_ORGS_ID)

    metric_key = add_metric(
        client=client,
        metric_function=accuracy,
        permissions=permissions_metric,
        dependencies=metric_deps,
    )

    test_data_nodes = [
        TestDataNode(
            organization_id=org_id,
            data_manager_key=dataset_keys[org_id],
            test_data_sample_keys=[test_datasample_keys[org_id]],
            metric_keys=[metric_key],
        )
        for org_id in DATA_PROVIDER_ORGS_ID
    ]

to:

    def accuracy(datasamples, predictions_path):
        y_true = datasamples["labels"]
        y_pred = np.load(predictions_path)

        return accuracy_score(y_true, np.argmax(y_pred, axis=1))

    test_data_nodes = [
        TestDataNode(
            organization_id=org_id,
            data_manager_key=dataset_keys[org_id],
            test_data_sample_keys=[test_datasample_keys[org_id]],
            metric_functions={"Accuracy": accuracy},
        )
        for org_id in DATA_PROVIDER_ORGS_ID
    ]

- Enforce kwargs for user-facing functions with more than 3 parameters (#109)
- Remove references to composite. Replace by train_task. (#108)
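The three accepted forms of metric_functions described above (a dictionary, a list of functions, or a single function) can be reduced to one identifier-to-function mapping. The sketch below shows one way to do that; normalize_metric_functions is a hypothetical helper, not SubstraFL's code, and using function.__name__ as identifier for list entries is an assumption extrapolated from the single-function case.

```python
from typing import Callable, Dict


def normalize_metric_functions(metric_functions) -> Dict[str, Callable]:
    """Return a {identifier: function} mapping from any accepted input form."""
    if callable(metric_functions):
        # Single function: its name becomes the identifier.
        return {metric_functions.__name__: metric_functions}
    if isinstance(metric_functions, dict):
        # Dictionary: keys are already the identifiers.
        return dict(metric_functions)
    if isinstance(metric_functions, (list, tuple)):
        # List of functions: assumed to be identified by their names.
        return {fn.__name__: fn for fn in metric_functions}
    raise TypeError("metric_functions must be a function, a list or a dict")


# Placeholder metrics, used only to exercise the helper.
def accuracy(datasamples, predictions_path):
    return 1.0


def f1(datasamples, predictions_path):
    return 0.5
```

With these placeholders, passing accuracy alone gives {"accuracy": accuracy}, while passing {"Accuracy": accuracy} keeps the user-chosen key.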
Added
- Add the Federated Principal Component Analysis strategy (#97)
0.36.0rc2
0.36.0rc1
0.35.1
0.35.1rc1
0.35.0
This is a promotion of 0.35.0rc1
Added
- Initialization task to each strategy in SubstraFL. (#89)
This allows loading the Algo and all its attributes to the platform before any training. Once on the platform, we can perform a testing task before any training.
This init task consists in submitting an empty function, coded in the BaseAlgo class.

    @remote
    def initialize(self, shared_states):
        return

The init task returns a local output that will be passed as input to a test task, and to the first train task.
The graph changes from:
flowchart LR
TrainTask1_round0--Local-->TestTask1_r0
TrainTask1_round0--Shared-->TestTask1_r0
TrainTask2_round0--Shared-->AggregateTask
TrainTask2_round0--Local-->TestTask2_r0
TrainTask2_round0--Shared-->TestTask2_r0
AggregateTask--Shared-->TrainTask1_r1
TrainTask1_round0--Local-->TrainTask1_r1
AggregateTask--Shared-->TrainTask2_r1
TrainTask2_round0--Local-->TrainTask2_r1
TrainTask1_round0--Shared-->AggregateTask
TrainTask1_r1--Local-->TestTask1_r1
TrainTask1_r1--Shared-->TestTask1_r1
TrainTask2_r1--Local-->TestTask2_r1
TrainTask2_r1--Shared-->TestTask2_r1
to:
flowchart LR
InitTask1_round0--Local-->TestTask1_r0
InitTask2_round0--Local-->TestTask2_r0
InitTask1_round0--Local-->TrainTask1_r1
InitTask2_round0--Local-->TrainTask2_r1
TrainTask2_r1--Shared-->AggregateTask
TrainTask1_r1--Shared-->AggregateTask
TrainTask1_r1--Local-->TestTask1_r1
TrainTask2_r1--Local-->TestTask2_r1
TrainTask1_r1--Local-->TrainTask1_r2
TrainTask2_r1--Local-->TrainTask2_r2
AggregateTask--Shared-->TrainTask1_r2
AggregateTask--Shared-->TrainTask2_r2
TrainTask1_r2--Local-->TestTask1_r2
TrainTask2_r2--Local-->TestTask2_r2
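The second flowchart can also be generated programmatically, which makes the role of the init task easier to see for other numbers of organizations or rounds. This is only a sketch that mirrors the diagram: build_edges is a hypothetical helper, and the per-round suffix on the aggregate task name is an addition that the diagram does not use.

```python
def build_edges(n_orgs: int, n_rounds: int) -> list:
    """Return (source, kind, destination) edges of the task graph with init tasks."""
    edges = []
    for org in range(1, n_orgs + 1):
        init = f"InitTask{org}_round0"
        # The init task's local output feeds the round-0 test and the first train task.
        edges.append((init, "Local", f"TestTask{org}_r0"))
        edges.append((init, "Local", f"TrainTask{org}_r1"))
    for rnd in range(1, n_rounds + 1):
        aggregate = f"AggregateTask_r{rnd}"
        for org in range(1, n_orgs + 1):
            train = f"TrainTask{org}_r{rnd}"
            # Test tasks only receive a local input.
            edges.append((train, "Local", f"TestTask{org}_r{rnd}"))
            if rnd < n_rounds:
                next_train = f"TrainTask{org}_r{rnd + 1}"
                edges.append((train, "Shared", aggregate))
                edges.append((train, "Local", next_train))
                edges.append((aggregate, "Shared", next_train))
    return edges
```

For two organizations and two rounds, build_edges(2, 2) returns the same 14 edges as the flowchart, and no test task receives a Shared input.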
Changed
- BREAKING: algo are now passed as parameter to the strategy and not to execute_experiment anymore (#98)
- BREAKING: a strategy needs to implement a new method build_graph to build the graph of tasks to be executed in execute_experiment (#98)
- BREAKING: the predict method of strategy has been renamed to perform_predict (#98)
- Test tasks don't take a shared as input anymore (#89)
- BREAKING: change eval_frequency default value to None to avoid confusion with hidden default value (#91)
- BREAKING: rename Algo to Function (#82)
- BREAKING: clarify EvaluationStrategy arguments: change rounds to eval_frequency and eval_rounds (#85)
- Replace schemas.xxx by substra.schemas.xxx (#105)
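As an illustration of how the two EvaluationStrategy arguments could combine, the sketch below evaluates a round when it is a multiple of eval_frequency or when it is listed in eval_rounds. The union semantics and the helper name should_evaluate are assumptions made for this example, not a statement of SubstraFL's exact behaviour.

```python
from typing import Iterable, Optional


def should_evaluate(
    round_idx: int,
    eval_frequency: Optional[int] = None,
    eval_rounds: Optional[Iterable[int]] = None,
) -> bool:
    """Return True if a test task would be scheduled at round_idx."""
    if eval_frequency is not None and eval_frequency <= 0:
        raise ValueError("eval_frequency must be a positive integer or None")
    on_frequency = eval_frequency is not None and round_idx % eval_frequency == 0
    on_explicit_round = eval_rounds is not None and round_idx in eval_rounds
    return on_frequency or on_explicit_round
```

With eval_frequency=2 and eval_rounds=[5], rounds 2, 4, 5 and 6 out of rounds 1 to 6 are evaluated. With both arguments left to None, no round is evaluated, which matches the motivation for making None the explicit default.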
Fixed
- BREAKING: Given local code dependencies are now copied to the level of the running script systematically (#99)
- Docker images are pruned in the main check of the GitHub Action to free disk space while tests run (#102)
- Pass aggregation_lr to the parent class for Scaffold. Fix issue 103 (#104)
Removed
- from substra import schemas in aggregation_node.py, test_data_node.py and train_data_node.py (#105)
chore: release 0.35.0rc1