An example project that runs ArgusEyes in an automated CI workflow via GitHub Actions.
ArgusEyes is a system that lets data scientists declaratively specify the pipeline issues they are concerned about. ArgusEyes then instruments, executes, and screens the pipeline for the configured issues as part of a continuous integration process. It detects complex issues by tracking record-level provenance and understanding the semantics of the operations in ML pipelines. ArgusEyes was presented as an abstract at CIDR'22.
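As a rough illustration of what "declaratively specifying an issue" looks like, a screening configuration names the pipeline to run and the issue to screen for. The key names below are assumptions for illustration only, not the actual ArgusEyes schema; see the `.yaml` files in this repository for real configurations:

```yaml
# Illustrative sketch only -- these keys are hypothetical, not the real
# ArgusEyes schema; refer to the .yaml files in this repository instead.
pipeline: my-pipeline.py     # ML pipeline to instrument and execute
detector: label_errors       # issue to screen the pipeline for
```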
We provide three example scenarios (note that you have to install ArgusEyes locally first to execute them). You can run ArgusEyes to execute a pipeline and screen it for a particular issue. Afterwards, you can use an interactive notebook to determine the root cause of the issue and fix it.
- Source code of the ML pipeline: `mlinspect-computervision-sneakers.py`
- Screening configuration: `mlinspect-computervision-sneakers-labelerrors.yaml`
- GitHub workflow run detecting the label errors
- Manual screening: `./eyes-local mlinspect-computervision-sneakers-labelerrors.yaml`
- Notebook for retrospective debugging: `retrospective_labelerrors.ipynb`
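To make the label-error scenario concrete: one common way to surface label errors (a simplification for illustration, not necessarily the algorithm ArgusEyes uses) is to flag records whose model-predicted probability for their recorded label is suspiciously low. A minimal sketch:

```python
def flag_label_errors(records, threshold=0.2):
    """Flag records as potential label errors when the model assigns a
    low probability to the label stored in the dataset.

    Each record is a dict with keys:
      'id'    -- record identifier
      'label' -- the label stored in the dataset
      'probs' -- model-predicted probability per class
    """
    suspicious = []
    for record in records:
        # Probability the model assigns to the dataset's own label
        prob_of_given_label = record['probs'].get(record['label'], 0.0)
        if prob_of_given_label < threshold:
            suspicious.append(record['id'])
    return suspicious

records = [
    {'id': 1, 'label': 'sneaker', 'probs': {'sneaker': 0.95, 'boot': 0.05}},
    {'id': 2, 'label': 'boot',    'probs': {'sneaker': 0.90, 'boot': 0.10}},
]
print(flag_label_errors(records))  # record 2 looks mislabeled
```

The retrospective notebook then lets you inspect such flagged records individually and correct them.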
- Source code of the ML pipeline: `mlflow-regression-nyctaxifare.py`
- Screening configuration: `mlflow-regression-nyctaxifare-dataleakage.yaml`
- GitHub workflow run detecting the leakage
- Manual screening: `./eyes-local mlflow-regression-nyctaxifare-dataleakage.yaml`
- Notebook for retrospective debugging: `retrospective_dataleakage.ipynb`
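Conceptually, the record-level provenance that ArgusEyes tracks makes train/test leakage detectable as overlapping record identities between the train and test sets. A simplified, hand-rolled sketch of that check (not the actual ArgusEyes implementation):

```python
def find_leaked_records(train_ids, test_ids):
    """Return identifiers of records that appear in both the train and
    the test set -- a form of data leakage that inflates evaluation scores."""
    return sorted(set(train_ids) & set(test_ids))

train_ids = [101, 102, 103, 104]
test_ids = [104, 105, 106]
print(find_leaked_records(train_ids, test_ids))  # → [104]
```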
- Source code of the ML pipeline: `openml-classification-incomelevel.py`
- Screening configuration: `openml-classification-incomelevel-fairness.yaml`
- GitHub workflow run detecting the fairness violation
- Manual screening: `./eyes-local openml-classification-incomelevel-fairness.yaml`
- Notebook for retrospective debugging: `retrospective_fairnessviolation.ipynb`
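For the fairness scenario, a common group-fairness check is demographic parity: compare the rates of positive predictions across sensitive groups and flag a violation when the gap exceeds a tolerance. The metric below is one illustrative choice; the actual check that gets screened is defined in the scenario's `.yaml` configuration:

```python
from collections import defaultdict

def demographic_parity_gap(predictions):
    """Compute the largest gap in positive-prediction rates between groups.

    `predictions` is a list of (group, predicted_label) pairs, where
    predicted_label is 1 for the positive outcome and 0 otherwise.
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, label in predictions:
        totals[group] += 1
        positives[group] += label
    rates = {g: positives[g] / totals[g] for g in totals}
    return max(rates.values()) - min(rates.values())

# Group 'a' receives the positive outcome at 2/3, group 'b' at 1/3
preds = [('a', 1), ('a', 1), ('a', 0), ('b', 1), ('b', 0), ('b', 0)]
print(f"gap = {demographic_parity_gap(preds):.2f}")  # gap = 0.33
```

A CI run would fail when this gap exceeds the configured tolerance, and the retrospective notebook helps trace which pipeline operations produced the disparity.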