Readmission Prediction#
This notebook showcases readmission prediction on the Diabetes 130-US Hospitals for Years 1999-2008 dataset using CyclOps. The task is formulated as binary classification: for each hospital encounter, we predict the probability that the patient is readmitted within 30 days of discharge (early readmission).
Install libraries#
[1]:
!pip install pycyclops
!pip install ucimlrepo
Import Libraries#
[2]:
"""Readmission prediction."""
# ruff: noqa: E402
import copy
import inspect
from datetime import date
import numpy as np
import pandas as pd
import plotly.express as px
from datasets import Dataset
from datasets.features import ClassLabel
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
from ucimlrepo import fetch_ucirepo
from cyclops.data.df.feature import TabularFeatures
from cyclops.data.slicer import SliceSpec
from cyclops.evaluate.fairness import FairnessConfig # noqa: E402
from cyclops.evaluate.metrics import create_metric
from cyclops.evaluate.metrics.experimental.functional import (
    binary_npv,
    binary_ppv,
    binary_roc,
)
from cyclops.evaluate.metrics.experimental.metric_dict import MetricDict
from cyclops.models.catalog import create_model
from cyclops.report import ModelCardReport
from cyclops.report.plot.classification import ClassificationPlotter
from cyclops.report.utils import flatten_results_dict
from cyclops.tasks import BinaryTabularClassificationTask
CyclOps offers a package for documenting the model through a model report. The ModelCardReport class is used to populate and generate the model report as an HTML file. The model report has the following sections:
Overview: Provides a high-level overview of how the model is performing (a quick glance at important metrics) and how its performance changes over time (several metrics and subgroups tracked over time).
Datasets: High level statistics of the training data, including changes in distribution over time.
Quantitative Analysis: This section contains additional detailed performance metrics of the model for different sets of the data and subpopulations.
Fairness Analysis: This section contains the fairness metrics of the model.
Model Details: This section contains descriptive metadata about the model such as the owners, version, license, etc.
Model Parameters: This section contains the technical details of the model such as the model architecture, training parameters, etc.
Considerations: This section contains descriptions of the considerations involved in developing and using the model such as the intended use, limitations, etc.
We will use this to document the model development process as we go along and generate the model report at the end.
The model report tool is a work in progress and is subject to change.
[3]:
report = ModelCardReport()
Constants#
[4]:
RANDOM_SEED = 85
NAN_THRESHOLD = 0.75
TRAIN_SIZE = 0.8
EVAL_NUM = 3
Data Loading#
[5]:
diabetes_130_data = fetch_ucirepo(id=296)
features = diabetes_130_data["data"]["features"]
targets = diabetes_130_data["data"]["targets"]
metadata = diabetes_130_data["metadata"]
variables = diabetes_130_data["variables"]
[6]:
metadata
[6]:
{'uci_id': 296,
'name': 'Diabetes 130-US Hospitals for Years 1999-2008',
'repository_url': 'https://archive.ics.uci.edu/dataset/296/diabetes+130-us+hospitals+for+years+1999-2008',
'data_url': 'https://archive.ics.uci.edu/static/public/296/data.csv',
'abstract': 'The dataset represents ten years (1999-2008) of clinical care at 130 US hospitals and integrated delivery networks. Each row concerns hospital records of patients diagnosed with diabetes, who underwent laboratory, medications, and stayed up to 14 days. The goal is to determine the early readmission of the patient within 30 days of discharge.\nThe problem is important for the following reasons. Despite high-quality evidence showing improved clinical outcomes for diabetic patients who receive various preventive and therapeutic interventions, many patients do not receive them. This can be partially attributed to arbitrary diabetes management in hospital environments, which fail to attend to glycemic control. Failure to provide proper diabetes care not only increases the managing costs for the hospitals (as the patients are readmitted) but also impacts the morbidity and mortality of the patients, who may face complications associated with diabetes.\n',
'area': 'Health and Medicine',
'tasks': ['Classification', 'Clustering'],
'characteristics': ['Multivariate'],
'num_instances': 101766,
'num_features': 47,
'feature_types': ['Categorical', 'Integer'],
'demographics': ['Race', 'Gender', 'Age'],
'target_col': ['readmitted'],
'index_col': ['encounter_id', 'patient_nbr'],
'has_missing_values': 'yes',
'missing_values_symbol': 'NaN',
'year_of_dataset_creation': 2014,
'last_updated': 'Mon Feb 26 2024',
'dataset_doi': '10.24432/C5230J',
'creators': ['John Clore', 'Krzysztof Cios', 'Jon DeShazo', 'Beata Strack'],
'intro_paper': {'title': 'Impact of HbA1c Measurement on Hospital Readmission Rates: Analysis of 70,000 Clinical Database Patient Record',
'authors': 'Beata Strack, Jonathan DeShazo, Chris Gennings, Juan Olmo, Sebastian Ventura, Krzysztof Cios, John Clore',
'published_in': 'BioMed Research International, vol. 2014',
'year': 2014,
'url': 'https://www.hindawi.com/journals/bmri/2014/781670/',
'doi': None},
'additional_info': {'summary': 'The dataset represents ten years (1999-2008) of clinical care at 130 US hospitals and integrated delivery networks. It includes over 50 features representing patient and hospital outcomes. Information was extracted from the database for encounters that satisfied the following criteria.\n(1)\tIt is an inpatient encounter (a hospital admission).\n(2)\tIt is a diabetic encounter, that is, one during which any kind of diabetes was entered into the system as a diagnosis.\n(3)\tThe length of stay was at least 1 day and at most 14 days.\n(4)\tLaboratory tests were performed during the encounter.\n(5)\tMedications were administered during the encounter.\n\nThe data contains such attributes as patient number, race, gender, age, admission type, time in hospital, medical specialty of admitting physician, number of lab tests performed, HbA1c test result, diagnosis, number of medications, diabetic medications, number of outpatient, inpatient, and emergency visits in the year before the hospitalization, etc.',
'purpose': None,
'funded_by': None,
'instances_represent': 'The instances represent hospitalized patient records diagnosed with diabetes.',
'recommended_data_splits': 'No recommendation. The standard train-test split could be used. Can use three-way holdout split (i.e., train-validation-test) when doing model selection.',
'sensitive_data': 'Yes. The dataset contains information about the age, gender, and race of the patients.',
'preprocessing_description': None,
'variable_info': 'Detailed description of all the atrributes is provided in Table 1 Beata Strack, Jonathan P. DeShazo, Chris Gennings, Juan L. Olmo, Sebastian Ventura, Krzysztof J. Cios, and John N. Clore, “Impact of HbA1c Measurement on Hospital Readmission Rates: Analysis of 70,000 Clinical Database Patient Records,” BioMed Research International, vol. 2014, Article ID 781670, 11 pages, 2014.\n\nhttp://www.hindawi.com/journals/bmri/2014/781670/',
'citation': 'Please cite:\nBeata Strack, Jonathan P. DeShazo, Chris Gennings, Juan L. Olmo, Sebastian Ventura, Krzysztof J. Cios, and John N. Clore, “Impact of HbA1c Measurement on Hospital Readmission Rates: Analysis of 70,000 Clinical Database Patient Records,” BioMed Research International, vol. 2014, Article ID 781670, 11 pages, 2014.'}}
[7]:
def transform_label(value):
    """Transform string labels of readmission into 0/1 binary labels.

    Parameters
    ----------
    value: str
        Input value

    Returns
    -------
    int
        0 if not readmitted or if greater than 30 days, 1 if less than 30 days

    """
    if value in ["NO", ">30"]:
        return 0
    if value == "<30":
        return 1
    raise ValueError("Unexpected value for readmission!")


df = features
targets["readmitted"] = targets["readmitted"].apply(transform_label)
df["readmitted"] = targets
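As a quick sanity check of the mapping above, the following standalone sketch applies the same rule to a handful of made-up labels and counts the resulting classes. The sample list `raw_labels` is an assumption for illustration only and is not drawn from the dataset.

```python
from collections import Counter


def transform_label(value):
    """Map a raw readmission label to 0/1 (same rule as the cell above)."""
    if value in ["NO", ">30"]:
        return 0
    if value == "<30":
        return 1
    raise ValueError("Unexpected value for readmission!")


# Hypothetical sample of raw labels, for illustration only.
raw_labels = ["NO", ">30", "<30", "NO", "<30"]

binary_labels = [transform_label(v) for v in raw_labels]
label_counts = Counter(binary_labels)

print(binary_labels)
print(label_counts)
```

Only readmission within 30 days counts as the positive class; both "NO" and ">30" are treated as negatives, and any other value raises an error rather than being silently mislabelled.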
Choose a small subset for modelling
[8]:
df = df[0:1000]
Remove features that are NaNs or have just a single unique value
[9]:
features_to_remove = []
for col in df:
    if len(df[col].value_counts()) <= 1:
        features_to_remove.append(col)
df = df.drop(columns=features_to_remove)
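The check above relies on pandas `value_counts()` ignoring missing values by default, so a column that is entirely NaN has zero distinct values and is dropped along with constant columns. A library-free sketch of the same idea, assuming a toy table stored as a dict of lists with `None` standing in for NaN (the column names here are hypothetical):

```python
def constant_or_empty_columns(table):
    """Return names of columns with at most one distinct non-missing value."""
    to_remove = []
    for name, values in table.items():
        # Ignore missing entries, mirroring value_counts() dropping NaN.
        distinct = {v for v in values if v is not None}
        if len(distinct) <= 1:
            to_remove.append(name)
    return to_remove


# Hypothetical toy table; None plays the role of NaN.
toy_table = {
    "all_missing": [None, None, None],
    "constant": ["No", "No", "No"],
    "varied": ["[50-60)", "[60-70)", "[50-60)"],
}

removed = constant_or_empty_columns(toy_table)
kept = [name for name in toy_table if name not in removed]
print(removed, kept)
```

Such columns carry no information that could separate the two outcome classes, so removing them only reduces the width of the encoded feature matrix.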
Sex values#
[10]:
fig = px.pie(df, names="gender")
fig.update_layout(
    title="Gender Distribution",
)
fig.show()
Add the figure to the report
[11]:
report.log_plotly_figure(
    fig=fig,
    caption="Gender Distribution",
    section_name="datasets",
)
Age distribution#
[12]:
fig = px.histogram(df, x="age")
fig.update_layout(
    title="Age Distribution",
    xaxis_title="Age",
    yaxis_title="Count",
    bargap=0.2,
)
fig.show()
Add the figure to the report
[13]:
report.log_plotly_figure(
    fig=fig,
    caption="Age Distribution",
    section_name="datasets",
)
Outcome distribution#
[14]:
df["outcome"] = df["readmitted"].astype("int")
df = df.drop(columns=["readmitted"])
[15]:
fig = px.pie(df, names="outcome")
fig.update_traces(textinfo="percent+label")
fig.update_layout(title_text="Outcome Distribution")
fig.update_traces(
hovertemplate="Outcome: %{label}<br>Count: \
%{value}<br>Percent: %{percent}",
)
fig.show()
Add the figure to the report
[16]:
report.log_plotly_figure(
    fig=fig,
    caption="Outcome Distribution",
    section_name="datasets",
)
[17]:
class_counts = df["outcome"].value_counts()
class_ratio = class_counts[0] / class_counts[1]
print(class_ratio, class_counts)
9.638297872340425 outcome
0 906
1 94
Name: count, dtype: int64
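The counts printed above show a strong class imbalance: roughly 9.6 negative encounters for every positive one. The short sketch below, which only reuses the two counts from the output, shows why plain accuracy is a weak yardstick here: a trivial model that always predicts "not readmitted" already scores about 90.6%. The variable names other than `class_ratio` are illustrative.

```python
# Class counts copied from the output above (0 = no early readmission).
class_counts = {0: 906, 1: 94}

total = sum(class_counts.values())
class_ratio = class_counts[0] / class_counts[1]

# Fraction of positive cases in the 1000-row subset.
positive_rate = class_counts[1] / total

# Accuracy of a baseline that always predicts the majority class.
majority_baseline_accuracy = max(class_counts.values()) / total

print(f"ratio={class_ratio:.3f}")
print(f"positive rate={positive_rate:.3f}")
print(f"majority baseline accuracy={majority_baseline_accuracy:.3f}")
```

This is one reason to look at metrics that account for both classes (such as PPV, NPV, and the ROC curve, all of which are imported at the top of the notebook) rather than accuracy alone.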
We use all of the remaining features in the dataset to train the model.
[18]:
features_list = list(df.columns)
features_list.remove("outcome")
features_list = sorted(features_list)
Identifying feature types#
The CyclOps TabularFeatures class helps to identify feature types, an essential step before preprocessing the data. Understanding feature types (numerical/categorical/binary) allows us to apply appropriate preprocessing steps for each type.
[19]:
tab_features = TabularFeatures(
    data=df.reset_index(),
    features=features_list,
    by="index",
    targets="outcome",
)
print(tab_features.types)
{'tolazamide': 'binary', 'insulin': 'ordinal', 'pioglitazone': 'ordinal', 'admission_source_id': 'ordinal', 'tolbutamide': 'binary', 'number_emergency': 'ordinal', 'A1Cresult': 'ordinal', 'time_in_hospital': 'ordinal', 'num_procedures': 'ordinal', 'number_outpatient': 'ordinal', 'age': 'ordinal', 'race': 'ordinal', 'rosiglitazone': 'ordinal', 'acarbose': 'binary', 'diag_3': 'string', 'gender': 'binary', 'number_diagnoses': 'ordinal', 'glyburide': 'ordinal', 'admission_type_id': 'ordinal', 'medical_specialty': 'string', 'repaglinide': 'ordinal', 'num_lab_procedures': 'numeric', 'glipizide': 'ordinal', 'troglitazone': 'binary', 'discharge_disposition_id': 'ordinal', 'glimepiride': 'ordinal', 'outcome': 'binary', 'change': 'binary', 'diag_1': 'string', 'max_glu_serum': 'ordinal', 'metformin': 'ordinal', 'diabetesMed': 'binary', 'number_inpatient': 'ordinal', 'diag_2': 'string', 'num_medications': 'numeric'}
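To build intuition for the labels in the output above, here is a deliberately simplified heuristic for guessing a feature type from its values. This is a hypothetical sketch and not how TabularFeatures is implemented: the function name, the `max_categories` threshold, and the sample columns are all assumptions chosen for illustration.

```python
def infer_feature_type(values, max_categories=20):
    """Rough, hypothetical type guess; NOT the CyclOps implementation."""
    present = [v for v in values if v is not None]
    distinct = set(present)
    if len(distinct) == 2:
        # Exactly two observed values, e.g. Yes/No.
        return "binary"
    few_values = len(distinct) <= max_categories
    if all(isinstance(v, (int, float)) for v in present):
        # Numbers with few distinct values behave like categories.
        return "ordinal" if few_values else "numeric"
    # Non-numeric values: low cardinality vs. free-form strings.
    return "ordinal" if few_values else "string"


# Hypothetical toy columns, for illustration only.
toy_columns = {
    "two_valued": ["Female", "Male", "Female"],
    "small_integer": [1, 2, 3, 4],
    "wide_integer": list(range(1, 60)),
    "many_codes": [f"code_{i}" for i in range(50)],
}

inferred = {name: infer_feature_type(col) for name, col in toy_columns.items()}
print(inferred)
```

Whatever rule is used, the point is the same as in the notebook: the inferred type decides which preprocessing step a column receives in the next section.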
Creating data preprocessors#
We create a data preprocessor using sklearn’s ColumnTransformer. This helps in applying different preprocessing steps to different columns in the dataframe. For instance, binary features might be processed differently from numeric features.
[20]:
numeric_transformer = Pipeline(
    steps=[("imputer", SimpleImputer(strategy="mean")), ("scaler", MinMaxScaler())],
)
binary_transformer = Pipeline(
    steps=[("imputer", SimpleImputer(strategy="most_frequent"))],
)
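The numeric pipeline above chains two steps: missing values are replaced by the column mean, and the result is rescaled to the [0, 1] range. The following library-free sketch shows that arithmetic on a single made-up column. It is an illustration of the two operations, not scikit-learn's code; the function name and the sample values are assumptions, and `None` stands in for NaN.

```python
def impute_mean_then_minmax(values):
    """Fill missing entries with the mean, then rescale to [0, 1]."""
    present = [v for v in values if v is not None]
    mean = sum(present) / len(present)
    filled = [mean if v is None else v for v in values]

    lo, hi = min(filled), max(filled)
    if hi == lo:
        # A constant column has no spread; map everything to 0.0.
        return [0.0 for _ in filled]
    return [(v - lo) / (hi - lo) for v in filled]


# Hypothetical column with one missing value.
raw_column = [10, None, 30, 50]
scaled_column = impute_mean_then_minmax(raw_column)
print(scaled_column)
```

In the actual pipeline the mean, minimum, and maximum are learned when the transformer is fitted and then reused on new data, so the held-out split is transformed with statistics from the training split.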
[21]:
numeric_features = sorted(tab_features.features_by_type("numeric"))
numeric_indices = [
    df[features_list].columns.get_loc(column) for column in numeric_features
]
print(numeric_features)
['num_lab_procedures', 'num_medications']
[22]:
binary_features = sorted(tab_features.features_by_type("binary"))
binary_features.remove("outcome")
ordinal_features = sorted(
    tab_features.features_by_type("ordinal")
    + ["medical_specialty", "diag_1", "diag_2", "diag_3"]
)
binary_indices = [
    df[features_list].columns.get_loc(column) for column in binary_features
]
ordinal_indices = [
    df[features_list].columns.get_loc(column) for column in ordinal_features
]
print(binary_features, ordinal_features)
['acarbose', 'change', 'diabetesMed', 'gender', 'tolazamide', 'tolbutamide', 'troglitazone'] ['A1Cresult', 'admission_source_id', 'admission_type_id', 'age', 'diag_1', 'diag_2', 'diag_3', 'discharge_disposition_id', 'glimepiride', 'glipizide', 'glyburide', 'insulin', 'max_glu_serum', 'medical_specialty', 'metformin', 'num_procedures', 'number_diagnoses', 'number_emergency', 'number_inpatient', 'number_outpatient', 'pioglitazone', 'race', 'repaglinide', 'rosiglitazone', 'time_in_hospital']
[23]:
preprocessor = ColumnTransformer(
    transformers=[
        ("num", numeric_transformer, numeric_indices),
        (
            "onehot",
            OneHotEncoder(handle_unknown="ignore", sparse_output=False),
            binary_indices + ordinal_indices,
        ),
    ],
    remainder="passthrough",
)
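The `handle_unknown="ignore"` setting matters with high-cardinality columns such as the diagnosis codes: a category that appears only in the test split is encoded as an all-zero vector instead of raising an error. A minimal library-free sketch of that behaviour follows; the helper names and the sample values are assumptions for illustration, not scikit-learn internals.

```python
def fit_categories(train_values):
    """Learn the sorted list of categories seen during fitting."""
    return sorted(set(train_values))


def one_hot(value, categories):
    """Encode one value; unseen categories yield an all-zero vector."""
    return [1.0 if value == category else 0.0 for category in categories]


# Hypothetical training values for a single categorical column.
categories = fit_categories(["No", "Steady", "Up", "Steady"])

known_encoding = one_hot("Up", categories)
unknown_encoding = one_hot("Down", categories)  # "Down" was never seen.

print(categories)
print(known_encoding)
print(unknown_encoding)
```

The all-zero row means the model receives no signal from that column for the affected sample, which is usually an acceptable trade-off on a 1000-row subset where rare codes are common.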
Let’s document the dataset in the model card. This can be done using the log_dataset method, which takes the following arguments:
- description: A description of the dataset.
- citation: The citation for the dataset.
- link: A link to a resource for the dataset.
- license_id: The SPDX license identifier for the dataset.
- version: The version of the dataset.
- features: A list of features in the dataset.
- split: The split of the dataset (train, test, validation, etc.).
- sensitive_features: A list of sensitive features used to train/evaluate the model.
- sensitive_feature_justification: A justification for the sensitive features used to train/evaluate the model.
[24]:
report.log_dataset(
description=metadata["abstract"],
citation=inspect.cleandoc(
"""
@article{strack2014impact,
title={Impact of HbA1c measurement on hospital readmission rates: analysis of 70,000 clinical database patient records},
author={Strack, Beata and DeShazo, Jonathan P and Gennings, Chris and Olmo, Juan L and Ventura, Sebastian and Cios, Krzysztof J and Clore, John N and others},
journal={BioMed research international},
volume={2014},
year={2014},
publisher={Hindawi}
}
""",
),
link=metadata["repository_url"],
license_id="CC0-1.0",
version="Version 1",
features=features_list,
sensitive_features=["gender", "age", "race"],
sensitive_feature_justification="Demographic information like age and gender \
often have a strong correlation with health outcomes. For example, older \
patients are more likely to have a higher risk of readmission.",
)
Creating Hugging Face Dataset#
We convert our pandas dataframe into a Hugging Face dataset, an easy-to-use data format that is compatible with the CyclOps model and evaluator modules. The dataset is then split into train and test sets, stratified by the outcome so that both splits keep a similar share of positive cases.
[25]:
dataset = Dataset.from_pandas(df)
dataset.cleanup_cache_files()
print(dataset)
Dataset({
features: ['race', 'gender', 'age', 'admission_type_id', 'discharge_disposition_id', 'admission_source_id', 'time_in_hospital', 'medical_specialty', 'num_lab_procedures', 'num_procedures', 'num_medications', 'number_outpatient', 'number_emergency', 'number_inpatient', 'diag_1', 'diag_2', 'diag_3', 'number_diagnoses', 'max_glu_serum', 'A1Cresult', 'metformin', 'repaglinide', 'glimepiride', 'glipizide', 'glyburide', 'tolbutamide', 'pioglitazone', 'rosiglitazone', 'acarbose', 'troglitazone', 'tolazamide', 'insulin', 'change', 'diabetesMed', 'outcome'],
num_rows: 1000
})
[26]:
dataset = dataset.cast_column("outcome", ClassLabel(num_classes=2))
dataset = dataset.train_test_split(
train_size=TRAIN_SIZE,
stratify_by_column="outcome",
seed=RANDOM_SEED,
)
Casting the dataset: 100%|██████████| 1000/1000 [00:00<00:00, 39264.05 examples/s]
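To make concrete what stratifying the split by the outcome column achieves, here is a plain-Python sketch. It is not the Hugging Face implementation; the function stratified_split, the label counts, and the seed are all hypothetical. The idea is that each class is split separately, so both sets keep roughly the same positive rate.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_size, seed):
    """Return (train_idx, test_idx) with class proportions kept roughly equal."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # shuffle within each class
        cut = round(len(indices) * train_size)
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical imbalanced labels: 10% positives.
labels = [1] * 100 + [0] * 900
train_idx, test_idx = stratified_split(labels, train_size=0.8, seed=42)

train_pos_rate = sum(labels[i] for i in train_idx) / len(train_idx)
test_pos_rate = sum(labels[i] for i in test_idx) / len(test_idx)
print(len(train_idx), len(test_idx), train_pos_rate, test_pos_rate)
```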
Model Creation#
The CyclOps model registry allows for straightforward creation and selection of models. This registry maintains a list of pre-configured models, which can be instantiated with a single line of code. Here we use an XGBoost classifier. The model configurations can be passed to create_model based on the parameters for XGBClassifier.
[27]:
model_name = "xgb_classifier"
model = create_model(model_name, random_state=123)
Task Creation#
We use CyclOps tasks to define our model’s task (in this case, readmission prediction), train the model, make predictions, and evaluate performance. CyclOps task classes encapsulate the entire ML pipeline into a single, cohesive structure, making the process smooth and easy to manage.
[28]:
readmission_prediction_task = BinaryTabularClassificationTask(
{model_name: model},
task_features=features_list,
task_target="outcome",
)
[29]:
readmission_prediction_task.list_models()
[29]:
['xgb_classifier']
Training#
If best_model_params is passed to the train method, the best model will be selected after the hyperparameter search. The parameters in best_model_params give the candidate values used to build the parameter grid.
Note that the data preprocessor needs to be passed to the task’s methods if the Hugging Face dataset is not already preprocessed.
[30]:
best_model_params = {
"n_estimators": [100, 250, 500],
"learning_rate": [0.1, 0.01],
"max_depth": [2, 5],
"reg_lambda": [0, 1, 10],
"colsample_bytree": [0.7, 0.8, 1],
"gamma": [0, 1, 2, 10],
"method": "random",
"scale_pos_weight": [int(class_ratio)],
}
readmission_prediction_task.train(
dataset["train"],
model_name=model_name,
transforms=preprocessor,
best_model_params=best_model_params,
)
2024-07-16 17:01:39,759 INFO cyclops.models.wrappers.sk_model - Best scale_pos_weight: 9
2024-07-16 17:01:39,767 INFO cyclops.models.wrappers.sk_model - Best reg_lambda: 1
2024-07-16 17:01:39,768 INFO cyclops.models.wrappers.sk_model - Best n_estimators: 250
2024-07-16 17:01:39,769 INFO cyclops.models.wrappers.sk_model - Best max_depth: 5
2024-07-16 17:01:39,770 INFO cyclops.models.wrappers.sk_model - Best learning_rate: 0.1
2024-07-16 17:01:39,773 INFO cyclops.models.wrappers.sk_model - Best gamma: 0
2024-07-16 17:01:39,774 INFO cyclops.models.wrappers.sk_model - Best colsample_bytree: 0.7
[30]:
XGBClassifier(base_score=None, booster=None, callbacks=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=0.7, early_stopping_rounds=None, enable_categorical=False, eval_metric='logloss', feature_types=None, gamma=0, gpu_id=None, grow_policy=None, importance_type=None, interaction_constraints=None, learning_rate=0.1, max_bin=None, max_cat_threshold=None, max_cat_to_onehot=None, max_delta_step=None, max_depth=5, max_leaves=None, min_child_weight=3, missing=nan, monotone_constraints=None, n_estimators=250, n_jobs=None, num_parallel_tree=None, predictor=None, random_state=123, ...)
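The "method": "random" entry in best_model_params suggests a randomized search rather than an exhaustive one; the full grid above would contain 3 × 2 × 2 × 3 × 3 × 4 × 1 = 432 combinations. How CyclOps performs the search internally is not shown in this notebook, so the following is only a sketch of the general idea, with a hypothetical helper sample_configurations and a reduced grid.

```python
import math
import random


def sample_configurations(grid, n_iter, seed):
    """Draw n_iter random parameter combinations from a grid of candidate values."""
    rng = random.Random(seed)
    return [
        {name: rng.choice(values) for name, values in grid.items()}
        for _ in range(n_iter)
    ]


# A reduced version of the grid, for illustration only.
grid = {
    "n_estimators": [100, 250, 500],
    "learning_rate": [0.1, 0.01],
    "max_depth": [2, 5],
}

grid_size = math.prod(len(values) for values in grid.values())
candidates = sample_configurations(grid, n_iter=5, seed=123)
print(f"Evaluating {len(candidates)} of {grid_size} possible combinations")
for candidate in candidates:
    print(candidate)
```

Each sampled configuration would then be scored (for example by cross-validation) and the best one kept.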
[31]:
model_params = readmission_prediction_task.list_models_params()[model_name]
print(model_params)
{'objective': 'binary:logistic', 'use_label_encoder': None, 'base_score': None, 'booster': None, 'callbacks': None, 'colsample_bylevel': None, 'colsample_bynode': None, 'colsample_bytree': 0.7, 'early_stopping_rounds': None, 'enable_categorical': False, 'eval_metric': 'logloss', 'feature_types': None, 'gamma': 0, 'gpu_id': None, 'grow_policy': None, 'importance_type': None, 'interaction_constraints': None, 'learning_rate': 0.1, 'max_bin': None, 'max_cat_threshold': None, 'max_cat_to_onehot': None, 'max_delta_step': None, 'max_depth': 5, 'max_leaves': None, 'min_child_weight': 3, 'missing': nan, 'monotone_constraints': None, 'n_estimators': 250, 'n_jobs': None, 'num_parallel_tree': None, 'predictor': None, 'random_state': 123, 'reg_alpha': None, 'reg_lambda': 1, 'sampling_method': None, 'scale_pos_weight': 9, 'subsample': None, 'tree_method': None, 'validate_parameters': None, 'verbosity': None, 'seed': 123}
Log the model parameters to the report.
We can add model parameters to the model card using the log_model_parameters method.
[32]:
report.log_model_parameters(params=model_params)
Prediction#
The prediction output can be either the whole Hugging Face dataset with the prediction columns added to it or the single column containing the predicted values.
[33]:
y_pred = readmission_prediction_task.predict(
dataset["test"],
model_name=model_name,
transforms=preprocessor,
proba=True,
only_predictions=True,
)
prediction_df = pd.DataFrame(
{
"y_prob": [y_pred_i[1] for y_pred_i in y_pred],
"y_true": dataset["test"]["outcome"],
"gender": dataset["test"]["gender"],
}
)
Map: 100%|██████████| 200/200 [00:00<00:00, 1081.50 examples/s]
Map: 100%|██████████| 200/200 [00:00<00:00, 1024.67 examples/s]
Evaluation#
Evaluation uses two kinds of metrics that give different perspectives on the model’s predictive abilities: standard performance metrics and fairness metrics.
The standard performance metrics can be created using the MetricDict object.
[34]:
metric_names = [
"binary_accuracy",
"binary_precision",
"binary_recall",
"binary_f1_score",
"binary_auroc",
"binary_average_precision",
"binary_roc_curve",
"binary_precision_recall_curve",
]
metrics = [
create_metric(metric_name, experimental=True) for metric_name in metric_names
]
metric_collection = MetricDict(metrics)
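For readers who want to see what the threshold-based metrics in this list measure, the sketch below computes accuracy, precision, recall, and F1 from confusion counts in plain Python. It is not the CyclOps implementation; the labels and probabilities are made up, and the convention that a probability equal to the threshold counts as positive is an assumption.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Return (tp, fp, tn, fn) for binary labels and predicted probabilities."""
    tp = fp = tn = fn = 0
    for truth, prob in zip(y_true, y_prob):
        predicted = int(prob >= threshold)  # assumed convention: >= is positive
        if predicted and truth:
            tp += 1
        elif predicted and not truth:
            fp += 1
        elif not predicted and not truth:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


# Made-up labels and scores, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 0, 1]
y_prob = [0.9, 0.2, 0.4, 0.8, 0.6, 0.1, 0.3, 0.7]

tp, fp, tn, fn = confusion_counts(y_true, y_prob)
accuracy = (tp + tn) / (tp + fp + tn + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
print(accuracy, precision, recall, f1)
```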
In addition to overall metrics, it might be interesting to see how the model performs on certain subpopulations. We can define these subpopulations using SliceSpec objects.
[35]:
spec_list = [
{
"age": {
"value": "[50-60)",
},
},
{
"age": {
"value": "[60-70)",
},
},
]
slice_spec = SliceSpec(spec_list)
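Conceptually, each entry of spec_list selects the rows whose column equals the given value. The sketch below mimics that filtering on a few made-up rows; select_slice is a hypothetical helper, not the SliceSpec API, and it handles only the simple "value" rule used here.

```python
# Made-up rows standing in for the test set.
rows = [
    {"age": "[50-60)", "outcome": 1},
    {"age": "[60-70)", "outcome": 0},
    {"age": "[50-60)", "outcome": 0},
    {"age": "[70-80)", "outcome": 1},
]
spec_list = [
    {"age": {"value": "[50-60)"}},
    {"age": {"value": "[60-70)"}},
]


def select_slice(rows, spec):
    """Keep the rows whose columns equal every value named in the spec."""
    return [
        row
        for row in rows
        if all(row[column] == rule["value"] for column, rule in spec.items())
    ]


slices = {}
for spec in spec_list:
    (column, rule), = spec.items()  # each spec here names a single column
    slices[f"{column}:{rule['value']}"] = select_slice(rows, spec)
slices["overall"] = rows  # the unfiltered data is kept as its own slice

for name, members in slices.items():
    print(name, len(members))
```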
A MetricDict can also be defined for the fairness metrics.
[36]:
specificity = create_metric(metric_name="binary_specificity", experimental=True)
sensitivity = create_metric(metric_name="binary_sensitivity", experimental=True)
fpr = -specificity + 1
fnr = -sensitivity + 1
ber = (fpr + fnr) / 2
fairness_metric_collection = MetricDict(
{
"Sensitivity": sensitivity,
"Specificity": specificity,
"BER": ber,
},
)
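The metric arithmetic above builds the balanced error rate (BER) from specificity and sensitivity: the false positive rate is 1 - specificity, the false negative rate is 1 - sensitivity, and BER is their mean. A small numeric check in plain Python, using made-up values:

```python
def balanced_error_rate(sensitivity, specificity):
    """Mean of the false negative rate and the false positive rate."""
    fnr = 1 - sensitivity
    fpr = 1 - specificity
    return (fpr + fnr) / 2


ber = balanced_error_rate(sensitivity=0.75, specificity=0.5)
perfect = balanced_error_rate(sensitivity=1.0, specificity=1.0)
chance = balanced_error_rate(sensitivity=0.5, specificity=0.5)
print(ber, perfect, chance)
```

A perfect classifier has a BER of 0, while one that is right half the time on each class has a BER of 0.5.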
The FairnessConfig helps in setting up and evaluating the fairness of the model predictions.
[37]:
fairness_config = FairnessConfig(
metrics=fairness_metric_collection,
dataset=None, # dataset is passed from the evaluator
target_columns=None, # target columns are passed from the evaluator
groups=["age"],
group_base_values={"age": "[40-50)"},
thresholds=[0.5],
)
The evaluate method outputs the evaluation results and the Hugging Face dataset with the predictions added to it.
[38]:
results, dataset_with_preds = readmission_prediction_task.evaluate(
dataset=dataset["test"],
metrics=metric_collection,
model_names=model_name,
transforms=preprocessor,
prediction_column_prefix="preds",
slice_spec=slice_spec,
batch_size=-1,
fairness_config=fairness_config,
override_fairness_metrics=False,
)
Map: 100%|██████████| 200/200 [00:00<00:00, 1215.69 examples/s]
Map: 100%|██████████| 200/200 [00:00<00:00, 1147.33 examples/s]
Flattening the indices: 100%|██████████| 200/200 [00:00<00:00, 1276.40 examples/s]
Flattening the indices: 100%|██████████| 200/200 [00:00<00:00, 1156.69 examples/s]
Flattening the indices: 100%|██████████| 200/200 [00:00<00:00, 26130.29 examples/s]
Map: 100%|██████████| 200/200 [00:00<00:00, 4715.40 examples/s]
Filter -> age:[50-60): 100%|██████████| 200/200 [00:00<00:00, 12238.64 examples/s]
Filter -> age:[60-70): 100%|██████████| 200/200 [00:00<00:00, 15355.03 examples/s]
Filter -> overall: 100%|██████████| 200/200 [00:00<00:00, 10276.38 examples/s]
Filter -> age:[50-60): 100%|██████████| 200/200 [00:00<00:00, 15577.44 examples/s]
Filter -> age:[70-80): 100%|██████████| 200/200 [00:00<00:00, 16274.97 examples/s]
Filter -> age:[60-70): 100%|██████████| 200/200 [00:00<00:00, 16846.96 examples/s]
Filter -> age:[80-90): 100%|██████████| 200/200 [00:00<00:00, 16541.01 examples/s]
Filter -> age:[40-50): 100%|██████████| 200/200 [00:00<00:00, 16883.58 examples/s]
Filter -> age:[90-100): 100%|██████████| 200/200 [00:00<00:00, 15595.98 examples/s]
Filter -> age:[20-30): 100%|██████████| 200/200 [00:00<00:00, 16911.83 examples/s]
Filter -> age:[30-40): 100%|██████████| 200/200 [00:00<00:00, 16218.65 examples/s]
Filter -> age:[10-20): 100%|██████████| 200/200 [00:00<00:00, 16293.94 examples/s]
Filter -> age:[0-10): 100%|██████████| 200/200 [00:00<00:00, 16121.40 examples/s]
[39]:
results_female, _ = readmission_prediction_task.evaluate(
dataset=dataset["test"],
metrics=MetricDict(
{
"BinaryAccuracy": create_metric(
metric_name="binary_accuracy",
experimental=True,
),
},
),
model_names=model_name,
transforms=preprocessor,
prediction_column_prefix="preds",
slice_spec=SliceSpec([{"gender": {"value": "Female"}}], include_overall=False),
batch_size=-1,
)
Map: 100%|██████████| 200/200 [00:00<00:00, 1168.29 examples/s]
Map: 100%|██████████| 200/200 [00:00<00:00, 1107.71 examples/s]
Flattening the indices: 100%|██████████| 200/200 [00:00<00:00, 1291.38 examples/s]
Flattening the indices: 100%|██████████| 200/200 [00:00<00:00, 1179.11 examples/s]
Flattening the indices: 100%|██████████| 200/200 [00:00<00:00, 5207.28 examples/s]
Map: 100%|██████████| 200/200 [00:00<00:00, 4848.35 examples/s]
Filter -> gender:Female: 100%|██████████| 200/200 [00:00<00:00, 11459.22 examples/s]
Log the performance metrics to the report.
We can add a performance metric to the model card using the log_quantitative_analysis method, which logs one metric value at a time together with its name, description, and slice. The flattened results are keyed in the format slice_name/metric_name, for instance overall/BinaryAccuracy.
We first need to process the evaluation results to get the metrics in the right format.
[40]:
model_name = f"model_for_preds.{model_name}"
results_flat = flatten_results_dict(
results=results,
remove_metrics=["BinaryROC", "BinaryPrecisionRecallCurve"],
model_name=model_name,
)
results_female_flat = flatten_results_dict(
results=results_female,
model_name=model_name,
)
# ruff: noqa: W505
# Metric descriptions shared by both sets of results.
descriptions = {
    "BinaryPrecision": "The proportion of predicted positive instances that are correctly predicted.",
    "BinaryRecall": "The proportion of actual positive instances that are correctly predicted. Also known as sensitivity or true positive rate.",
    "BinaryAccuracy": "The proportion of all instances that are correctly predicted.",
    "BinaryAUROC": "The area under the receiver operating characteristic curve (AUROC) is a measure of the performance of a binary classification model.",
    "BinaryAveragePrecision": "The area under the precision-recall curve (AUPRC) is a measure of the performance of a binary classification model.",
    "BinaryF1Score": "The harmonic mean of precision and recall.",
}
# Log the female-slice results first, then the results for all other slices.
for flat_results in (results_female_flat, results_flat):
    for name, metric in flat_results.items():
        split, name = name.split("/")  # noqa: PLW2901
        report.log_quantitative_analysis(
            "performance",
            name=name,
            value=metric.tolist(),
            description=descriptions[name],
            metric_slice=split,
            pass_fail_thresholds=0.7,
            pass_fail_threshold_fns=lambda x, threshold: bool(x >= threshold),
        )
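The logging step relies on two ideas: results keyed as slice_name/metric_name, and a pass/fail function applied against a threshold of 0.7. The sketch below reproduces both in plain Python. The flatten helper is hypothetical (it is not the CyclOps flatten_results_dict), and the metric values are made up for illustration.

```python
def flatten(results):
    """Turn {slice: {metric: value}} into {"slice/metric": value}."""
    return {
        f"{slice_name}/{metric_name}": value
        for slice_name, metrics in results.items()
        for metric_name, value in metrics.items()
    }


# Made-up metric values, for illustration only.
nested = {
    "overall": {"BinaryAccuracy": 0.82, "BinaryRecall": 0.64},
    "age:[50-60)": {"BinaryAccuracy": 0.71},
}

flat = flatten(nested)
threshold = 0.7
passed = {key: bool(value >= threshold) for key, value in flat.items()}

for key, value in flat.items():
    slice_name, metric_name = key.split("/")
    status = "pass" if passed[key] else "fail"
    print(f"{slice_name:12s} {metric_name:15s} {value:.2f} {status}")
```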
We can also use the ClassificationPlotter to plot the performance metrics and then add the figure to the model card using the log_plotly_figure method.
[41]:
plotter = ClassificationPlotter(task_type="binary", class_names=["0", "1"])
plotter.set_template("plotly_white")
[42]:
# extracting the ROC curves and AUROC results for all the slices
roc_curves = {
slice_name: slice_results["BinaryROC"]
for slice_name, slice_results in results[model_name].items()
}
aurocs = {
slice_name: slice_results["BinaryAUROC"]
for slice_name, slice_results in results[model_name].items()
}
roc_curves.keys()
[42]:
dict_keys(['age:[50-60)', 'age:[60-70)', 'overall'])
[43]:
# plotting the ROC curves for all the slices
roc_plot = plotter.roc_curve_comparison(roc_curves, aurocs=aurocs)
report.log_plotly_figure(
fig=roc_plot,
caption="ROC Curve Comparison",
section_name="quantitative analysis",
)
roc_plot.show()
[44]:
# extracting the precision-recall curves and average precision results for all the slices
pr_curves = {
slice_name: slice_results["BinaryPrecisionRecallCurve"]
for slice_name, slice_results in results[model_name].items()
}
average_precisions = {
slice_name: slice_results["BinaryAveragePrecision"]
for slice_name, slice_results in results[model_name].items()
}
pr_curves.keys()
[44]:
dict_keys(['age:[50-60)', 'age:[60-70)', 'overall'])
[45]:
# plotting the precision-recall curves for all the slices
pr_plot = plotter.precision_recall_curve_comparison(
pr_curves,
auprcs=average_precisions,
)
report.log_plotly_figure(
fig=pr_plot,
caption="Precision-Recall Curve Comparison",
section_name="quantitative analysis",
)
pr_plot.show()
[46]:
# Extracting the overall classification metric values.
overall_performance = {
metric_name: metric_value
for metric_name, metric_value in results[model_name]["overall"].items()
if metric_name not in ["BinaryROC", "BinaryPrecisionRecallCurve"]
}
[47]:
# Plotting the overall classification metric values.
overall_performance_plot = plotter.metrics_value(
overall_performance,
title="Overall Performance",
)
report.log_plotly_figure(
fig=overall_performance_plot,
caption="Overall Performance",
section_name="quantitative analysis",
)
overall_performance_plot.show()
[48]:
# Extracting the metric values for all the slices.
slice_metrics = {
slice_name: {
metric_name: metric_value
for metric_name, metric_value in slice_results.items()
if metric_name not in ["BinaryROC", "BinaryPrecisionRecallCurve"]
}
for slice_name, slice_results in results[model_name].items()
}
[49]:
# Plotting the metric values for all the slices.
slice_metrics_plot = plotter.metrics_comparison_bar(slice_metrics)
report.log_plotly_figure(
fig=slice_metrics_plot,
caption="Slice Metric Comparison",
section_name="quantitative analysis",
)
slice_metrics_plot.show()
[50]:
# Computing the components of the threshold-performance plot.
# ROC curve components
pred_probs = np.array(dataset_with_preds["preds.xgb_classifier"])
true_labels = np.array(dataset_with_preds["outcome"])
roc_curve = binary_roc(true_labels, pred_probs)
ppv = np.zeros_like(roc_curve.thresholds)
npv = np.zeros_like(roc_curve.thresholds)
# Calculate PPV and NPV for each threshold
for i, threshold in enumerate(roc_curve.thresholds):
    ppv[i] = binary_ppv(true_labels, pred_probs, threshold=threshold)
    npv[i] = binary_npv(true_labels, pred_probs, threshold=threshold)
runway_plot = plotter.threshperf(roc_curve, ppv, npv, pred_probs)
report.log_plotly_figure(
fig=runway_plot,
caption="Threshold-Performance plot",
section_name="quantitative analysis",
)
runway_plot.show()
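The threshold-performance plot tracks the positive predictive value (PPV) and negative predictive value (NPV) as the decision threshold moves. A plain-Python sketch of those two quantities follows; ppv_npv is a hypothetical helper, the data are made up, and both the ">= threshold" convention and the choice to return 0.0 when a denominator is empty are assumptions rather than the behaviour of binary_ppv or binary_npv.

```python
def ppv_npv(y_true, y_prob, threshold):
    """Return (PPV, NPV) at the given decision threshold."""
    tp = fp = tn = fn = 0
    for truth, prob in zip(y_true, y_prob):
        predicted = int(prob >= threshold)
        if predicted and truth:
            tp += 1
        elif predicted:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    ppv = tp / (tp + fp) if tp + fp else 0.0  # assumed zero-division convention
    npv = tn / (tn + fn) if tn + fn else 0.0
    return ppv, npv


# Made-up labels and scores, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 0, 1]
y_prob = [0.9, 0.2, 0.4, 0.8, 0.6, 0.1, 0.3, 0.7]

at_half = ppv_npv(y_true, y_prob, threshold=0.5)
at_lower = ppv_npv(y_true, y_prob, threshold=0.35)
print(at_half, at_lower)
```

Lowering the threshold in this toy example catches the remaining positive case, so NPV rises to 1.0 while PPV also improves slightly.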
We can also plot the calibration curve of the model on the test set.
[51]:
calibration_plot = plotter.calibration(
prediction_df, y_true_col="y_true", y_prob_col="y_prob", group_col="gender"
)
report.log_plotly_figure(
fig=calibration_plot,
caption="Calibration plot",
section_name="quantitative analysis",
)
calibration_plot.show()
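A calibration curve compares the mean predicted probability in each bin with the observed rate of positives in that bin; a well-calibrated model has the two roughly equal. The sketch below shows the computation with equal-width bins in plain Python. The binning strategy used by plotter.calibration is not shown in this notebook, so the helper calibration_bins, the number of bins, and the data are all assumptions.

```python
def calibration_bins(y_true, y_prob, n_bins=5):
    """Return (mean predicted prob, observed positive rate, count) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for truth, prob in zip(y_true, y_prob):
        index = min(int(prob * n_bins), n_bins - 1)  # prob == 1.0 goes in the last bin
        bins[index].append((prob, truth))

    curve = []
    for members in bins:
        if members:
            mean_prob = sum(prob for prob, _ in members) / len(members)
            positive_rate = sum(truth for _, truth in members) / len(members)
            curve.append((mean_prob, positive_rate, len(members)))
    return curve


# Made-up labels and scores, for illustration only.
y_true = [0, 0, 1, 0, 1, 1]
y_prob = [0.1, 0.15, 0.3, 0.55, 0.7, 0.9]

curve = calibration_bins(y_true, y_prob, n_bins=2)
for mean_prob, positive_rate, count in curve:
    print(f"mean predicted={mean_prob:.2f} observed={positive_rate:.2f} n={count}")
```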
[52]:
# Reformatting the fairness metrics
fairness_results = copy.deepcopy(results["fairness"])
fairness_metrics = {}
# remove the group size from the fairness results and add it to the slice name
for slice_name, slice_results in fairness_results.items():
    group_size = slice_results.pop("Group Size")
    fairness_metrics[f"{slice_name} (Size={group_size})"] = slice_results
[53]:
# Plotting the fairness metrics
fairness_plot = plotter.metrics_comparison_scatter(
fairness_metrics,
title="Fairness Metrics",
)
report.log_plotly_figure(
fig=fairness_plot,
caption="Fairness Metrics",
section_name="fairness analysis",
)
fairness_plot.show()
Report Generation#
Before generating the model card, let us document some of the details of the model and some considerations involved in developing and using the model.
Let’s start with populating the model details section, which includes the following fields by default:
- description: A high-level description of the model and its usage for a general audience.
- version: The version of the model.
- owners: The individuals or organizations that own the model.
- license: The license under which the model is made available.
- citation: The citation for the model.
- references: Links to resources that are relevant to the model.
- path: The path to where the model is stored.
- regulatory_requirements: The regulatory requirements that are relevant to the model.
We can add additional fields to the model details section by passing a dictionary to the log_from_dict method and specifying the section name as model_details. You can also use the log_descriptor method to add a new field object with a description attribute to any section of the model card.
[54]:
report.log_from_dict(
data={
"name": "Readmission Prediction Model",
"description": "The model was trained on the Diabetes 130-US Hospitals for Years 1999-2008 \
dataset to predict risk of readmission within 30 days of discharge.",
},
section_name="model_details",
)
report.log_version(
version_str="0.0.1",
date=str(date.today()),
description="Initial Release",
)
report.log_owner(
name="CyclOps Team",
contact="vectorinstitute.github.io/cyclops/",
email="cyclops@vectorinstitute.ai",
)
report.log_license(identifier="Apache-2.0")
report.log_reference(
link="https://xgboost.readthedocs.io/en/stable/python/python_api.html", # noqa: E501
)
Next, let’s populate the considerations section, which includes the following fields by default:
- users: The intended users of the model.
- use_cases: The use cases for the model. These could be primary, downstream or out-of-scope use cases.
- fairness_assessment: A description of the benefits and harms of the model for different groups as well as the steps taken to mitigate the harms.
- ethical_considerations: The risks associated with using the model and the steps taken to mitigate them. This can be populated using the log_risk method.
[55]:
report.log_from_dict(
data={
"users": [
{"description": "Hospitals"},
{"description": "Clinicians"},
],
},
section_name="considerations",
)
report.log_user(description="ML Engineers")
report.log_use_case(
description="Predicting risk of readmission.",
kind="primary",
)
report.log_use_case(
description="Predicting risk of pathologies and conditions other \
than risk of readmission.",
kind="out-of-scope",
)
report.log_fairness_assessment(
affected_group="sex, age",
benefit="Improved health outcomes for patients.",
harm="Biased predictions for patients in certain groups (e.g. older patients) \
may lead to worse health outcomes.",
mitigation_strategy="We will monitor the performance of the model on these groups \
and retrain the model if the performance drops below a certain threshold.",
)
report.log_risk(
risk="The model may be used to make decisions that affect the health of patients.",
mitigation_strategy="The model should be continuously monitored for performance \
and retrained if the performance drops below a certain threshold.",
)
Once the model card is populated, you can generate the report using the export method. The report is generated in the form of an HTML file. A JSON file containing the model card data will also be generated along with the HTML file. By default, the files will be saved in a folder named cyclops_reports in the current working directory. You can change the path by passing an output_dir argument when instantiating the ModelCardReport class.
[56]:
synthetic_timestamps = pd.date_range(
start="20/6/2024", periods=3, freq="W"
).values.astype(str)
report_path = report.export(
output_filename="readmission_report_periodic.html",
synthetic_timestamp=synthetic_timestamps[EVAL_NUM - 1],
last_n_evals=3,
)
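The synthetic timestamps above come from a weekly date range. Assuming pandas reads "20/6/2024" as 20 June 2024 and that the "W" frequency alias anchors on Sundays, the three dates should be the first three Sundays on or after the start date. The standard-library sketch below reproduces that logic; weekly_sundays is a hypothetical helper, and the output format differs from the strings pandas produces.

```python
from datetime import date, timedelta


def weekly_sundays(start, periods):
    """Return `periods` consecutive Sundays, beginning on or after `start`."""
    days_until_sunday = (6 - start.weekday()) % 7  # Monday is 0, Sunday is 6
    first = start + timedelta(days=days_until_sunday)
    return [first + timedelta(weeks=i) for i in range(periods)]


timestamps = [d.isoformat() for d in weekly_sundays(date(2024, 6, 20), periods=3)]
print(timestamps)
```

Each evaluation run would then pick one of these dates by its index, so repeated runs appear one week apart in the report.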
You can view the generated HTML report.