Chest X-Ray Disease Classification#
This notebook demonstrates chest X-ray disease classification on the NIH Chest X-Ray dataset using a pretrained model from the TorchXRayVision library, and uses CyclOps to evaluate the model and generate a model card report.
Import Libraries#
[1]:
"""Chest X-ray Disease Classification."""
import shutil
from functools import partial
import numpy as np
import plotly.express as px
from torchvision.transforms import Compose
from torchxrayvision.models import DenseNet
from cyclops.data.loader import load_nihcxr
from cyclops.data.slicer import (
    SliceSpec,
    filter_value,  # noqa: E402
)
from cyclops.data.transforms import Lambdad, Resized
from cyclops.data.utils import apply_transforms
from cyclops.evaluate import evaluator
from cyclops.evaluate.metrics.factory import create_metric
from cyclops.models.wrappers import PTModel
from cyclops.report import ModelCardReport
Generate Historical Reports#
CyclOps provides tooling for documenting a model through a model report. The ModelCardReport
class is used to populate the report and export it as an HTML file. The model report has the following sections:
Overview: A high-level overview of how the model is performing (a quick glance at important metrics) and how its performance changes over time (across several metrics and subgroups).
Datasets: High-level statistics of the training data, including changes in its distribution over time.
Quantitative Analysis: Detailed performance metrics of the model for different slices of the data and subpopulations.
Fairness Analysis: Fairness metrics of the model.
Model Details: Descriptive metadata about the model, such as the owners, version, and license.
Model Parameters: Technical details of the model, such as the architecture and training parameters.
Considerations: Considerations involved in developing and using the model, such as the intended use and limitations.
We will use the report to document the model development process as we go along and generate the model report at the end.
The model report tool is a work in progress and is subject to change.
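As a preview, here is a minimal sketch of how a report is populated and exported. The calls mirror those used later in this notebook; the owner name, descriptor text, and output filename are placeholder values:

from cyclops.report import ModelCardReport

report = ModelCardReport()
# placeholder owner metadata for the model details section
report.log_owner(name="Example Lab", contact="example.org")
# placeholder limitation text for the considerations section
report.log_descriptor(
    name="limitations",
    description="Example limitation text.",
    section_name="considerations",
)
# export the populated report as an HTML file (placeholder filename)
report_path = report.export(output_filename="example_report.html")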
Initialize Periodic Report#
[3]:
report = ModelCardReport()
Load Dataset#
[4]:
data_dir = "/mnt/data/clinical_datasets/NIHCXR"
nih_ds = load_nihcxr(data_dir)["test"]
nih_ds = nih_ds.select(range(1000))
transforms = Compose(
    [
        # resize images to the 224x224 input size expected by the pretrained model
        Resized(
            keys=("image",),
            spatial_size=(224, 224),
            allow_missing_keys=True,
        ),
        # rescale pixel intensities from [0, 255] to [-1024, 1024], the range used by TorchXRayVision
        Lambdad(
            keys=("image",),
            func=lambda x: ((2 * (x / 255.0)) - 1.0) * 1024,
            allow_missing_keys=True,
        ),
        # keep a single channel if the image has more than one
        Lambdad(
            keys=("image",),
            func=lambda x: x[0][np.newaxis, :] if x.shape[0] != 1 else x,
            allow_missing_keys=True,
        ),
    ],
)
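As a quick sanity check on the intensity rescaling above (a standalone sketch, not part of the pipeline), the lambda maps 0 to -1024, 127.5 to 0, and 255 to 1024:

import numpy as np

def rescale(x):
    """Map pixel values from [0, 255] to [-1024, 1024], same formula as the Lambdad above."""
    return ((2 * (x / 255.0)) - 1.0) * 1024

print(rescale(np.array([0.0, 127.5, 255.0])))  # [-1024., 0., 1024.]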
Model Creation#
[5]:
model = PTModel(DenseNet(weights="densenet121-res224-nih"))
model.initialize()
nih_ds = model.predict(
    nih_ds,
    feature_columns=["image"],
    transforms=partial(apply_transforms, transforms=transforms),
    model_name="densenet",
)
# remove any rows with No Finding == 1
nih_ds = nih_ds.filter(
    partial(filter_value, column_name="No Finding", value=1, negate=True),
    batched=True,
)
# remove the No Finding column and adjust the predictions to account for it
nih_ds = nih_ds.map(
    lambda x: {
        "predictions.densenet": x["predictions.densenet"][:14],
    },
    remove_columns=["No Finding"],
)
print(nih_ds.features)
Filter: 100%|██████████| 1000/1000 [00:00<00:00, 49411.02 examples/s]
Map: 100%|██████████| 661/661 [00:00<00:00, 967.82 examples/s]
{'Image Index': Value(dtype='string', id=None), 'Finding Labels': Value(dtype='string', id=None), 'Follow-up #': Value(dtype='int64', id=None), 'Patient ID': Value(dtype='int64', id=None), 'Patient Age': Value(dtype='int64', id=None), 'Patient Gender': Value(dtype='string', id=None), 'View Position': Value(dtype='string', id=None), 'OriginalImage[Width': Value(dtype='int64', id=None), 'Height]': Value(dtype='int64', id=None), 'OriginalImagePixelSpacing[x': Value(dtype='float64', id=None), 'y]': Value(dtype='float64', id=None), 'Unnamed: 11': Value(dtype='float64', id=None), 'image': Image(mode=None, decode=True, id=None), 'Atelectasis': Value(dtype='float32', id=None), 'Cardiomegaly': Value(dtype='float32', id=None), 'Consolidation': Value(dtype='float32', id=None), 'Edema': Value(dtype='float32', id=None), 'Effusion': Value(dtype='float32', id=None), 'Emphysema': Value(dtype='float32', id=None), 'Fibrosis': Value(dtype='float32', id=None), 'Hernia': Value(dtype='float32', id=None), 'Infiltration': Value(dtype='float32', id=None), 'Mass': Value(dtype='float32', id=None), 'Nodule': Value(dtype='float32', id=None), 'Pleural_Thickening': Value(dtype='float32', id=None), 'Pneumonia': Value(dtype='float32', id=None), 'Pneumothorax': Value(dtype='float32', id=None), '__index_level_0__': Value(dtype='int64', id=None), 'timestamp': Value(dtype='timestamp[ns]', id=None), 'predictions.densenet': Sequence(feature=Value(dtype='float32', id=None), length=-1, id=None)}
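Each row now carries a predictions.densenet sequence with one score per pathology. A quick check (a sketch, assuming the filtering and mapping above have run):

# the prediction column should hold 14 scores per image, one for each pathology
print(len(nih_ds["predictions.densenet"][0]))  # expected: 14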
Multilabel Performance Metrics by Pathology and Sex#
[6]:
pathologies = model.model.pathologies[:14]
# define the slices
slices = [
    {"Patient Gender": {"value": "M"}},
    {"Patient Gender": {"value": "F"}},
]
num_labels = len(pathologies)
# per-label (average=None) multilabel metrics
ppv = create_metric(
    metric_name="multilabel_ppv",
    experimental=True,
    num_labels=num_labels,
    average=None,
)
npv = create_metric(
    metric_name="multilabel_npv",
    experimental=True,
    num_labels=num_labels,
    average=None,
)
specificity = create_metric(
    metric_name="multilabel_specificity",
    experimental=True,
    num_labels=num_labels,
    average=None,
)
sensitivity = create_metric(
    metric_name="multilabel_sensitivity",
    experimental=True,
    num_labels=num_labels,
    average=None,
)
# create the slice functions
slice_spec = SliceSpec(spec_list=slices)
nih_eval_results_gender = evaluator.evaluate(
    dataset=nih_ds,
    metrics=[ppv, npv, sensitivity, specificity],
    target_columns=pathologies,
    prediction_columns="predictions.densenet",
    ignore_columns="image",
    slice_spec=slice_spec,
)
Filter -> Patient Gender:M: 100%|██████████| 661/661 [00:00<00:00, 53975.18 examples/s]
Filter -> Patient Gender:F: 100%|██████████| 661/661 [00:00<00:00, 53743.94 examples/s]
Filter -> overall: 100%|██████████| 661/661 [00:00<00:00, 61571.35 examples/s]
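The evaluation results are returned as a nested dictionary keyed by model and slice. A minimal sketch of inspecting them (the key structure mirrors the flattening loop used later in this notebook):

# each entry maps a slice name (e.g. "Patient Gender:M" or "overall") to its metric values
gender_results = nih_eval_results_gender["model_for_predictions.densenet"]
for slice_name, metrics in gender_results.items():
    for metric_name, values in metrics.items():
        # `values` holds one score per pathology; print the mean across labels
        print(slice_name, metric_name, values.mean())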
Multilabel Performance Metrics by Pathology and Age#
[7]:
# define the slices
slices = [
    {"Patient Age": {"min_value": 19, "max_value": 35}},
    {"Patient Age": {"min_value": 35, "max_value": 65}},
    {"Patient Age": {"min_value": 65, "max_value": 100}},
    {
        "Patient Age": {"min_value": 19, "max_value": 35},
        "Patient Gender": {"value": "M"},
    },
    {
        "Patient Age": {"min_value": 19, "max_value": 35},
        "Patient Gender": {"value": "F"},
    },
    {
        "Patient Age": {"min_value": 35, "max_value": 65},
        "Patient Gender": {"value": "M"},
    },
    {
        "Patient Age": {"min_value": 35, "max_value": 65},
        "Patient Gender": {"value": "F"},
    },
    {
        "Patient Age": {"min_value": 65, "max_value": 100},
        "Patient Gender": {"value": "M"},
    },
    {
        "Patient Age": {"min_value": 65, "max_value": 100},
        "Patient Gender": {"value": "F"},
    },
]
# create the slice functions
slice_spec = SliceSpec(spec_list=slices)
nih_eval_results_age = evaluator.evaluate(
    dataset=nih_ds,
    metrics=[ppv, npv, sensitivity, specificity],
    target_columns=pathologies,
    prediction_columns="predictions.densenet",
    ignore_columns="image",
    slice_spec=slice_spec,
)
Filter -> Patient Age:[19 - 35]: 100%|██████████| 661/661 [00:00<00:00, 43582.84 examples/s]
Filter -> Patient Age:[35 - 65]: 100%|██████████| 661/661 [00:00<00:00, 46122.69 examples/s]
Filter -> Patient Age:[65 - 100]: 100%|██████████| 661/661 [00:00<00:00, 45540.83 examples/s]
Filter -> Patient Age:[19 - 35]&Patient Gender:M: 100%|██████████| 661/661 [00:00<00:00, 40738.75 examples/s]
Filter -> Patient Age:[19 - 35]&Patient Gender:F: 100%|██████████| 661/661 [00:00<00:00, 40802.90 examples/s]
Filter -> Patient Age:[35 - 65]&Patient Gender:M: 100%|██████████| 661/661 [00:00<00:00, 34249.16 examples/s]
Filter -> Patient Age:[35 - 65]&Patient Gender:F: 100%|██████████| 661/661 [00:00<00:00, 42508.32 examples/s]
Filter -> Patient Age:[65 - 100]&Patient Gender:M: 100%|██████████| 661/661 [00:00<00:00, 41037.85 examples/s]
Filter -> Patient Age:[65 - 100]&Patient Gender:F: 100%|██████████| 661/661 [00:00<00:00, 41257.09 examples/s]
Filter -> overall: 100%|██████████| 661/661 [00:00<00:00, 63095.92 examples/s]
[8]:
fig = px.pie(
    values=[nih_ds["Patient Gender"].count("M"), nih_ds["Patient Gender"].count("F")],
    names=["Male", "Female"],
)
fig.update_layout(
    title="Gender Distribution",
)
report.log_plotly_figure(
    fig=fig,
    caption="Gender Distribution",
    section_name="datasets",
)
fig.show()
[9]:
fig = px.histogram(nih_ds["Patient Age"])
fig.update_traces(showlegend=False)
fig.update_layout(
    title="Age Distribution",
    xaxis_title="Age",
    yaxis_title="Count",
    bargap=0.2,
)
report.log_plotly_figure(
    fig=fig,
    caption="Age Distribution",
    section_name="datasets",
)
fig.show()
[10]:
fig = px.bar(x=pathologies, y=[np.array(nih_ds[p]).sum() for p in pathologies])
fig.update_layout(
    title="Pathology Distribution",
    xaxis_title="Pathology",
    yaxis_title="Count",
    bargap=0.2,
)
report.log_plotly_figure(
    fig=fig,
    caption="Pathology Distribution",
    section_name="datasets",
)
fig.show()
Log Performance Metrics as Tests with Thresholds#
[11]:
# flatten the results into a single dictionary keyed by slice and metric name
results_flat = {}
for slice_, metrics in nih_eval_results_age["model_for_predictions.densenet"].items():
    for name, metric in metrics.items():
        results_flat[f"{slice_}/{name}"] = metric.mean()
        for itr, m in enumerate(metric):
            if slice_ == "overall":
                results_flat[f"pathology:{pathologies[itr]}/{name}"] = m
            else:
                results_flat[f"{slice_}&pathology:{pathologies[itr]}/{name}"] = m
for slice_, metrics in nih_eval_results_gender[
    "model_for_predictions.densenet"
].items():
    for name, metric in metrics.items():
        results_flat[f"{slice_}/{name}"] = metric.mean()
        for itr, m in enumerate(metric):
            if slice_ == "overall":
                results_flat[f"pathology:{pathologies[itr]}/{name}"] = m
            else:
                results_flat[f"{slice_}&pathology:{pathologies[itr]}/{name}"] = m
descriptions = {
    "MultilabelPPV": "The proportion of correctly predicted positive instances among all instances predicted as positive. Also known as precision.",
    "MultilabelNPV": "The proportion of correctly predicted negative instances among all instances predicted as negative.",
    "MultilabelSensitivity": "The proportion of actual positive instances that are correctly predicted. Also known as recall or true positive rate.",
    "MultilabelSpecificity": "The proportion of actual negative instances that are correctly predicted.",
}
# log each metric as a test with a pass/fail threshold of 0.7
for name, metric in results_flat.items():
    split, name = name.split("/")  # noqa: PLW2901
    report.log_quantitative_analysis(
        "performance",
        name=name,
        value=metric.tolist() if isinstance(metric, np.generic) else metric,
        description=descriptions[name],
        metric_slice=split,
        pass_fail_thresholds=0.7,
        pass_fail_threshold_fns=lambda x, threshold: bool(x >= threshold),
    )
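For reference, the metric descriptions above correspond to the standard confusion-matrix ratios, computed per label here: PPV = TP / (TP + FP), NPV = TN / (TN + FN), sensitivity = TP / (TP + FN), and specificity = TN / (TN + FP), where TP, FP, TN, and FN are the per-label true positive, false positive, true negative, and false negative counts.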
Populate Model Card Fields#
[12]:
# model details for NIH Chest X-Ray model
report.log_from_dict(
data={
"name": "NIH Chest X-Ray Multi-label Classification Model",
"description": "This model is a DenseNet121 model trained on the NIH Chest \
X-Ray dataset, which contains 112,120 frontal-view X-ray images of 30,805 \
unique patients with the fourteen text-mined disease labels from the \
associated radiological reports. The labels are Atelectasis, Cardiomegaly, \
Effusion, Infiltration, Mass, Nodule, Pneumonia, Pneumothorax, \
Consolidation, Edema, Emphysema, Fibrosis, Pleural Thickening, and Hernia. \
The model was trained on 80% of the data and evaluated on the remaining \
20%.",
"references": [{"link": "https://arxiv.org/abs/2111.00595"}],
},
section_name="Model Details",
)
report.log_citation(
citation="""@inproceedings{Cohen2022xrv,
title = {{TorchXRayVision: A library of chest X-ray datasets and models}},
author = {Cohen, Joseph Paul and Viviano, Joseph D. and Bertin, \
Paul and Morrison,Paul and Torabian, Parsa and Guarrera, \
Matteo and Lungren, Matthew P and Chaudhari,\
Akshay and Brooks, Rupert and Hashir, \
Mohammad and Bertrand, Hadrien},
booktitle = {Medical Imaging with Deep Learning},
url = {https://github.com/mlmed/torchxrayvision},
arxivId = {2111.00595},
year = {2022}
}""",
)
report.log_citation(
citation="""@inproceedings{cohen2020limits,
title={On the limits of cross-domain generalization\
in automated X-ray prediction},
author={Cohen, Joseph Paul and Hashir, Mohammad and Brooks, \
Rupert and Bertrand, Hadrien},
booktitle={Medical Imaging with Deep Learning},
year={2020},
url={https://arxiv.org/abs/2002.02497}
}""",
)
report.log_owner(
name="Machine Learning and Medicine Lab",
contact="mlmed.org",
email="joseph@josephpcohen.com",
)
# considerations
report.log_user(description="Radiologists")
report.log_user(description="Data Scientists")
report.log_use_case(
description="The model can be used to predict the presence of 14 pathologies \
in chest X-ray images.",
kind="primary",
)
report.log_descriptor(
name="limitations",
description="The limitations of this model include its inability to detect \
pathologies that are not included in the 14 labels of the NIH \
Chest X-Ray dataset. Additionally, the model may not perform \
well on images that are of poor quality or that contain \
artifacts. Finally, the model may not generalize well to\
populations that are not well-represented in the training \
data, such as patients from different geographic regions or \
with different demographics.",
section_name="considerations",
)
report.log_descriptor(
name="tradeoffs",
description="The model can help radiologists to detect pathologies in \
chest X-ray images, but it may not generalize well to populations \
that are not well-represented in the training data.",
section_name="considerations",
)
report.log_risk(
risk="One ethical risk of the model is that it may not generalize well to \
populations that are not well-represented in the training data,\
such as patients from different geographic regions \
or with different demographics. ",
mitigation_strategy="A mitigation strategy for this risk is to ensure \
that the training data is diverse and representative of the population \
that the model will be used on. Additionally, the model should be \
regularly evaluated and updated to ensure that it continues to \
perform well on diverse populations. Finally, the model should \
be used in conjunction with human expertise to ensure that \
any biases or limitations are identified and addressed.",
)
report.log_fairness_assessment(
affected_group="Patients with rare pathologies",
benefit="The model can help radiologists to detect pathologies in \
chest X-ray images.",
harm="The model may not generalize well to populations that are not \
well-represented in the training data.",
mitigation_strategy="A mitigation strategy for this risk is to ensure that \
the training data is diverse and representative of the population.",
)
[13]:
report_path = report.export(
output_filename="nihcxr_report_periodic.html",
synthetic_timestamp="2023-11-06",
)
shutil.copy(f"{report_path}", ".")
[13]:
'./nihcxr_report_periodic.html'
You can view the generated HTML report.
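For example, to open the copied report from Python (a minimal sketch using only the standard library; the filename matches the copy made above):

from pathlib import Path
import webbrowser

# open the report that was copied to the current working directory
webbrowser.open(Path("nihcxr_report_periodic.html").resolve().as_uri())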