How is your model doing?
A quick glance of your most important metrics.
PPV
The proportion of correctly predicted positive instances among all instances predicted as positive. Also known as precision.
0.14 ▼ (minimum threshold: 0.7)
NPV
The proportion of correctly predicted negative instances among all instances predicted as negative.
0.93 ▲ (minimum threshold: 0.7)
Sensitivity
The proportion of actual positive instances that are correctly predicted. Also known as recall or true positive rate.
0.82 ▲ (minimum threshold: 0.7)
Specificity
The proportion of actual negative instances that are correctly predicted.
0.32 ▼ (minimum threshold: 0.7)
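The four definitions above reduce to ratios of confusion-matrix counts. The following is a minimal Python sketch of that arithmetic for a single binary label. The `Confusion` class, the helper names, and the example counts are hypothetical and are not taken from this report; only the 0.7 minimum threshold comes from the dashboard.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def safe_ratio(num, den):
    """Return num / den, or None when the denominator is zero."""
    return num / den if den else None


def summary_metrics(c):
    """Compute the four dashboard metrics from confusion-matrix counts."""
    return {
        "PPV": safe_ratio(c.tp, c.tp + c.fp),          # precision
        "NPV": safe_ratio(c.tn, c.tn + c.fn),
        "Sensitivity": safe_ratio(c.tp, c.tp + c.fn),  # recall
        "Specificity": safe_ratio(c.tn, c.tn + c.fp),
    }


def below_threshold(metrics, minimum=0.7):
    """Names of metrics that are defined and fall under the minimum."""
    return [name for name, value in metrics.items()
            if value is not None and value < minimum]


# Hypothetical counts, chosen only to illustrate the calculation.
counts = Confusion(tp=40, fp=60, tn=90, fn=10)
metrics = summary_metrics(counts)
flagged = below_threshold(metrics)
```

With these made-up counts, `flagged` contains the metrics that would be marked as under the minimum threshold.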
How is your model doing over time?
See how your model is performing over several metrics and subgroups over time.
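One way to obtain per-subgroup numbers of the kind described here is to group evaluation records by an attribute and compute the metric within each group. The Python sketch below does this for sensitivity. The record fields (`gender`, `label`, `pred`) and the sample rows are hypothetical and do not come from this report.

```python
from collections import defaultdict


def sensitivity_by_group(records, group_key):
    """Sensitivity, TP / (TP + FN), for each value of group_key.

    Groups with no actual positives are left out because the
    ratio is undefined for them.
    """
    tp = defaultdict(int)
    fn = defaultdict(int)
    for r in records:
        if r["label"] == 1:
            if r["pred"] == 1:
                tp[r[group_key]] += 1
            else:
                fn[r[group_key]] += 1
    groups = set(tp) | set(fn)
    return {g: tp[g] / (tp[g] + fn[g]) for g in sorted(groups)}


# Hypothetical evaluation records for a single pathology.
records = [
    {"gender": "F", "label": 1, "pred": 1},
    {"gender": "F", "label": 1, "pred": 0},
    {"gender": "F", "label": 0, "pred": 0},
    {"gender": "M", "label": 1, "pred": 1},
    {"gender": "M", "label": 1, "pred": 1},
    {"gender": "M", "label": 0, "pred": 1},
]
per_gender = sensitivity_by_group(records, "gender")
```

Repeating the same computation for each evaluation run would give one point per subgroup per run, which is the shape of data a metric-over-time plot needs.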
Datasets
Graphics
Quantitative Analysis
PPV
The proportion of correctly predicted positive instances among all instances predicted as positive. Also known as precision.
0.14 ▼ (minimum threshold: 0.7)
NPV
The proportion of correctly predicted negative instances among all instances predicted as negative.
0.93 ▲ (minimum threshold: 0.7)
Sensitivity
The proportion of actual positive instances that are correctly predicted. Also known as recall or true positive rate.
0.82 ▲ (minimum threshold: 0.7)
Specificity
The proportion of actual negative instances that are correctly predicted.
0.32 ▼ (minimum threshold: 0.7)
Model Details
Description
This model is a DenseNet121 trained on the NIH Chest X-Ray dataset, which contains 112,120 frontal-view X-ray images of 30,805 unique patients, with fourteen disease labels text-mined from the associated radiological reports. The labels are Atelectasis, Cardiomegaly, Effusion, Infiltration, Mass, Nodule, Pneumonia, Pneumothorax, Consolidation, Edema, Emphysema, Fibrosis, Pleural Thickening, and Hernia. The model was trained on 80% of the data and evaluated on the remaining 20%.
Owners
- Name: Machine Learning and Medicine Lab
  Contact: mlmed.org
  Email: joseph@josephpcohen.com
Citations
- @inproceedings{Cohen2022xrv, title = {{TorchXRayVision: A library of chest X-ray datasets and models}}, author = {Cohen, Joseph Paul and Viviano, Joseph D. and Bertin, Paul and Morrison, Paul and Torabian, Parsa and Guarrera, Matteo and Lungren, Matthew P and Chaudhari, Akshay and Brooks, Rupert and Hashir, Mohammad and Bertrand, Hadrien}, booktitle = {Medical Imaging with Deep Learning}, url = {https://github.com/mlmed/torchxrayvision}, arxivId = {2111.00595}, year = {2022} }
- @inproceedings{cohen2020limits, title={On the limits of cross-domain generalization in automated X-ray prediction}, author={Cohen, Joseph Paul and Hashir, Mohammad and Brooks, Rupert and Bertrand, Hadrien}, booktitle={Medical Imaging with Deep Learning}, year={2020}, url={https://arxiv.org/abs/2002.02497} }
References
Name
NIH Chest X-Ray Multi-label Classification Model
Considerations
Users
- Radiologists
- Data Scientists
Use Cases
- The model can be used to predict the presence of 14 pathologies in chest X-ray images.
  Kind: primary
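As an illustration of this use case, the Python sketch below turns one score per pathology into a list of predicted findings. It makes several assumptions that this report does not confirm: that the model emits one score in [0, 1] per label, that the scores follow the label order given in the model description, and that a cutoff of 0.5 is appropriate. The function name and the example scores are hypothetical.

```python
# Label order copied from the model description; whether the model's
# outputs actually follow this order is an assumption.
PATHOLOGIES = [
    "Atelectasis", "Cardiomegaly", "Effusion", "Infiltration",
    "Mass", "Nodule", "Pneumonia", "Pneumothorax",
    "Consolidation", "Edema", "Emphysema", "Fibrosis",
    "Pleural Thickening", "Hernia",
]


def present_labels(scores, cutoff=0.5):
    """Return the pathologies whose score is at or above the cutoff.

    The default cutoff of 0.5 is an illustrative assumption, not a
    value recommended by this report.
    """
    if len(scores) != len(PATHOLOGIES):
        raise ValueError(f"expected {len(PATHOLOGIES)} scores, got {len(scores)}")
    return [name for name, s in zip(PATHOLOGIES, scores) if s >= cutoff]


# Hypothetical scores for one image.
scores = [0.1] * len(PATHOLOGIES)
scores[1] = 0.8    # Cardiomegaly
scores[7] = 0.65   # Pneumothorax
findings = present_labels(scores)
```

Because this is a multi-label task, any number of the 14 labels (including none) can be flagged for a single image. In practice a separate cutoff per pathology may be preferable to a single shared one.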
Fairness Assessment
- Affected Group: Patients with rare pathologies
  Benefits: The model can help radiologists detect pathologies in chest X-ray images.
  Harms: The model may not generalize well to populations that are not well represented in the training data.
  A mitigation strategy for this risk is to ensure that the training data is diverse and representative of the population.
Ethical Considerations
- Risk: One ethical risk of the model is that it may not generalize well to populations that are not well represented in the training data, such as patients from different geographic regions or with different demographics.
  A mitigation strategy for this risk is to ensure that the training data is diverse and representative of the population that the model will be used on. Additionally, the model should be regularly evaluated and updated to ensure that it continues to perform well on diverse populations. Finally, the model should be used in conjunction with human expertise to ensure that any biases or limitations are identified and addressed.
Limitations
- The model cannot detect pathologies outside the 14 labels of the NIH Chest X-Ray dataset. It may also perform poorly on images that are of poor quality or that contain artifacts. Finally, it may not generalize well to populations that are not well represented in the training data, such as patients from different geographic regions or with different demographics.
Tradeoffs
- The model can help radiologists detect pathologies in chest X-ray images, but it may not generalize well to populations that are not well represented in the training data.