Example use cases

Tabular data

Kaggle Heart Failure Prediction

This is a binary classification problem where the goal is to predict the risk of heart disease. The heart failure dataset, available on Kaggle, contains 11 features and 1 target variable.

MIMIC-IV Mortality Prediction

This is a binary classification problem where the goal is to predict the risk of in-hospital mortality. MIMIC-IV is an electronic health record (EHR) dataset collected from a single hospital site, and it includes ICU data.

Synthea Prolonged Length of Stay Prediction

This is a binary classification problem where the goal is to predict whether a patient will have a prolonged length of stay in the hospital (more than 7 days). The dataset is generated with Synthea, a synthetic patient generator, and contains observations, medications, and procedures as features.
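The "more than 7 days" definition above can be turned into a binary target in a few lines. The following is a minimal sketch using only the Python standard library; the function name, the constant, and the example encounters are hypothetical and are not taken from the tutorial or from any real dataset.

```python
from datetime import datetime

# Threshold from the task definition: a stay longer than 7 days is "prolonged".
PROLONGED_LOS_DAYS = 7


def prolonged_stay_label(admitted: datetime, discharged: datetime) -> int:
    """Return 1 if the stay lasted more than 7 days, else 0."""
    length_of_stay_days = (discharged - admitted).total_seconds() / 86400
    return int(length_of_stay_days > PROLONGED_LOS_DAYS)


# Hypothetical encounters as (admission, discharge) pairs, for illustration only.
encounters = [
    (datetime(2020, 1, 1, 8), datetime(2020, 1, 4, 12)),   # a little over 3 days
    (datetime(2020, 2, 10, 9), datetime(2020, 2, 20, 9)),  # exactly 10 days
]
labels = [prolonged_stay_label(a, d) for a, d in encounters]
```

Note that the comparison is strict, so a stay of exactly 7 days is labelled 0 under this reading of the definition.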

Diabetes 130-US Hospitals for Years 1999-2008 Readmission Prediction

This is a binary classification problem where the goal is to predict the risk of readmission. The diabetes dataset, available from the UCI Machine Learning Repository, contains 47 features and 1 target variable.

Image data

NIH Chest X-ray classification

This tutorial showcases the use of the tasks API to implement a chest X-ray classification task. The dataset used is the NIH Chest X-ray dataset, which contains 112,120 frontal-view X-ray images of 30,805 unique patients with 14 disease labels.

The tutorial also demonstrates the use of the evaluate API to evaluate the performance of a model on the task.
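Because each image can carry several of the 14 disease labels at once, the task is naturally multi-label. The sketch below shows one way to turn a label string into a multi-hot target vector. It assumes the findings for an image arrive as a single pipe-separated string, with "No Finding" marking images without any of the 14 diseases; the function name and that input format are assumptions for illustration, not part of the tasks or evaluate APIs.

```python
from typing import List

# The 14 disease labels of the NIH Chest X-ray dataset.
DISEASE_LABELS = [
    "Atelectasis", "Cardiomegaly", "Effusion", "Infiltration", "Mass",
    "Nodule", "Pneumonia", "Pneumothorax", "Consolidation", "Edema",
    "Emphysema", "Fibrosis", "Pleural_Thickening", "Hernia",
]


def encode_findings(findings: str) -> List[int]:
    """Map a pipe-separated findings string to a 14-element multi-hot vector.

    Assumed input format: "Cardiomegaly|Effusion", or "No Finding" when
    none of the 14 diseases is present.
    """
    present = set(findings.split("|"))
    return [int(label in present) for label in DISEASE_LABELS]


# Hypothetical label strings, for illustration only.
two_findings = encode_findings("Cardiomegaly|Effusion")
no_finding = encode_findings("No Finding")
```

Each position in the resulting vector corresponds to one disease, so a model for this task would typically produce 14 independent scores per image rather than a single class.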