MIMIC-IV tutorialΒΆ
This notebook shows examples of how to use the cycquery tool on MIMIC-IV v2.0.
Each query is limit to 100 rows (for quick results).
First, setup the MIMIC-IV database according to the instructions in mimic-code.
The database is assumed to be hosted using postgres. Update the config parameters such as username and password, passed to
MIMICIVQuerier
accordingly.
Imports and instantiate MIMICIVQuerier
ΒΆ
[1]:
"""MIMICIV tutorial."""
import cycquery.ops as qo
from cycquery import MIMICIVQuerier
querier = MIMICIVQuerier(
dbms="postgresql",
port=5432,
host="localhost",
database="mimiciv-2.0",
user="postgres",
password="pwd",
)
# List all schemas.
querier.list_schemas()
2025-01-29 09:50:17,993 INFO cycquery.orm - Database setup, ready to run queries!
[1]:
['fhir_etl',
'fhir_trm',
'information_schema',
'mimic_fhir',
'mimiciv_derived',
'mimiciv_ed',
'mimiciv_hosp',
'mimiciv_icu',
'mimiciv_note',
'public']
Example 1. Get all patient admissions from 2021 or later (approx year of admission)ΒΆ
[2]:
patients = querier.patients()
admissions = querier.mimiciv_hosp.admissions()
patient_admissions = patients.join(admissions, on="subject_id")
ops = qo.Sequential(
qo.AddDeltaColumn(["admittime", "dischtime"], years="anchor_year_difference"),
qo.ConditionAfterDate("admittime", "2021-01-01"),
)
patient_admissions = patient_admissions.ops(ops).run(limit=100)
print(f"{len(patient_admissions)} rows extracted!")
2025-01-29 09:50:20,520 INFO cycquery.orm - Query returned successfully!
2025-01-29 09:50:20,522 INFO cycquery.utils.profile - Finished executing function run_query in 0.244878 s
100 rows extracted!
Example 2. Get all patient encounters with diagnoses (schizophrenia
in ICD-10 long title), in the year 2015.ΒΆ
[3]:
patients = querier.patients()
admissions = querier.mimiciv_hosp.admissions()
patient_admissions = patients.join(admissions, on="subject_id")
ops = qo.Sequential(
qo.AddDeltaColumn(["admittime", "dischtime"], years="anchor_year_difference"),
qo.ConditionInYears("admittime", "2015"),
)
patient_admissions = patient_admissions.ops(ops)
diagnoses = querier.diagnoses_icd()
diagnoses_ops = qo.Sequential(
qo.ConditionEquals("icd_version", 10),
qo.ConditionSubstring("long_title", "schizophrenia"),
)
diagnoses = diagnoses.ops(diagnoses_ops)
patient_admissions_diagnoses = patient_admissions.join(
join_table=diagnoses,
on=["subject_id", "hadm_id"],
).run(limit=100)
print(f"{len(patient_admissions_diagnoses)} rows extracted!")
2025-01-29 09:50:23,176 INFO cycquery.orm - Query returned successfully!
2025-01-29 09:50:23,178 INFO cycquery.utils.profile - Finished executing function run_query in 2.019427 s
100 rows extracted!
Example 3. Advanced - uses ConditionRegexMatch
from cycquery.ops
. Get all patient encounters with diagnoses (ICD-9 long title contains schizophrenia
and chronic
), in the year 2015.ΒΆ
[4]:
patients = querier.patients()
admissions = querier.mimiciv_hosp.admissions()
patient_admissions = patients.join(admissions, on="subject_id")
ops = qo.Sequential(
qo.AddDeltaColumn(["admittime", "dischtime"], years="anchor_year_difference"),
qo.ConditionInYears("admittime", "2015"),
)
patient_admissions = patient_admissions.ops(ops)
diagnoses = querier.diagnoses_icd()
diagnoses_ops = qo.Sequential(
qo.ConditionEquals("icd_version", 9),
qo.ConditionRegexMatch("long_title", r"(?=.*schizophrenia)(?=.*chronic)"),
)
diagnoses = diagnoses.ops(diagnoses_ops)
patient_admissions_diagnoses = patient_admissions.join(
join_table=diagnoses,
on=["subject_id", "hadm_id"],
).run(limit=100)
print(f"{len(patient_admissions_diagnoses)} rows extracted!")
2025-01-29 09:50:26,959 INFO cycquery.orm - Query returned successfully!
2025-01-29 09:50:26,960 INFO cycquery.utils.profile - Finished executing function run_query in 3.714916 s
82 rows extracted!
Example 4. Get routine vital signs for patients from year 2015.ΒΆ
[5]:
patients = querier.patients()
admissions = querier.mimiciv_hosp.admissions()
patient_admissions = patients.join(admissions, on="subject_id")
ops = qo.Sequential(
qo.AddDeltaColumn(["admittime", "dischtime"], years="anchor_year_difference"),
qo.ConditionInYears("admittime", "2015"),
)
patient_admissions = patient_admissions.ops(ops)
chart_events = querier.chartevents()
vitals = chart_events.ops(qo.ConditionEquals("category", "Routine Vital Signs"))
patient_admissions_vitals = patient_admissions.join(
join_table=vitals,
on=["subject_id", "hadm_id"],
).run(limit=100)
print(f"{len(patient_admissions_vitals)} rows extracted!")
2025-01-29 09:50:30,251 INFO cycquery.orm - Query returned successfully!
2025-01-29 09:50:30,253 INFO cycquery.utils.profile - Finished executing function run_query in 3.225813 s
100 rows extracted!
Example 5. Get hemoglobin lab tests for patients from year 2009.ΒΆ
[6]:
patients = querier.patients()
admissions = querier.mimiciv_hosp.admissions()
patient_admissions = patients.join(admissions, on="subject_id")
ops = qo.Sequential(
qo.AddDeltaColumn(["admittime", "dischtime"], years="anchor_year_difference"),
qo.ConditionInYears("admittime", "2009"),
)
patient_admissions = patient_admissions.ops(ops)
chart_events = querier.chartevents()
labs = chart_events.ops(qo.ConditionEquals("label", "hemoglobin"))
patient_admissions_labs = patient_admissions.join(
join_table=labs,
on=["subject_id", "hadm_id"],
).run(limit=100)
print(f"{len(patient_admissions_labs)} rows extracted!")
2025-01-29 09:50:38,984 INFO cycquery.orm - Query returned successfully!
2025-01-29 09:50:38,986 INFO cycquery.utils.profile - Finished executing function run_query in 8.673006 s
100 rows extracted!
Example 6. Get all female patient encounters from year 2015, and return batches partitioned and indexed on subject_id
.ΒΆ
[7]:
patients = querier.patients()
admissions = querier.mimiciv_hosp.admissions()
patient_admissions = patients.join(admissions, on="subject_id")
ops = qo.Sequential(
qo.AddDeltaColumn(["admittime", "dischtime"], years="anchor_year_difference"),
qo.ConditionInYears("admittime", "2015"),
qo.Cast("gender", "str"),
qo.ConditionEquals("gender", "F"),
)
patient_admissions = patient_admissions.ops(ops)
patient_admissions = patient_admissions.run(
index_col="subject_id",
batch_mode=True,
batch_size=1000,
)
patient_admissions_df = next(patient_admissions)
print(f"{len(patient_admissions_df)} rows extracted!")
2025-01-29 09:50:40,128 INFO cycquery.orm - Query returned successfully!
2025-01-29 09:50:40,130 INFO cycquery.utils.profile - Finished executing function run_query in 1.047700 s
999 rows extracted!
Example 7. Running a raw SQL string.ΒΆ
[8]:
data = querier.db.run_query("SELECT * FROM mimiciv_hosp.admissions LIMIT 100")
print(f"{len(data)} rows extracted!")
2025-01-29 09:50:40,472 INFO cycquery.orm - Query returned successfully!
2025-01-29 09:50:40,474 INFO cycquery.utils.profile - Finished executing function run_query in 0.015948 s
100 rows extracted!