cycquery¶
cycquery is a tool for querying relational databases through a simple Python API. It was developed specifically for querying Electronic Health Record (EHR) databases. The tool is a wrapper around SQLAlchemy and lets you write SQL-like queries in Python, including joins, conditions, group-by aggregations, and other common operations.
🐣 Getting Started¶
Installing cycquery using pip¶
python3 -m pip install cycquery
Querying a PostgreSQL database¶
from cycquery import DatasetQuerier
import cycquery.ops as qo
# Connect to the database (replace the placeholder values with your own).
querier = DatasetQuerier(
    dbms="postgresql",
    port=5432,
    host="localhost",
    database="dbname",
    user="username",
    password="password",
)
# List all tables.
querier.list_tables()
# Get some table.
table = querier.schema.sometable()
# Filter based on some condition (e.g. substring match).
table = table.ops(qo.ConditionSubstring("col1", "substr"))
# Run query to get data as a pandas dataframe.
df = table.run()
# Create a sequential list of operations to perform on the query.
ops = qo.Sequential(
    qo.ConditionIn("col2", [1, 2]),
    qo.DropNulls("col3"),
    qo.Distinct("col1"),
)
table = table.ops(ops)
# Run query to get data as a pandas dataframe.
df = table.run()
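To make the chained operations above more concrete, the following sketch shows one plausible SQL-level reading of them, using only Python's standard-library sqlite3 module and an in-memory table. It is an illustration under stated assumptions, not the SQL that cycquery actually emits: the sample rows are made up, and mapping qo.Distinct("col1") to SELECT DISTINCT col1 is an assumption about its semantics.

```python
import sqlite3

# Hypothetical in-memory table standing in for `sometable`; the column names
# follow the example above, and the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sometable (col1 TEXT, col2 INTEGER, col3 TEXT)")
conn.executemany(
    "INSERT INTO sometable VALUES (?, ?, ?)",
    [
        ("has substr here", 1, "x"),   # passes every filter
        ("has substr here", 2, "y"),   # passes; duplicate col1 value
        ("another substr", 3, "z"),    # dropped: col2 not in (1, 2)
        ("substr again", 1, None),     # dropped: col3 is NULL
        ("no match", 1, "w"),          # dropped: no substring match
    ],
)

# Assumed rough equivalents of the operations in the example:
#   ConditionSubstring("col1", "substr") -> col1 LIKE '%substr%'
#   ConditionIn("col2", [1, 2])          -> col2 IN (1, 2)
#   DropNulls("col3")                    -> col3 IS NOT NULL
#   Distinct("col1")                     -> SELECT DISTINCT col1
query = """
    SELECT DISTINCT col1
    FROM sometable
    WHERE col1 LIKE '%' || ? || '%'
      AND col2 IN (?, ?)
      AND col3 IS NOT NULL
"""
rows = conn.execute(query, ("substr", 1, 2)).fetchall()
conn.close()

print(rows)
```

Only the rows that pass all three filters survive, and the duplicate col1 value collapses to a single result. Against a real database, cycquery builds the query through SQLAlchemy, so the generated SQL may differ in form from this hand-written statement.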
Using uv¶
The development environment can be set up using uv. Make sure uv is installed, then run:
uv sync
source .venv/bin/activate
To install the dependencies for testing (code style, unit tests, integration tests), run:
uv sync --dev
source .venv/bin/activate
The API documentation is built with Sphinx. To build it locally, run:
uv sync --group docs
cd docs
make html SPHINXOPTS="-D nbsphinx_allow_errors=True"
Contributing¶
Contributions to cycquery are welcome. See Contributing for guidelines.