cycquery
cycquery is a tool for querying relational databases through a simple Python API. It was developed specifically for querying Electronic Health Record (EHR) databases. The tool is a wrapper around SQLAlchemy and lets you write SQL-like queries in Python, including joins, conditions, group-by aggregations and more.
🐣 Getting Started
Installing cycquery using pip
python3 -m pip install cycquery
Querying a PostgreSQL database
from cycquery import DatasetQuerier
import cycquery.ops as qo
# Connect to the database (replace the placeholder values with your own).
querier = DatasetQuerier(
    dbms="postgresql",
    port=5432,
    host="localhost",
    database="dbname",
    user="username",
    password="password",
)
# List all tables.
querier.list_tables()
# Get some table.
table = querier.schema.sometable()
# Filter based on some condition (e.g. substring match).
table = table.ops(qo.ConditionSubstring("col1", "substr"))
# Run query to get data as a pandas dataframe.
df = table.run()
# Create a sequential list of operations to perform on the query.
ops = qo.Sequential(
    qo.ConditionIn("col2", [1, 2]),
    qo.DropNulls("col3"),
    qo.Distinct("col1"),
)
table = table.ops(ops)
# Run query to get data as a pandas dataframe.
df = table.run()
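For readers who think in SQL, the chain of operations above can be read as a set of familiar clauses. The sketch below is not cycquery code and does not show the SQL that cycquery actually emits. It uses Python's built-in sqlite3 module and a hypothetical in-memory table to illustrate one assumed reading: ConditionSubstring as LIKE, ConditionIn as IN, DropNulls as IS NOT NULL, and Distinct as a plain DISTINCT on col1. How Distinct treats the remaining columns on PostgreSQL is not covered here.

```python
import sqlite3

# Hypothetical in-memory table standing in for `sometable`; the column
# names mirror the placeholders used in the example above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sometable (col1 TEXT, col2 INTEGER, col3 TEXT)")
conn.executemany(
    "INSERT INTO sometable VALUES (?, ?, ?)",
    [
        ("has substr a", 1, "x"),
        ("has substr a", 1, "y"),   # same col1 as the row above
        ("has substr b", 2, None),  # col3 is NULL
        ("has substr c", 3, "z"),   # col2 is not in (1, 2)
        ("no match", 1, "w"),       # col1 lacks the substring
    ],
)

# Assumed SQL analogue of:
# ConditionSubstring -> ConditionIn -> DropNulls -> Distinct
rows = conn.execute(
    """
    SELECT DISTINCT col1
    FROM sometable
    WHERE col1 LIKE '%' || ? || '%'
      AND col2 IN (?, ?)
      AND col3 IS NOT NULL
    """,
    ("substr", 1, 2),
).fetchall()
conn.close()

print(rows)
```

Only the first two sample rows pass all three filters, and they share the same col1 value, so a single row remains after de-duplication.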
🧑🏿💻 Developing
Using poetry
The development environment can be set up using poetry. Make sure it is installed, then run:
python3 -m poetry install
source $(poetry env info --path)/bin/activate
To install the dependencies for testing (code style, unit tests, integration tests), run:
python3 -m poetry install --with test
The API documentation is built using Sphinx and can be built locally with:
python3 -m poetry install --with docs
cd docs
make html SPHINXOPTS="-D nbsphinx_allow_errors=True"
Contributing
Contributions to cycquery are welcome. See Contributing for guidelines.