cycquery.interface.QueryInterface

class QueryInterface(database, query)

Bases: object

An interface dataclass that wraps queries and runs them.

Parameters:

Initialize the QueryInterface object, join and chain operations.

Methods

clear_data

Clear data container.

join

Join the query with another table.

ops

Chain operations with the query.

run

Run the query, and fetch data.

save

Save the query.

union

Union the query with another query.

union_all

Combine the query with another query using UNION ALL.

Attributes

data

Get data.

__init__(database, query)

Initialize the QueryInterface object, join and chain operations.

clear_data()

Clear data container.

Sets the data attribute to None, clearing the stored dataframe.

Return type:

None

property data: DataFrame | None

Get data.

join(join_table, on=None, on_to_type=None, cond=None, table_cols=None, join_table_cols=None, isouter=False)

Join the query with another table.

Parameters:
Returns:

QueryInterface object with the join operation added.

Return type:

QueryInterface
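
The effect of an isouter flag can be illustrated in plain SQL. The sketch below assumes isouter carries its usual meaning (False for an inner join, True for a left outer join, as in SQLAlchemy); this page does not confirm that. It uses the standard-library sqlite3 module, and the patients and labs tables are hypothetical.

```python
import sqlite3

# Hypothetical tables, used only to contrast the two join types.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE labs (patient_id INTEGER, test TEXT)")
conn.executemany("INSERT INTO patients VALUES (?, ?)", [(1, "Ann"), (2, "Bo")])
conn.execute("INSERT INTO labs VALUES (1, 'glucose')")

# Inner join: rows without a match in the joined table are dropped.
inner_rows = conn.execute(
    "SELECT p.name, l.test FROM patients p "
    "JOIN labs l ON p.id = l.patient_id ORDER BY p.id"
).fetchall()

# Left outer join: unmatched rows are kept, with NULL (None) for the
# columns of the joined table.
outer_rows = conn.execute(
    "SELECT p.name, l.test FROM patients p "
    "LEFT OUTER JOIN labs l ON p.id = l.patient_id ORDER BY p.id"
).fetchall()
```

Here inner_rows holds only the patient with a lab result, while outer_rows also keeps the patient without one.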

ops(ops)

Chain operations with the query.

Parameters:

ops (Union[QueryOp, Sequential]) – Operations to perform on the query.

Returns:

QueryInterface object with the operations added.

Return type:

QueryInterface
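
Because ops returns a QueryInterface, calls can be chained before anything is executed. The following is a minimal sketch of that chaining pattern, not cycquery code: MiniQuery is a hypothetical in-memory stand-in, its operations are plain callables rather than QueryOp objects, and it returns a new object on each call, which this page does not state about the real method.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Rows = List[dict]
Op = Callable[[Rows], Rows]


@dataclass(frozen=True)
class MiniQuery:
    """Hypothetical stand-in for a chainable query wrapper."""

    rows: Tuple[dict, ...]
    pending: Tuple[Op, ...] = ()

    def ops(self, *new_ops: Op) -> "MiniQuery":
        # Record the operations without running them; return a new wrapper.
        return MiniQuery(self.rows, self.pending + new_ops)

    def run(self) -> Rows:
        # Apply the recorded operations in the order they were added.
        result = list(self.rows)
        for op in self.pending:
            result = op(result)
        return result


base = MiniQuery(rows=({"id": 1, "age": 70}, {"id": 2, "age": 35}))
older = base.ops(lambda rs: [r for r in rs if r["age"] >= 65])
result = older.run()
```

In this sketch the base object is left unchanged, so one base query can feed several derived queries.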

run(limit=None, index_col=None, batch_mode=False, batch_size=1000000, dtype_backend='pyarrow')

Run the query, and fetch data.

Parameters:
  • limit (Optional[int]) – Maximum number of rows the query returns.

  • backend – Backend computing framework to use: pandas, dask, or datasets.

  • index_col (Optional[str]) – Column that becomes the index and defines the partitioning. Should be an indexed column in the SQL server, of any orderable type.

  • batch_mode (bool) – Whether to run the query in batch mode. If True, a generator is returned.

  • batch_size (int) – Batch size for the query; defaults to 1,000,000 rows.

  • dtype_backend (str) – Data type backend to use; defaults to 'pyarrow'.

Returns:

Query result.

Return type:

pandas.DataFrame or Generator[pandas.DataFrame, None, None]
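
The two return shapes (a full result versus a generator of batches) can be sketched with the standard library alone. The run_query function below is a hypothetical stand-in, not cycquery's implementation: it returns lists of tuples instead of DataFrames, and the visits table exists only for illustration.

```python
import sqlite3
from typing import Generator, List, Optional, Tuple, Union

Batch = List[Tuple]


def run_query(
    conn: sqlite3.Connection,
    sql: str,
    limit: Optional[int] = None,
    batch_mode: bool = False,
    batch_size: int = 1_000_000,
) -> Union[Batch, Generator[Batch, None, None]]:
    """Hypothetical stand-in for a run()-style method."""
    if limit is not None:
        # Wrap the query so the row limit applies to its result.
        sql = f"SELECT * FROM ({sql}) LIMIT {int(limit)}"
    cursor = conn.execute(sql)
    if not batch_mode:
        return cursor.fetchall()

    def batches() -> Generator[Batch, None, None]:
        # Yield at most batch_size rows at a time until the cursor is empty.
        while True:
            chunk = cursor.fetchmany(batch_size)
            if not chunk:
                return
            yield chunk

    return batches()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (id INTEGER PRIMARY KEY, ward TEXT)")
conn.executemany(
    "INSERT INTO visits (ward) VALUES (?)",
    [("A",), ("B",), ("C",), ("D",), ("E",)],
)

query = "SELECT id, ward FROM visits ORDER BY id"
all_rows = run_query(conn, query, limit=4)
batch_sizes = [len(b) for b in run_query(conn, query, batch_mode=True, batch_size=2)]
```

With batch mode on, the caller iterates over the result instead of receiving everything at once, which keeps memory use bounded by the batch size.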

save(path, file_format='parquet')

Save the query.

Parameters:
  • path (str) – Path where the file will be saved.

  • file_format (Literal['parquet', 'csv']) – File format of the file to save.

Returns:

Processed save path for upstream use.

Return type:

str

union(other)

Union the query with another query.

Parameters:

other (QueryInterface) – The other query to union with.

Returns:

QueryInterface object with the union operation added.

Return type:

QueryInterface

union_all(other)

Combine the query with another query using UNION ALL.

Parameters:

other (QueryInterface) – The other query to union all with.

Returns:

QueryInterface object with the union all operation added.

Return type:

QueryInterface
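
Assuming union and union_all follow standard SQL semantics (this page does not say so explicitly), the difference is that UNION removes duplicate rows while UNION ALL keeps them. A minimal sketch in plain SQL through the standard-library sqlite3 module, with hypothetical tables a and b:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (x INTEGER)")
conn.execute("CREATE TABLE b (x INTEGER)")
conn.executemany("INSERT INTO a VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO b VALUES (?)", [(2,), (3,)])

# UNION: the value 2 appears in both tables but only once in the result.
union_rows = sorted(
    conn.execute("SELECT x FROM a UNION SELECT x FROM b").fetchall()
)

# UNION ALL: every row from both inputs is kept, duplicates included.
union_all_rows = sorted(
    conn.execute("SELECT x FROM a UNION ALL SELECT x FROM b").fetchall()
)
```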