Base Data Collator

BaseDataCollator

Bases: BaseModel, ABC

Base Data Collator.

Abstract base class for collating input examples into batches that can be used by a retrieval-augmented generation (RAG) system.
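
To illustrate what "abstract base class" means in practice here, the following is a minimal sketch of the subclassing pattern. It is not the library's code: `CollatorSketch` is a hypothetical stand-in that mirrors only the `rag_system` field and the abstract `__call__`, built on `dataclasses` rather than pydantic so that it runs without `fed_rag` installed. `StubRAGSystem` and `ColumnCollator` are likewise invented for the example, as is the `keys` keyword argument.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any


@dataclass
class StubRAGSystem:
    """Hypothetical placeholder for fed_rag's RAGSystem (assumption)."""

    name: str = "stub"


@dataclass
class CollatorSketch(ABC):
    """Stand-in mirroring the BaseDataCollator interface, not the real class."""

    rag_system: StubRAGSystem

    @abstractmethod
    def __call__(self, features: list[dict[str, Any]], **kwargs: Any) -> Any:
        """Collate examples into a batch."""


@dataclass
class ColumnCollator(CollatorSketch):
    """Hypothetical subclass: turns a list of rows into a dict of columns."""

    def __call__(self, features: list[dict[str, Any]], **kwargs: Any) -> Any:
        # `keys` is an assumed, implementation-specific keyword argument.
        keys = kwargs.get("keys") or (list(features[0]) if features else [])
        return {key: [example[key] for example in features] for key in keys}


collator = ColumnCollator(rag_system=StubRAGSystem())
batch = collator(
    [
        {"query": "q1", "response": "r1"},
        {"query": "q2", "response": "r2"},
    ]
)
print(batch)  # {'query': ['q1', 'q2'], 'response': ['r1', 'r2']}
```

Instantiating `CollatorSketch` directly raises `TypeError` because `__call__` is abstract; only subclasses that implement it can be constructed. A real subclass would inherit from `BaseDataCollator` and receive an actual `RAGSystem` instead of the stub.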

Source code in src/fed_rag/base/data_collator.py
class BaseDataCollator(BaseModel, ABC):
    """
    Base Data Collator.

    Abstract base class for collating input examples into batches that can
    be used by a retrieval-augmented generation (RAG) system.
    """

    model_config = ConfigDict(arbitrary_types_allowed=True)
    rag_system: RAGSystem

    @abstractmethod
    def __call__(self, features: list[dict[str, Any]], **kwargs: Any) -> Any:
        """Collate examples into a batch.

        Args:
            features (list[dict[str, Any]]): A list of feature dictionaries,
                where each dictionary represents one example.
            **kwargs (Any): Additional keyword arguments that may be used
                by specific implementations.

        Returns:
            Any: A collated batch, with format depending on the implementation.
        """

__call__ abstractmethod

__call__(features, **kwargs)

Collate examples into a batch.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `features` | `list[dict[str, Any]]` | A list of feature dictionaries, where each dictionary represents one example. | required |
| `**kwargs` | `Any` | Additional keyword arguments that may be used by specific implementations. | `{}` |

Returns:

| Type | Description |
| --- | --- |
| `Any` | A collated batch, with format depending on the implementation. |
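
Because both `**kwargs` and the return type are left open, each implementation decides which options it accepts and what a batch looks like. As one possible illustration, the sketch below pads token ID lists to a common length. Everything in it is an assumption made for the example: the class `PaddingCollatorSketch`, the `input_ids` feature key, and the `pad_id` and `max_length` keyword arguments. It copies only the `__call__` signature and does not inherit from `BaseDataCollator`, so that it runs on its own.

```python
from typing import Any


class PaddingCollatorSketch:
    """Hypothetical collator that only mirrors the documented __call__ signature."""

    def __call__(self, features: list[dict[str, Any]], **kwargs: Any) -> Any:
        # Both keyword arguments are assumptions made for this example.
        pad_id = kwargs.get("pad_id", 0)
        longest = max((len(f["input_ids"]) for f in features), default=0)
        max_length = kwargs.get("max_length", longest)

        input_ids, attention_mask = [], []
        for example in features:
            ids = list(example["input_ids"])[:max_length]  # truncate if needed
            padding = max_length - len(ids)
            input_ids.append(ids + [pad_id] * padding)
            attention_mask.append([1] * len(ids) + [0] * padding)
        return {"input_ids": input_ids, "attention_mask": attention_mask}


features = [{"input_ids": [1, 2, 3]}, {"input_ids": [4]}]
collate = PaddingCollatorSketch()

padded = collate(features)
print(padded["input_ids"])  # [[1, 2, 3], [4, 0, 0]]

truncated = collate(features, max_length=2)
print(truncated["input_ids"])  # [[1, 2], [4, 0]]
```

Here the batch is a plain dictionary of nested lists; another implementation might return tensors or a custom batch object, which is why the base class declares the return type as `Any`.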

Source code in src/fed_rag/base/data_collator.py
@abstractmethod
def __call__(self, features: list[dict[str, Any]], **kwargs: Any) -> Any:
    """Collate examples into a batch.

    Args:
        features (list[dict[str, Any]]): A list of feature dictionaries,
            where each dictionary represents one example.
        **kwargs (Any): Additional keyword arguments that may be used
            by specific implementations.

    Returns:
        Any: A collated batch, with format depending on the implementation.
    """