mmlearn.modules.encoders.text module

Hugging Face text encoder model.

class HFTextEncoder(model_name_or_path, pretrained=True, pooling_layer=None, freeze_layers=False, freeze_layer_norm=True, peft_config=None, model_config_kwargs=None)[source]

Bases: Module

Wrapper around Hugging Face models that can be loaded through the AutoModelForTextEncoding class.

Parameters:
  • model_name_or_path (str) – The Hugging Face model name or a local path from which to load the model.

  • pretrained (bool, default=True) – Whether to load the pretrained weights.

  • pooling_layer (Optional[torch.nn.Module], optional, default=None) – Pooling layer to apply to the last hidden state of the model.

  • freeze_layers (Union[int, float, list[int], bool], default=False) – Whether to freeze layers of the model, and which ones. If True, all model layers are frozen. If an integer N, the first N layers are frozen. If a float, the first N percent of the layers are frozen. If a list of integers, the layers at those indices are frozen.

  • freeze_layer_norm (bool, default=True) – Whether to freeze the layer normalization layers of the model.

  • peft_config (Optional[PeftConfig], optional, default=None) – A configuration from the peft library used to wrap the model for parameter-efficient fine-tuning.

  • model_config_kwargs (Optional[dict[str, Any]], optional, default=None) – Additional keyword arguments to pass to the model configuration.
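
The accepted forms of freeze_layers can be easier to compare side by side in code. The following is a minimal sketch, not the library's implementation: resolve_frozen_layers is a hypothetical helper, num_layers is an assumed argument, and the float case assumes the value is a fraction between 0 and 1 that is rounded down, which the description above does not confirm.

```python
from typing import List, Union


def resolve_frozen_layers(
    freeze_layers: Union[int, float, List[int], bool], num_layers: int
) -> List[int]:
    """Return the indices of the layers that would be frozen (hypothetical helper)."""
    # bool is checked first because bool is a subclass of int in Python.
    if isinstance(freeze_layers, bool):
        return list(range(num_layers)) if freeze_layers else []
    if isinstance(freeze_layers, int):
        # First N layers, capped at the number of layers available.
        return list(range(min(freeze_layers, num_layers)))
    if isinstance(freeze_layers, float):
        # Assumption: the float is a fraction in [0, 1], rounded down.
        return list(range(int(freeze_layers * num_layers)))
    # Otherwise treat the value as an explicit list of layer indices.
    return list(freeze_layers)
```

Under these assumptions, resolve_frozen_layers(0.5, 12) selects layers 0 through 5, and resolve_frozen_layers([1, 3], 12) selects only layers 1 and 3.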

Raises:

ValueError – If the model is a decoder model or if freezing individual layers is not supported for the model type.

Warns:

UserWarning – If both peft_config and freeze_layers are set. In that case, peft_config overrides the freeze_layers setting.

forward(inputs)[source]

Run the forward pass.

Parameters:

inputs (dict[str, Any]) – The input data. The input_ids are expected under the Modalities.TEXT key.

Returns:

The output of the model, including the last hidden state, all hidden states and, if output_attentions is set to True, the attention weights.

Return type:

BaseModelOutput
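
The pooling_layer argument of HFTextEncoder is applied to the last hidden state. As an illustration, a module like the one below could be passed for that argument. This is a minimal sketch: MeanPooling is a hypothetical class, and it assumes the layer is called with a single tensor of shape (batch, sequence length, hidden size), which this page does not state.

```python
import torch
from torch import nn


class MeanPooling(nn.Module):
    """Average token embeddings over the sequence dimension (hypothetical example)."""

    def forward(self, last_hidden_state: torch.Tensor) -> torch.Tensor:
        # Assumed input shape: (batch, seq_len, hidden).
        # Output shape: (batch, hidden).
        return last_hidden_state.mean(dim=1)
```

This sketch averages over every position, padding tokens included. A pooling layer that should ignore padding would also need access to the attention mask.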