mmlearn.datasets.processors.transforms.TrimText

class TrimText(trim_size)

Bases: object

Trim text strings as a preprocessing step before tokenization.

Parameters:

trim_size (int) – The maximum length of the trimmed text.

Methods

__call__(sentence)

Trim the given sentence(s).

Parameters:

sentence (Union[str, list[str]]) – Sentence(s) to be trimmed.

Returns:

Trimmed sentence(s).

Return type:

Union[str, list[str]]

Raises:

TypeError – If the input sentence is not a string or list of strings.
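Example

The following is a minimal sketch of the behaviour documented above, not the mmlearn implementation. It uses a hypothetical stand-in class, TrimTextSketch, so that it runs without mmlearn installed. It assumes that trim_size counts characters and that trimming keeps the leading characters of each string; neither detail is stated on this page.

```python
from __future__ import annotations

from typing import Union


class TrimTextSketch:
    """Hypothetical stand-in that mirrors the documented TrimText interface."""

    def __init__(self, trim_size: int) -> None:
        self.trim_size = trim_size

    def __call__(self, sentence: Union[str, list[str]]) -> Union[str, list[str]]:
        if isinstance(sentence, str):
            # Assumption: length is measured in characters, keeping the prefix.
            return sentence[: self.trim_size]
        if isinstance(sentence, list) and all(isinstance(s, str) for s in sentence):
            # Trim each string independently and return a new list.
            return [s[: self.trim_size] for s in sentence]
        # Documented behaviour: anything else raises TypeError.
        raise TypeError("Expected a string or a list of strings.")


trim = TrimTextSketch(trim_size=5)
single = trim("hello world")                   # one string in, one string out
batch = trim(["short", "a longer sentence"])   # list in, list out
print(single)
print(batch)
```

With the real class, the equivalent call would presumably be TrimText(trim_size=5)(...), applied to raw text before it reaches the tokenizer.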