Tokenizers
Base Tokenizer
EncodeResult
BaseTokenizer
Bases: BaseModel, ABC
Base Tokenizer Class.
This abstract class defines the interface for Tokenizer objects, which convert strings into tokens and token ids back into strings.
Source code in src/fed_rag/base/tokenizer.py
encode (abstractmethod)
Encode the input string into a list of integers.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input | str | The input string to be encoded. | required |
Returns:
Name | Type | Description |
---|---|---|
EncodeResult | EncodeResult | The result of encoding. |
Source code in src/fed_rag/base/tokenizer.py
decode (abstractmethod)
Decode the input token ids into a string.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_ids | list[int] | The token ids to be decoded back to text. | required |
Returns:
Name | Type | Description |
---|---|---|
str | str | The decoded text. |
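To make the encode/decode contract concrete, here is a minimal sketch of a subclass. It does not import fed_rag: the `BaseTokenizer` and `EncodeResult` definitions below are stand-ins that only mirror the signatures documented on this page (the real base class also derives from pydantic's `BaseModel`). The `input_ids` field on `EncodeResult` and the `WhitespaceTokenizer` class are assumptions made for illustration, since this page does not list the fields of `EncodeResult`.

```python
from abc import ABC, abstractmethod
from typing import TypedDict


class EncodeResult(TypedDict):
    """Hypothetical stand-in: the `input_ids` field is an assumption."""

    input_ids: list[int]


class BaseTokenizer(ABC):
    """Stand-in mirroring the documented interface (not the real fed_rag class)."""

    @abstractmethod
    def encode(self, input: str) -> EncodeResult:
        """Encode the input string."""

    @abstractmethod
    def decode(self, input_ids: list[int]) -> str:
        """Decode the token ids back into a string."""


class WhitespaceTokenizer(BaseTokenizer):
    """Illustrative tokenizer that assigns ids to whitespace-separated words."""

    def __init__(self) -> None:
        self._token_to_id: dict[str, int] = {}
        self._id_to_token: list[str] = []

    def encode(self, input: str) -> EncodeResult:
        ids: list[int] = []
        for token in input.split():
            # Assign the next free id the first time a token is seen.
            if token not in self._token_to_id:
                self._token_to_id[token] = len(self._id_to_token)
                self._id_to_token.append(token)
            ids.append(self._token_to_id[token])
        return {"input_ids": ids}

    def decode(self, input_ids: list[int]) -> str:
        # Map each id back to its token and rejoin with single spaces.
        return " ".join(self._id_to_token[i] for i in input_ids)


tokenizer = WhitespaceTokenizer()
result = tokenizer.encode("to be or not to be")
text = tokenizer.decode(result["input_ids"])
```

Because both methods are abstract, the base class cannot be instantiated directly; a subclass has to implement `encode` and `decode` before it can be used. In this sketch the round trip is lossless only for input whose words are separated by single spaces.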