Tokenizers
Base Tokenizer
EncodeResult
BaseTokenizer
Bases: BaseModel, ABC
Base Tokenizer Class.
This abstract class defines the interface for Tokenizer objects, which convert strings into tokens.
Source code in src/fed_rag/base/tokenizer.py
encode (abstractmethod)
Encode the input string into a list of integers.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input | str | The input string to be encoded. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| EncodeResult | EncodeResult | The result of encoding. |
Source code in src/fed_rag/base/tokenizer.py
decode (abstractmethod)
Decode the input token ids into a string.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_ids | list[int] | The token ids to be decoded back to text. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | The decoded text. |
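To illustrate how the two abstract methods fit together, here is a minimal sketch of a concrete tokenizer. It does not import fed_rag: the `BaseTokenizer` and `EncodeResult` defined here are simplified stand-ins built on the standard library only (no pydantic), and the `input_ids` field on `EncodeResult` is an assumption, since this page does not list the type's fields. `WhitespaceTokenizer` is a hypothetical toy class, not part of the library.

```python
from abc import ABC, abstractmethod
from typing import TypedDict


class EncodeResult(TypedDict):
    """Stand-in result type; the `input_ids` field name is an assumption."""

    input_ids: list[int]


class BaseTokenizer(ABC):
    """Stand-in mirroring the documented interface (stdlib only)."""

    @abstractmethod
    def encode(self, input: str) -> EncodeResult:
        """Encode the input string into a list of integers."""

    @abstractmethod
    def decode(self, input_ids: list[int]) -> str:
        """Decode the input token ids into a string."""


class WhitespaceTokenizer(BaseTokenizer):
    """Hypothetical toy tokenizer: one id per whitespace-separated word."""

    def __init__(self) -> None:
        self._word_to_id: dict[str, int] = {}
        self._id_to_word: list[str] = []

    def encode(self, input: str) -> EncodeResult:
        ids: list[int] = []
        for word in input.split():
            # Assign the next free id the first time a word is seen.
            if word not in self._word_to_id:
                self._word_to_id[word] = len(self._id_to_word)
                self._id_to_word.append(word)
            ids.append(self._word_to_id[word])
        return {"input_ids": ids}

    def decode(self, input_ids: list[int]) -> str:
        # Look each id up in the vocabulary and rejoin with single spaces.
        return " ".join(self._id_to_word[i] for i in input_ids)


tokenizer = WhitespaceTokenizer()
result = tokenizer.encode("to be or not to be")
text = tokenizer.decode(result["input_ids"])
```

Because both methods are abstract, a subclass that omits either one cannot be instantiated. That is the main design benefit of the base class: every tokenizer is guaranteed to support both directions.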