Evals
Data structures for fed_rag.evals
BenchmarkExample
¶
Bases: BaseModel
Benchmark example data class.
This class represents a single benchmark example.
Attributes:
Name | Type | Description |
---|---|---|
query(str) |
Query string. |
|
response(str) |
Response string. |
|
context(str|None) |
Context string. |
Source code in src/fed_rag/data_structures/evals.py
BenchmarkResult
¶
Bases: BaseModel
Benchmark result data class.
This class represents the result of a benchmark.
Attributes:
Name | Type | Description |
---|---|---|
score(float) |
Score of the benchmark example. |
|
metric_name(str) |
Name of the metric used for scoring. |
|
num_examples_used(int) |
Number of examples used for scoring. |
|
num_total_examples(int) |
Number of total examples in the benchmark. |
|
evaluations_file(str|None) |
Path to the evaluations file. |
Source code in src/fed_rag/data_structures/evals.py
BenchmarkEvaluatedExample
¶
Bases: BaseModel
Evaluated benchmark example data class.
This class represents an evaluated benchmark example.
Attributes:
Name | Type | Description |
---|---|---|
score(float) |
Score of the benchmark example. |
|
example(BenchmarkExample) |
Benchmark example. |
|
rag_response(RAGResponse) |
RAG response. |
Source code in src/fed_rag/data_structures/evals.py
model_dump_json_without_embeddings
¶
Generates and returns a JSON representation of the model excluding specific embedding-related data.
Source code in src/fed_rag/data_structures/evals.py
AggregationMode
¶
Bases: str
, Enum
Mode for aggregating evaluation scores.
This enum defines the available modes for aggregating multiple evaluation scores into a single value.
Attributes:
Name | Type | Description |
---|---|---|
AVG |
Calculates the arithmetic mean of the scores. |
|
SUM |
Calculates the sum of all scores. |
|
MAX |
Takes the maximum score value. |
|
MIN |
Takes the minimum score value. |