evaluation dataset
A set of representative examples used to measure model, agent, retrieval, or workflow performance before and after changes.
The evaluation dataset came from real tickets, stripped of sensitive fields.
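As a rough illustration of "before and after changes," the sketch below scores two versions of a ticket classifier against the same small evaluation dataset. Everything here is hypothetical: the example tickets, the labels, and the `baseline` and `candidate` functions are invented for illustration and do not come from any real system.

```python
from typing import Callable, List, Tuple

# Hypothetical evaluation dataset: (ticket text, expected label) pairs.
# A real one would be drawn from representative, de-identified examples.
EVAL_DATASET: List[Tuple[str, str]] = [
    ("I was charged twice this month", "billing"),
    ("The app crashes when I log in", "bug"),
    ("How do I reset my password?", "account"),
    ("Please refund my last payment", "billing"),
]


def accuracy(predict: Callable[[str], str],
             dataset: List[Tuple[str, str]]) -> float:
    """Return the fraction of examples where the prediction matches the label."""
    correct = sum(1 for text, expected in dataset if predict(text) == expected)
    return correct / len(dataset)


def baseline(text: str) -> str:
    """Hypothetical 'before' system: recognizes only one billing keyword."""
    return "billing" if "charged" in text.lower() else "account"


def candidate(text: str) -> str:
    """Hypothetical 'after' system: adds refund and crash keywords."""
    lowered = text.lower()
    if "charged" in lowered or "refund" in lowered:
        return "billing"
    if "crash" in lowered:
        return "bug"
    return "account"


if __name__ == "__main__":
    # Score both versions on the same fixed dataset so the numbers are comparable.
    print("baseline:", accuracy(baseline, EVAL_DATASET))
    print("candidate:", accuracy(candidate, EVAL_DATASET))
```

Keeping the dataset fixed between runs is what makes the two scores comparable; if the examples change along with the system, a difference in score can no longer be attributed to the change being tested.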