eval rubric
The scoring criteria used to judge whether model or agent output is good enough for the use case — defined by what the customer actually cares about, not by fluency.
The eval rubric penalized answers without cited policy sections.
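
One way to picture a rubric is as a small set of weighted, checkable criteria. The sketch below is illustrative only: the criterion names, the weights, and the citation pattern ("Section 4.2" or "§4.2") are assumptions made for the example, not part of any standard or real policy format.

```python
import re
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Criterion:
    """One rubric line: a name, a weight, and a pass/fail check on the answer text."""
    name: str
    weight: float
    check: Callable[[str], bool]


# Hypothetical citation format: "Section 4.2" or "§4.2".
POLICY_CITATION = re.compile(r"(?:Section\s+|§\s*)\d+(?:\.\d+)*")


def cites_policy_section(answer: str) -> bool:
    """True if the answer contains at least one policy-section citation."""
    return POLICY_CITATION.search(answer) is not None


def is_non_empty(answer: str) -> bool:
    """True if the answer contains anything besides whitespace."""
    return bool(answer.strip())


# Illustrative weights: the citation requirement dominates because, in this
# made-up use case, that is what the customer cares about most.
RUBRIC = (
    Criterion("cites_policy_section", 0.75, cites_policy_section),
    Criterion("non_empty", 0.25, is_non_empty),
)


def score(answer: str, rubric: Sequence[Criterion] = RUBRIC) -> float:
    """Return the weighted fraction of rubric criteria the answer satisfies (0.0 to 1.0)."""
    total = sum(c.weight for c in rubric)
    earned = sum(c.weight for c in rubric if c.check(answer))
    return earned / total
```

Under these assumed weights, a fluent answer that cites no policy section earns only a quarter of the available score, which mirrors the example sentence above.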