golden dataset

A curated set of examples with trusted expected outcomes — used to test model or workflow behavior and catch regressions. Built from real cases, not invented ones.
The golden dataset included the weird edge cases operators cared about.