2026-03-12 · Hana Moriyama
Evaluation cards that internal assistants actually use
How we compress rubrics into single-page cards so reviewers ship consistent verdicts under time pressure.
Internal assistants fail quietly: latency looks fine while quality drifts. We borrowed a print-editorial habit, the single-page evaluation card, and adapted it for bilingual review floors in Tokyo and remote pods.
Each card anchors on one scenario slice: inputs, expected guardrails, and a severity ladder. Reviewers annotate with short codes instead of paragraphs, which keeps telemetry searchable without turning the queue into a blog.
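As a rough illustration of that shape, here is a minimal sketch of a card and a reviewer annotation. Every name in it (`EvaluationCard`, `Annotation`, the three-rung `Severity` ladder, the example codes) is hypothetical and chosen for the example; our production schema is not shown here.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Optional


class Severity(IntEnum):
    # Hypothetical three-rung ladder; a real ladder may have more rungs.
    MINOR = 1
    MAJOR = 2
    BLOCKER = 3


@dataclass(frozen=True)
class EvaluationCard:
    # One card covers exactly one scenario slice.
    scenario: str
    inputs: tuple[str, ...]
    guardrails: tuple[str, ...]
    # Short reviewer code -> severity rung it maps to.
    codes: dict[str, Severity]


@dataclass
class Annotation:
    card: EvaluationCard
    codes: list[str] = field(default_factory=list)

    def add(self, code: str) -> None:
        # Reject free-text: only codes printed on the card are allowed,
        # which is what keeps the resulting telemetry searchable.
        if code not in self.card.codes:
            raise ValueError(f"unknown code {code!r} for card {self.card.scenario!r}")
        self.codes.append(code)

    def verdict(self) -> Optional[Severity]:
        # The verdict is the highest rung among the recorded codes,
        # or None when the reviewer flagged nothing.
        return max((self.card.codes[c] for c in self.codes), default=None)


card = EvaluationCard(
    scenario="expense-policy lookup",
    inputs=("user question", "policy excerpt"),
    guardrails=("cite the policy section", "no invented amounts"),
    codes={"NC": Severity.MAJOR, "FA": Severity.BLOCKER, "TN": Severity.MINOR},
)
note = Annotation(card)
note.add("TN")
note.add("NC")
```

Restricting annotations to the codes on the card is the design choice that matters: a query for one code finds every matching verdict without any text parsing.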
Cards also need maintenance. They expire when the underlying model or tool surface changes; we tie card versions to deployment tags so that a rollback is legible to operations coordinators, not only engineers.
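One way to picture the binding between card versions and deployment tags is the sketch below. The `CardVersion` record, the `active_cards` helper, and the tag strings are assumptions made for illustration, not a description of our tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CardVersion:
    card_id: str
    version: int
    # Hypothetical tag format; any stable deployment identifier would do.
    deployment_tag: str


def active_cards(versions, live_tag):
    """Return, per card_id, the highest card version bound to live_tag.

    Versions bound to any other tag are treated as expired for this deployment.
    """
    current = {}
    for v in versions:
        if v.deployment_tag != live_tag:
            continue
        best = current.get(v.card_id)
        if best is None or v.version > best.version:
            current[v.card_id] = v
    return current


history = [
    CardVersion("expense-lookup", 1, "deploy-a"),
    CardVersion("expense-lookup", 2, "deploy-b"),
    CardVersion("travel-booking", 1, "deploy-a"),
]

# After a rollback from deploy-b to deploy-a, the same lookup with the
# older tag returns the cards that were in force at that deployment.
after_rollback = active_cards(history, "deploy-a")
```

Because the lookup is keyed on the tag alone, a coordinator only has to know which deployment is live to know which cards apply.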
We publish anonymized diffs between card versions so cohort alumni can see why a rubric tightened. That transparency has been cited in procurement packets more often than any marketing line we have written.
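For a sense of what such a diff can look like, here is a small sketch using the `difflib` module from the Python standard library. The card text and version labels are invented for the example, and the anonymization step is out of scope here: the sketch assumes the card text has already been scrubbed.

```python
import difflib


def card_diff(old_text, new_text, old_label, new_label):
    """Return a unified diff between two card versions as a list of lines."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=old_label,
            tofile=new_label,
            lineterm="",
        )
    )


# Hypothetical card text: the guardrail is unchanged, the severity tightens.
old_card = "guardrail: cite the policy section\nseverity: MAJOR"
new_card = "guardrail: cite the policy section\nseverity: BLOCKER"

diff = card_diff(old_card, new_card, "card v1", "card v2")
print("\n".join(diff))
```

A line-based diff suits single-page cards well, since each guardrail or ladder rung tends to sit on its own line.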