2026-03-12 · Hana Moriyama

Evaluation cards that internal assistants actually use

How we compress rubrics into single-page cards so reviewers ship consistent verdicts under time pressure.

[Image: Desk with printed evaluation sheets and a single fountain pen]

Internal assistants fail quietly: latency looks fine while quality drifts. We borrowed a print-editorial habit—single-page evaluation cards—and adapted them for bilingual review floors in Tokyo and remote pods.

Each card anchors on one scenario slice: inputs, expected guardrails, and a severity ladder. Reviewers annotate with short codes instead of paragraphs, which keeps telemetry searchable without turning the queue into a blog.
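As a rough illustration of that shape, here is a minimal sketch of a card and its short-code annotations. Every name in it (`EvaluationCard`, `Annotation`, `Severity`, the rung names, and the codes `G1` and `T2`) is hypothetical and chosen for the example; it is not our production schema.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical severity ladder; the rung names are illustrative."""
    COSMETIC = 1
    DEGRADED = 2
    BLOCKING = 3


@dataclass
class EvaluationCard:
    card_id: str
    scenario: str                # the one scenario slice this card covers
    inputs: tuple[str, ...]      # representative inputs for the slice
    guardrails: tuple[str, ...]  # behaviours the assistant is expected to hold
    codes: dict[str, Severity] = field(default_factory=dict)  # short code -> rung


@dataclass
class Annotation:
    card_id: str
    reviewer: str
    code: str  # a short code from the card, instead of a free-text paragraph


def worst_severity(card, annotations):
    """Return the highest severity annotated against this card, or None."""
    hits = [
        card.codes[a.code]
        for a in annotations
        if a.card_id == card.card_id and a.code in card.codes
    ]
    return max(hits, default=None)


card = EvaluationCard(
    card_id="refund-ja-01",
    scenario="refund request written in Japanese",
    inputs=("sample refund message",),
    guardrails=("does not invent policy",),
    codes={"G1": Severity.BLOCKING, "T2": Severity.COSMETIC},
)
notes = [
    Annotation("refund-ja-01", "reviewer-a", "T2"),
    Annotation("refund-ja-01", "reviewer-b", "G1"),
]
print(worst_severity(card, notes).name)  # BLOCKING
```

Because each annotation is a code rather than prose, a query such as "every G1 on this card in the last week" is a simple filter instead of a text search.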

Cards also need maintenance. A card expires when the underlying model or tool surface changes; we tie card versions to deployment tags so that a rollback is legible to operations coordinators, not only engineers.
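One way to picture the link between card versions and deployment tags is the sketch below. It assumes each card version is pinned to exactly one tag; `CardVersion`, `active_cards`, and the tag strings are hypothetical names for the example, not our actual tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CardVersion:
    card_id: str
    version: int
    deployment_tag: str  # hypothetical tag format, e.g. "assistant-2026.03.1"


def active_cards(versions, live_tag):
    """Return the newest version of each card pinned to the live deployment tag.

    A card with no version pinned to `live_tag` is treated as expired and omitted.
    """
    newest = {}
    for v in versions:
        if v.deployment_tag != live_tag:
            continue
        current = newest.get(v.card_id)
        if current is None or v.version > current.version:
            newest[v.card_id] = v
    return newest


history = [
    CardVersion("refund-ja-01", 3, "assistant-2026.02.4"),
    CardVersion("refund-ja-01", 4, "assistant-2026.03.1"),
    CardVersion("calendar-en-02", 1, "assistant-2026.02.4"),
]

# After a rollback, asking for the previous tag yields the cards that applied then.
rolled_back = active_cards(history, "assistant-2026.02.4")
print(sorted(rolled_back))  # ['calendar-en-02', 'refund-ja-01']
```

Under this assumption a rollback needs no card edits: operations changes the live tag and the matching card set follows.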

We publish anonymized diffs between card versions so cohort alumni can see why a rubric tightened. That transparency has been cited in procurement packets more often than any marketing line we have written.
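A published diff of this kind can be approximated with Python's standard `difflib`. The sketch below assumes cards are stored as plain text and that reviewer handles appear as `@name`; both the storage format and the redaction rule are assumptions made for the example.

```python
import difflib
import re


def anonymize(text):
    """Replace reviewer handles with a placeholder (assumes an @name format)."""
    return re.sub(r"@\w+", "@reviewer", text)


def card_diff(old_text, new_text, old_label, new_label):
    """Return an anonymized unified diff between two card versions."""
    lines = difflib.unified_diff(
        anonymize(old_text).splitlines(),
        anonymize(new_text).splitlines(),
        fromfile=old_label,
        tofile=new_label,
        lineterm="",
    )
    return "\n".join(lines)


old = "Guardrail: cite policy\nSeverity G1: degraded (@mori)"
new = "Guardrail: cite policy\nSeverity G1: blocking (@sato)"
print(card_diff(old, new, "card v3", "card v4"))
```

The output shows the tightened rung as one removed line and one added line, with reviewer handles redacted before the comparison runs.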