Concrete workplace scenario
A manager says: “We approved a reusable prompt case. How will we know whether it actually improves the work?”
The obvious answer is to list impressive metrics. The trap is that a metric is useless unless it has evidence, an owner, a cadence, and a decision rule.
Worked answer
Bad example first: the plausible shortcut fails
Weak example
“Track engagement, quality, time saved, and user feedback.”
What goes wrong
The weak version names vague labels such as engagement but gives no evidence location, no owner, and no decision rule.
Work-ready version: use the day rule
Signal
What will show the work changed?
Evidence
Where will the proof come from?
Owner
Who updates and reviews it?
Decision rule
What result changes the next action?
Quick decision check
The metric must be able to change a decision
Use this rule before filling the production table. It turns today's lesson from abstract AI advice into a work-ready habit.
Production workflow trace
Before / after: what changed in your work?
Before learning
After learning
What AI-in-workflow means today
Manager request
Your manager does not need a polished-looking document. They need a usable deliverable: a small metric tracking plan that says what to track, where evidence comes from, who updates it, when it is reviewed, and what result changes the next action.
Concrete workplace scenario
A team is testing whether reusable prompt cases from D14 improve customer response drafts and internal status briefs. D15 turns that library into a tracking plan the manager can actually review.
Artifact handoff
Previous artifact
D14 manager-readable prompt case library
Next artifact
D16 evidence_pack.md
Learning checkpoint: Use the previous artifact as working input, produce today's artifact, and hand it forward with evidence, limitation, owner, and repair needs visible.
Metric plan pressure test
A manager asks, 'How will we know this deliverable worked?' A weak answer lists tasks completed. A useful answer names reader, decision, metric signal, evidence location, owner, cadence, and a done standard.
Repair move
Weak: 'Track engagement.' Repaired: 'Support lead checks reply rate and escalations every Friday; pass if there are fewer than three unclear-owner replies.'
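The repaired rule is specific enough to write down as a check. The sketch below is one possible reading, not a required tool: the function name `friday_review` is hypothetical, the threshold of three comes from the example above, and the "repair" outcome is an assumption, since the example only says what counts as a pass.

```python
def friday_review(unclear_owner_replies: int, threshold: int = 3) -> str:
    """Apply the example decision rule to one week's count.

    Returns "pass" when the count is below the threshold.
    The "repair" label for every other case is an assumption.
    """
    if unclear_owner_replies < 0:
        raise ValueError("count cannot be negative")
    return "pass" if unclear_owner_replies < threshold else "repair"


print(friday_review(2))
print(friday_review(3))
```

A count and a threshold make the rule something the owner can apply the same way every Friday, which is what a vague label like engagement cannot offer.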
Deliverable anatomy check
A deliverable is usable when five anchors are visible: reader, decision/use, form, evidence, and done standard.
| Anchor | Status | Evidence or repair note |
|---|---|---|
| Reader | | |
| Decision/use | | |
| Form | | |
| Evidence | | |
| Done standard | | |
Weak vs work-ready deliverable
Weak output
“Here are possible metrics: views, clicks, signups, feedback, engagement.” It looks useful, but nobody knows the evidence source, the owner, the cadence, or which action changes.
Work-ready output
“For the 7-day prompt-case test, track customer draft revision count, manager correction count, and time to first usable brief. Review after five samples.”
Build metric tracking plan
Core path: define three metrics. Deep path: define five and include a decision rule for each.
| No. | Metric / evidence signal | Reader | Evidence used | Owner | Evidence readiness | D16 evidence question | Cadence | Decision rule / done standard |
|---|---|---|---|---|---|---|---|---|
| 1 | | | | | | | | |
| 2 | | | | | | | | |
| 3 | | | | | | | | |
Redline / quality gate
Do not call a deliverable complete if it is missing any of these: reader, decision/use, evidence used, owner, cadence, or done standard.
Passes when
Each metric has evidence, an owner, a cadence, and a decision use.
Repair if
The plan is a pile of impressive metrics without collectable evidence or a decision rule.
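If your metric rows live in a spreadsheet or script, the gate can be run as a quick check. This is a minimal sketch under stated assumptions: the field names in `REQUIRED_FIELDS` are hypothetical keys that mirror the quality gate list, and the values in `repaired` are illustrative placeholders, not real evidence.

```python
# Hypothetical keys that mirror the quality gate above.
REQUIRED_FIELDS = (
    "reader", "decision_use", "evidence_used",
    "owner", "cadence", "done_standard",
)


def missing_fields(row: dict) -> list:
    """List the required fields that are absent or blank in one metric row."""
    return [f for f in REQUIRED_FIELDS if not str(row.get(f) or "").strip()]


def passes_gate(row: dict) -> bool:
    """A row passes only when no required field is missing."""
    return not missing_fields(row)


weak = {"metric": "engagement"}
repaired = {
    "metric": "customer draft revision count",
    "reader": "manager",                      # illustrative value
    "decision_use": "continue or repair",     # illustrative value
    "evidence_used": "reviewed draft log",    # illustrative value
    "owner": "support lead",
    "cadence": "every Friday",
    "done_standard": "five samples need fewer than two manager corrections",
}

print(missing_fields(weak))
print(passes_gate(repaired))
```

The check only proves that the fields are filled in. Whether the evidence is actually collectable still needs a human review.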
D16 evidence-pack handoff
Evidence needs for D16
Use the metric rows above as a per-metric evidence-needs list for D16. D16 should collect or verify evidence access, owner, status, and the open question for each metric.
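One way to picture the handoff is to turn each metric row into an evidence-needs entry. The sketch below assumes hypothetical key names (`evidence_used`, `d16_question`) and a default status of "to verify"; D16 may define its own format.

```python
def evidence_needs(rows: list) -> list:
    """Build one evidence-needs entry per metric row for the D16 handoff."""
    needs = []
    for row in rows:
        needs.append({
            "metric": row["metric"],
            "evidence_access": row.get("evidence_used", ""),
            "owner": row.get("owner", ""),
            "status": "to verify",  # assumed default until D16 checks access
            "open_question": row.get("d16_question", ""),
        })
    return needs


# Illustrative row only; the evidence source and question are placeholders.
rows = [{
    "metric": "manager correction count",
    "evidence_used": "reviewed draft log",
    "owner": "support lead",
    "d16_question": "Can the owner access the log each week?",
}]

for need in evidence_needs(rows):
    print(need)
```

Keeping one entry per metric makes it easy to see which rows still have no owner or no open question before you hand the plan forward.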
Next action
Repair one weak row
Use this pass to find the row that could confuse someone, create unsafe AI use, or fail as workplace evidence before you export.
Repair example
Weak row: Metric: engagement.
Repair move: Add evidence source, owner, cadence, and done standard.
Better row: Support lead checks customer draft revision count every Friday; continue if five samples need fewer than two manager corrections.
Choose the row you would be comfortable showing to a manager. It should have a clear task, boundary, owner, and next action.
Choose the row most likely to fail, confuse someone, expose sensitive input, or need review.
Write the exact change. Sentence starter: The weak row is weak because ___. I will repair it by adding ___ before using AI.
Export artifact
This file is today’s work receipt. It should show what changed in your work habit, not just that you filled a table.
Before you export
- Does the metric name a real signal, not a vanity label?
- Does it include evidence source, owner, and cadence?
- Does the done standard say what decision changes next?
What this artifact proves: This file proves you can turn a prompt case into a trackable metric plan.
Weak export: broad metrics like engagement with no source, owner, cadence, or decision rule.
Good export: one metric tied to evidence source, owner, review cadence, and a done standard that changes the next action.
Click Generate Markdown to create metric_tracking_plan_pack.md.