AI Automation 8 min read

Calibrating the editorial agent

How we tune brand voice without losing the strategist's hand: the calibration loop behind every Coffee Reads piece.

The Content News Agent

with Editorial · Goldenscope

The most common failure mode of an AI content stack is not hallucination. It is flatness: the slow drift toward a center-of-mass voice that sounds like every other LinkedIn post written this year. Calibration is the practice of pulling a model back from that center, on purpose, every week.


§

What calibration actually means

Calibration is not prompt engineering. A prompt is a single instruction; calibration is a loop. Each week the Content News Agent ships drafts, a senior editor marks the deltas between draft and final, and those deltas become the next week's reference set. Voice is taught by correction, not by description.

§

The calibration loop, in five moves

1. Reference set

Twelve to twenty pieces of writing, published or internal, that represent the brand at its best. Not aspirational. Actual.

2. Anti-reference set

Six to ten pieces that look brand-adjacent but are wrong. This is the part most teams skip. Showing the model what 'almost right' looks like is more powerful than another example of right.

3. Draft → diff → digest

Every Friday, the editor's diffs from the week's drafts are compressed into a one-page voice digest. The digest goes back into the system prompt for the next batch.
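
As a rough illustration of this step, here is a minimal Python sketch using the standard library's difflib. It assumes each draft and its edited final are available as plain strings, and that the digest is simply a ranked list of the corrections the editor made most often. The function names (editor_deltas, voice_digest) and the top_n cutoff are hypothetical, not part of the Content News Agent itself.

```python
import difflib
from collections import Counter


def editor_deltas(draft: str, final: str) -> list[tuple[str, str]]:
    """Return (before, after) word spans for everything the editor changed."""
    draft_words = draft.split()
    final_words = final.split()
    matcher = difflib.SequenceMatcher(a=draft_words, b=final_words, autojunk=False)
    deltas = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged text carries no voice signal
        before = " ".join(draft_words[i1:i2])
        after = " ".join(final_words[j1:j2])
        deltas.append((before, after))
    return deltas


def voice_digest(pairs: list[tuple[str, str]], top_n: int = 10) -> str:
    """Compress a week's (draft, final) pairs into a short list of repeated corrections.

    Assumption: frequency is a usable proxy for importance. A real digest
    would likely be curated by the editor rather than ranked by count alone.
    """
    counts: Counter = Counter()
    for draft, final in pairs:
        counts.update(editor_deltas(draft, final))

    lines = ["Voice digest (most frequent editor corrections):"]
    for (before, after), n in counts.most_common(top_n):
        if not before:
            lines.append(f'- add "{after}" ({n}x)')
        elif not after:
            lines.append(f'- cut "{before}" ({n}x)')
        else:
            lines.append(f'- "{before}" -> "{after}" ({n}x)')
    return "\n".join(lines)


week = [
    ("We leverage data", "We use data"),
    ("They leverage tools", "They use tools"),
]
print(voice_digest(week))
```

The resulting text is small enough to paste into the next batch's system prompt, which is the point of compressing a week of diffs into one page.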

4. Quarterly recalibration

Once a quarter, the full reference set is re-scored. Pieces that no longer represent the brand are retired. New ones are promoted in.
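
One way to picture the quarterly pass is the Python sketch below. It assumes every piece carries an editor-assigned fit score between 0.0 and 1.0. The Piece type, the 0.7 threshold, and the cap of twenty pieces (taken from the upper bound of the reference set described above) are illustrative assumptions, not a description of the actual scoring rubric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Piece:
    title: str
    score: float  # hypothetical editor-assigned fit score, 0.0 to 1.0


def recalibrate(
    reference: list[Piece],
    candidates: list[Piece],
    keep_threshold: float = 0.7,  # assumed cutoff, tune per brand
    max_size: int = 20,           # upper bound of the reference set
) -> tuple[list[Piece], list[Piece]]:
    """Return (new_reference_set, retired_pieces) after re-scoring."""
    kept = [p for p in reference if p.score >= keep_threshold]
    retired = [p for p in reference if p.score < keep_threshold]

    # Promote the strongest candidates first, only while there is room.
    eligible = sorted(
        (c for c in candidates if c.score >= keep_threshold),
        key=lambda p: p.score,
        reverse=True,
    )
    room = max(0, max_size - len(kept))
    return kept + eligible[:room], retired


reference = [Piece("Launch essay", 0.9), Piece("Old product FAQ", 0.5)]
candidates = [Piece("Founder letter", 0.8), Piece("Press boilerplate", 0.6)]
new_set, retired = recalibrate(reference, candidates)
print([p.title for p in new_set], [p.title for p in retired])
```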

5. The human final ten percent

No piece ships without a human pass. Not for safety theatre, but for the final ten percent of judgment that no model gets right alone.

§

What to expect in the first 90 days

  • Weeks 1 to 3: drafts feel close but generic. Editor effort is high.
  • Weeks 4 to 8: voice starts holding across topics. Editor effort drops 40 to 60%.
  • Weeks 9 to 12: drafts become a credible first pass. Editing becomes surgical.
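
To track a trend like this, editor effort needs a number attached to it. The Python sketch below shows one possible proxy, under a loud assumption: effort is approximated by the share of a draft's words the editor changed, measured with difflib. How effort is actually measured is not specified here, and edit_share and effort_drop are hypothetical names.

```python
import difflib


def edit_share(draft: str, final: str) -> float:
    """Approximate editor effort as 1 minus word-level similarity (0.0 = untouched)."""
    matcher = difflib.SequenceMatcher(a=draft.split(), b=final.split(), autojunk=False)
    return 1.0 - matcher.ratio()


def effort_drop(baseline: float, current: float) -> float:
    """Percent drop in effort relative to a baseline week."""
    if baseline == 0:
        return 0.0  # nothing to drop from
    return (baseline - current) / baseline * 100


week_1 = edit_share("a b c d", "a b x y")  # half the words rewritten
week_6 = edit_share("a b c d", "a b c y")  # a quarter rewritten
print(f"effort drop: {effort_drop(week_1, week_6):.0f}%")
```

A per-week series of these numbers is the kind of entry a calibration log could hold, though word-level diffs undercount structural edits such as reordering sections.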

If you are evaluating a content engine for your team, schedule a demo and ask to see the calibration log, not just the published output. The log is where you'll see whether the system is actually learning.
