---
title: Fallback model plan
owner: the operator
cadence: quarterly fire-drill, monthly policy review
first_drill: pending (scheduled 2026-05-15)
last_updated: 2026-04-19
---

# Fallback model plan

We commit in [business-plan/05_risks_and_mitigations.md](../business-plan/05_risks_and_mitigations.md) and on [/rules-of-engagement.html](/rules-of-engagement.html) to a named fallback plan per service line if Claude is unavailable, rate-limited, or if policy changes block a specific workload. This file is the plan.

For each service line, the plan records: the primary model, the named fallback family, the expected quality delta (an honest assessment, not a claim of equivalence), the timeline to cut over, the dependency list (what we would need to rewrite in the templates or pipeline), and the date of the last drill.

## Principles

1. **Claude is primary.** Every service line is designed around Claude today. We do not apologize for the single-vendor dependency; we manage it.
2. **Fallbacks are real, not rhetorical.** For every service line, we have identified a model that we have validated end-to-end on a non-production engagement at least once.
3. **Quality deltas are stated honestly.** If we would ship a lower-quality handbook on the fallback, we say so. Clients would rather know.
4. **Drills are scheduled.** Once per quarter we run a full fire drill: pick a random service line, generate that line's signature output on the fallback, run it through the eval harness, log the result.
5. **Client contracts include the substitution clause.** If we must switch, the client is notified within 48 hours, the substitution is logged in the engagement record, and the client has the option to pause or exit at no penalty.
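
The quarterly fire drill in principle 4 is simple enough to script. A minimal sketch, assuming the real pipeline and eval harness are passed in as callables (`generate` and `score` are illustrative names; no such helpers exist yet):

```python
import json
import random
from datetime import date

SERVICE_LINES = [
    "cybersecurity",
    "web-development",
    "product-development",
    "seo-geo",
    "marketing-leadgen",
]

def run_drill(generate, score, drill_log="drills.jsonl", seed=None):
    """Pick a random service line, generate its signature output on the
    fallback with `generate`, score it with `score` (the eval harness),
    and append the result to the drill log."""
    line = random.Random(seed).choice(SERVICE_LINES)
    output = generate(line)       # run the fallback pipeline
    result = score(line, output)  # same rubric as the primary model
    record = {
        "date": date.today().isoformat(),
        "service_line": line,
        "score": result,
    }
    with open(drill_log, "a") as f:
        f.write(json.dumps(record) + "\n")
    return record
```

Appending to a JSONL log keeps each drill a single immutable record, which maps directly onto the drill table below.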

## Per-service-line plan

### Cybersecurity

- **Primary:** Claude (Opus for drafting, Sonnet for high-volume finding writeups).
- **Fallback family:** OpenAI GPT series (GPT-5 class model at the time of cutover) as general fallback. For structured finding writeups specifically, we have evaluated open-weight Mistral and Llama variants and consider them acceptable for the finding-stub path only.
- **Expected quality delta:** noticeable. Security prose benefits from Claude's instruction-following fidelity. On the fallback, we would expect roughly 35% more editorial time per handbook and a small reduction in narrative coherence across attack paths.
- **Timeline to cut over:** under 72 hours to the OpenAI fallback at the pipeline level. Our templates and eval harness are model-agnostic by design.
- **Dependencies to rewrite:** prompt templates in `templates/security/*` carry model-specific phrasing hints. We maintain a parallel OpenAI-adapted copy, kept in sync quarterly.
- **Last drill:** not yet. First drill scheduled 2026-05-15.
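
The parallel OpenAI-adapted template copy only stays in sync if the inventory check is mechanical. A minimal sketch of such a parity check, assuming the fallback copy lives in a sibling directory (both directory names here are illustrative):

```python
from pathlib import Path

def template_parity(primary_dir, fallback_dir):
    """Return (missing, extra): template filenames present in the primary
    directory but absent from the fallback copy, and vice versa."""
    primary = {p.name for p in Path(primary_dir).glob("*.md")}
    fallback = {p.name for p in Path(fallback_dir).glob("*.md")}
    return sorted(primary - fallback), sorted(fallback - primary)
```

Run as part of the quarterly sync: a non-empty `missing` list means a primary template has no OpenAI-adapted counterpart.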

### Web development

- **Primary:** Claude (Sonnet for code generation, Opus for architectural reasoning).
- **Fallback family:** OpenAI GPT series for general work, plus open-weight Code Llama variants for constrained code generation.
- **Expected quality delta:** moderate. Code quality holds up well on the fallbacks; architecture reasoning degrades more. Our review gate would catch both classes of regression.
- **Timeline to cut over:** under 48 hours.
- **Dependencies to rewrite:** minimal. The web-dev templates are the most model-agnostic in our library.
- **Last drill:** not yet.

### Product development

- **Primary:** Claude (Opus for design and RAG, Sonnet for evaluation harness scoring).
- **Fallback family:** OpenAI GPT series; additionally, for evaluation-harness scoring we have validated Cohere's Command family as a cross-check scorer.
- **Expected quality delta:** moderate. LLM-as-judge scoring is the most model-sensitive step; we maintain a dual-scorer reference set so we can detect drift if we have to swap the judge.
- **Timeline to cut over:** under 72 hours.
- **Dependencies to rewrite:** the eval harness scorer prompts need model-specific tuning. Maintained as a separate module, not template-coupled.
- **Last drill:** not yet.
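
The dual-scorer reference set makes judge drift detectable with nothing more than a paired comparison. A minimal sketch, assuming both scorers emit scores on the same 0-1 scale (the 0.1 threshold is illustrative, not a calibrated value):

```python
def judge_drift(primary_scores, cross_scores, threshold=0.1):
    """Flag drift when the mean absolute disagreement between the primary
    judge and the cross-check scorer exceeds the threshold on the
    reference set. Returns (drifted, mean_gap)."""
    if len(primary_scores) != len(cross_scores):
        raise ValueError("both judges must score the full reference set")
    gaps = [abs(a - b) for a, b in zip(primary_scores, cross_scores)]
    mean_gap = sum(gaps) / len(gaps)
    return mean_gap > threshold, mean_gap
```

If we have to swap the judge, the same comparison runs between the old judge's archived scores and the new judge's scores on the reference set.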

### SEO and GEO

- **Primary:** Claude (Opus for editorial, Sonnet for share-of-answer auditing runs).
- **Fallback family:** OpenAI GPT series for editorial, plus a direct HTTP scraper for the share-of-answer study (we can run the study without any LLM by hitting the answer engines directly; the LLM in this workflow is for post-processing, which is replaceable).
- **Expected quality delta:** small on auditing, moderate on editorial.
- **Timeline to cut over:** under 48 hours.
- **Dependencies to rewrite:** editorial voice templates carry Claude-specific phrasing hints. We maintain a parallel voice calibration for the OpenAI fallback.
- **Last drill:** not yet.
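
The LLM-free path for the share-of-answer study reduces to fetching answer text and counting mentions. A minimal sketch of the post-processing step, assuming the answers have already been fetched (the whole-word, case-insensitive matching rule is illustrative, not our production logic):

```python
import re

def share_of_answer(answers, brands):
    """Fraction of fetched answer texts in which each brand is mentioned
    at least once (case-insensitive whole-word match)."""
    counts = {b: 0 for b in brands}
    for text in answers:
        for b in brands:
            if re.search(rf"\b{re.escape(b)}\b", text, re.IGNORECASE):
                counts[b] += 1
    total = len(answers) or 1
    return {b: counts[b] / total for b in brands}
```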

### Marketing and lead generation

- **Primary:** Claude (Opus for positioning briefs and long-form, Sonnet for outbound personalization).
- **Fallback family:** OpenAI GPT series for all workloads.
- **Expected quality delta:** small to moderate. Positioning work depends on Claude's tendency to hold a voice consistently across long output; the fallback requires more editorial cleanup.
- **Timeline to cut over:** under 48 hours.
- **Dependencies to rewrite:** minimal.
- **Last drill:** not yet.

## Workload types that have no fallback

Being explicit about this is the ethical way to run fallbacks. Two workloads today have no practical substitute:

- **Extended-thinking-heavy research sprints** that use Claude's multi-hundred-thousand-token context. OpenAI's current context ceilings are workable but not equivalent for some engagement types. We would scope any such engagement to avoid a hard dependence on Claude unless the client accepts the restriction.
- **Eval harness judge runs** where we rely on Claude's calibration to our scoring rubric. We maintain a cross-validated Cohere scorer precisely because this workload has the least model-swap tolerance.

## Drill schedule

Quarterly. Recorded here when run.

| Quarter | Date | Service line drilled | Result | Editorial time impact | Notes |
|---|---|---|---|---|---|
| 2026-Q2 | 2026-05-15 (planned) | Cybersecurity | pending | - | First drill. |

After each drill we update the per-service-line entry above with the current quality delta estimate and any templates that need re-tuning.

## Communication to clients

The substitution clause is in the engagement contract at section 8.3 ("Model availability and substitution"). It says: if the primary model is unavailable or policy-blocked, we notify the client within 48 hours, explain the substitution plan, re-price the engagement if the substitution materially affects deliverable quality or timeline, and offer the client a clean exit if the re-pricing is unacceptable. No client has exercised this clause at the time of writing because no substitution has occurred.
