G2.1 Guide · Cross-cutting

How we work with Claude.

What Claude drafts. What Claude verifies. What a human operator verifies. What never leaves a human. This is the map of the division of labor that governs every deliverable we ship.

Length: 18 min · Audience: all roles · Last updated: 2026-04-19

Why this guide exists

The phrase “AI-native” has enough elasticity to mean almost anything. We wrote this guide so the elasticity does not apply to our work. Everything below is a concrete claim about what happens when you engage us. If the claim turns out to be false at some point in the engagement, we have failed the policy, not the other way around.

There are four roles in every deliverable:

  • Claude drafts. Generates the first version of artifacts: report prose, finding writeups, spec sections, page copy, email sequences.
  • Claude verifies. Cross-checks its own output against the source material, against the template schema, against the eval harness. Catches its own obvious errors before we see them.
  • Operator verifies. A named human reviewer signs off on every deliverable. The signature is not symbolic; the operator owns the content in the way a lawyer owns a filing.
  • Operator authors. Some work is never touched by Claude. Client judgment calls, scope negotiations, exploit validation against production, retrospectives. We say so below, explicitly, where it applies.

The seven engagement phases

A standard engagement has seven phases. At each phase, the Claude-operator division is different. We map them here in order.

Phase 1: Intake

Operator-led. Claude involvement: light. We interview the client, walk through the intake questionnaire, and do the first-pass scoping together with the client's team. Claude summarizes the interview transcripts and suggests questions for the next round, but the intake itself is a human conversation. We do not route clients through an AI intake bot. Too much of the useful information in intake is the thing the client almost did not say.

Phase 2: Scoping and quote

Operator-led. Claude involvement: template filling. Claude generates the first draft of the statement of work and pricing line items from the intake notes and our services pricing library. The operator edits for fit and negotiates with the client. The final SOW carries the operator's signature, not Claude's.

Phase 3: Research and discovery

Shared. Claude involvement: heavy. This is where Claude earns most of the efficiency we pass to the client. Ingesting source materials, summarizing them against the handbook's intent markers, pulling claims and citations, producing research briefs for the operator to read before the deep work. The operator reads every brief before acting on it.

Phase 4: Technical work (the varying phase)

Service-dependent. The content of this phase changes by service line. For cybersecurity, it is the pentest and the architecture review; for web and product, it is the build; for SEO and marketing, it is the audit and positioning work.

Claude's involvement here is always bounded by a simple rule: Claude does not make claims that carry legal, safety, or contractual weight without an operator attaching their name to the claim. Exploit validation is the clearest case: we never ship a finding marked “exploitable” without the operator having reproduced it. Claude can say a pattern looks exploitable; saying it is exploitable requires a human who has run the exploit.
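That rule is mechanical enough to encode. Below is a minimal sketch of the idea in Python; the names (Finding, mark_exploitable) are invented for this guide, not taken from our production findings tracker:

```python
# Illustrative sketch only: names invented for this guide, not our real tracker.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Finding:
    title: str
    writeup: str                         # Claude may draft this
    suspected_exploitable: bool = False  # Claude may set this
    exploitable: bool = False            # set only through mark_exploitable
    validated_by: Optional[str] = None   # the named human who ran the exploit

    def mark_exploitable(self, operator: str, repro_notes: str) -> None:
        """Promote 'looks exploitable' to 'is exploitable'. Humans only."""
        if not operator or not repro_notes:
            raise ValueError("an 'exploitable' claim requires a named "
                             "operator and reproduction notes")
        self.exploitable = True
        self.validated_by = operator
```

The shape is the point: there is no path from Claude's output to the exploitable flag. The flag exists only downstream of a named operator and their reproduction notes.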

For build work (web, product), the rule takes a different form: Claude writes code that a human reviews line-by-line before it enters the repo, and every PR is signed by an operator. For SEO and GEO, the rule is that Claude drafts content against the brief but does not ship content that misrepresents the client's position, and the operator reviews for voice and factual accuracy against the phase-1 interviews.

Phase 5: Drafting the handbook

Claude-led, operator-directed. This is where Claude is strongest. The Nexmill template library contains structured templates per service line; the intake plus phase-4 outputs plus the template yield a handbook draft. The operator reads the draft end-to-end before the client sees anything. Sections fail the review on any of three grounds: factual accuracy, voice consistency, or readability. Sections that fail are rewritten by the operator, not re-prompted.
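For the mechanical half of that review, a simplified sketch follows. The field names are invented for illustration, and the third ground, voice consistency, is deliberately absent: it is checked by the operator reading end-to-end, not by code.

```python
# Simplified sketch; the real template library is richer than this.
REQUIRED_FIELDS = {"title", "body", "sources"}
LENGTH_BUDGET_WORDS = 1200  # hypothetical readability threshold

def review_section(section: dict, known_sources: set) -> list:
    """Return the grounds on which a draft section fails mechanical review."""
    failures = []
    missing = REQUIRED_FIELDS - section.keys()
    if missing:
        failures.append(f"schema: missing fields {sorted(missing)}")
    # Factual-accuracy proxy: every citation must point into the
    # engagement's evidence set.
    unknown = [s for s in section.get("sources", []) if s not in known_sources]
    if unknown:
        failures.append(f"accuracy: citations with no evidence trail: {unknown}")
    # Readability proxy: overlong sections go back for a rewrite.
    if len(section.get("body", "").split()) > LENGTH_BUDGET_WORDS:
        failures.append("readability: section exceeds the length budget")
    return failures
```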

Phase 6: Client review and iteration

Operator-led. Claude involvement: processing feedback into structured revisions. The client reads the handbook. Comments come back. The operator triages, decides which changes require a rewrite and which require a clarification, and updates the handbook. Claude helps with the mechanical revision but not with deciding what the client meant by a comment. That is an operator judgment call.

Phase 7: Handover and retrospective

Operator-led. Claude involvement: none on the retrospective. We write our honest-mistake list ourselves. It would be a category error to have the system that made the mistakes grade the mistakes. Retrospectives are how we get better; we do not delegate them.

What never goes to Claude

Three categories of information never leave a human at NexcurAI, regardless of Zero Retention or any other safeguard.

  • Client secrets that are not required for the work. If the engagement does not require the production AWS keys, we do not collect them. If it does require them, they stay in the client's vault and we operate with scoped, time-bounded credentials (a sketch of what that looks like follows this list).
  • Third-party private data that the client is legally forbidden from sharing. Customer PII subject to HIPAA, PCI data, legally privileged material. We either structure the engagement around synthetic or redacted data, or we decline the engagement.
  • Personnel information. Claude never sees staffing decisions, compensation discussions, performance evaluations, or anything adjacent to HR.
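To make the first bullet concrete, this is roughly what scoped, time-bounded credentials look like for the AWS case. The role ARN and session name below are invented, but sts.assume_role is the real boto3 call, and 900 seconds is the shortest session STS permits:

```python
import boto3

sts = boto3.client("sts")
creds = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/engagement-readonly",  # hypothetical
    RoleSessionName="phase4-pentest",
    DurationSeconds=900,  # 15 minutes; the credentials expire on their own
)["Credentials"]

# All work runs on the temporary keys: nothing long-lived to store or leak.
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
```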

How to tell AI-assisted work from AI slop

The test at the client's end is simple. Good AI-assisted work has the following properties:

  • Specific. The writeup references named systems, named people, specific file paths, specific configurations. If every detail is something an LLM could have guessed from a generic description of your company, it probably did.
  • Falsifiable. The work makes claims that you could verify with a week of investigation. Generic best-practices summaries are not falsifiable because they do not refer to your situation closely enough to be wrong about it.
  • Signed. A named human is responsible for the claims. You know who to call if something in the deliverable turns out to be wrong.
  • Honest about itself. A deliverable that acknowledges which parts Claude drafted and which parts the operator authored is a deliverable whose authors understood their own process.

Bad AI-assisted work has the inverse properties: generic, hedged, unsigned, silent about its own origin. If you receive a deliverable from any consultant, AI-native or otherwise, that reads like a generic checklist with your company's name pasted into it, you have paid for slop. Send it back.

The economics of the division of labor

This split is what makes fixed-scope pricing viable. Without Claude, a six-week growth-tier security engagement would cost roughly twice what we charge today; the bulk of the savings come from Claude absorbing the research-and-drafting surface. The operator's time is spent on the judgment-heavy work, which is where human hours should be spent anyway.

The full argument is in the hourly-billing essay, but the short version is this: when Claude does the work Claude is good at, billing you for hours spent on that work is a conflict of interest. We charge for the deliverable.

What Claude is bad at (so you know what to watch for)

  • Long-horizon factual consistency when the source material is large and contradictory. We handle this with structured intake and template constraints; without those, Claude drifts.
  • Tone and voice at the margin. Claude writes fine prose by default; it writes our prose only because the operator keeps voice discipline in review.
  • Knowing what it does not know. Claude has improved materially here, but it still overconfidently states things it has not verified. We treat every Claude output as a claim that needs an evidence trail before it ships.
  • Novel exploit synthesis. Claude is useful for surveying known attack classes against a new surface. It is not useful for inventing new exploit primitives from scratch. That is still a human specialist's job.

What Claude is very good at

  • Structured, constrained drafting against a template. This is where the quality-time tradeoff is steepest.
  • Coverage expansion. Claude rarely misses a surface when given a structured checklist; human operators skip steps under time pressure.
  • Evidence assembly. Claude is excellent at pulling claims plus evidence plus source citations into a structured writeup (the sketch after this list shows the shape).
  • Consistency across a long deliverable. Once voice is set, Claude holds it for longer than most humans can.
  • Iteration speed. The time from “we should restructure chapter 3” to a restructured chapter 3 is minutes, not days.
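The shape of an evidence-assembled claim, as promised in the third bullet. The field names are ours, for illustration only:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Claim:
    statement: str    # e.g. "the staging bucket allows anonymous listing"
    evidence: str     # what was observed: command output, a screenshot path
    source: str       # where it came from: file, URL, interview transcript
    verified_by: str  # the named operator, per the phase-4 sign-off rule
```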

What you should ask us in the first call

  1. Which parts of this engagement would Claude draft, which would an operator verify, and which would an operator author?
  2. How will you show me the provenance of a specific claim in the handbook?
  3. What happens if Claude is unavailable halfway through the engagement?
  4. Who is the named reviewer for this engagement, and what is their review discipline?
  5. What is the eval harness looking for on this work, and can I see an example of a prompt that failed and was rewritten?

We answer all five on the first call. If asking any of them feels like a gotcha, it is not; it is due diligence.
