Research / TB-R-2026-03 · v1.0 · June 11, 2026 · 13 min

The division of labor in AI-era sales development: a delegation framework

Tenbound Research

Abstract. Field experiments establish that generative AI raises novice performance most and that its capability frontier is jagged: quality rises sharply on tasks inside the frontier and judgment degrades on tasks just outside it. This paper translates those findings into an operating instrument for sales development: the delegation map, which assigns every task in the pipeline motion to one of three lanes (delegate, direct, own), attaches a named review point to the first two, and versions as the frontier moves. We specify the construction protocol, the review-point design rules, and the audit cadence.

Pillars: Mastery, Motion


1. Two findings that set the problem

Brynjolfsson, Li and Raymond (2023) measured a generative assistant across roughly five thousand support agents: productivity rose 14 percent on average, with the gains concentrated among the least experienced workers. Dell’Acqua and colleagues (2023) measured 758 consultants on tasks placed deliberately on both sides of the model’s capability frontier: inside it, AI users’ output quality rose by roughly 40 percent; just outside it, AI users were 19 percentage points more likely to be wrong than unaided peers, because fluent output disarms scrutiny.

Productivity gain, support agents: 14% (largest for novices)
Quality uplift, inside frontier: 40% (consultant tasks)
Added error rate, outside frontier: 19 pp more wrong answers

Figure 3. The two headline effects, as reported in the source studies. Left: average productivity gain in the support-agent deployment (largest for novices). Middle and right: the jagged frontier result, quality uplift inside the frontier versus the increase in wrong answers just outside it. Brynjolfsson et al. 2023; Dell'Acqua et al. 2023. Bars show reported magnitudes in their original units.

Together they dissolve the two default postures. “Master the craft first, then add AI” wastes the largest documented gain, which accrues to novices. “Let the AI run” ships the outside-frontier failure at volume. What remains is a placement problem: which tasks sit where, for this team, this quarter.

2. The instrument: three lanes and a review point

The delegation map assigns every recurring task in the motion to one lane.

Delegate. Well inside the frontier; output reliably correct; the system runs it and a human samples it. In current practice: account retrieval, enrichment, signal collection and recency sorting, sequence execution and timing.

Direct. On the frontier; drafts valuable, errors plausible. AI proposes, a human disposes. In current practice: copy, account tiering, research summaries, call preparation.

Own. Beyond the frontier or too consequential to delegate regardless: live conversations, qualification judgment, the ICP hypothesis, anything that commits the company.

The lane assignment is necessary but not sufficient. The automation literature’s oldest result is that unmonitored automation breeds complacency (Parasuraman and Riley 1997; Mosier and Skitka 1996), and that trust must be calibrated to measured reliability rather than to fluency (Lee and See 2004). Therefore every Delegate and Direct entry carries a named review point: who looks, at what sample rate, on what cadence, against what rubric. A map without review points is a wish.
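One way to make the rule checkable is to hold the map as data. The sketch below is illustrative only: the class names, field names, and example entries (including the reviewer, sample rate, and rubric values) are assumptions made for the example, not part of the framework's specification. It records the four elements of a review point and flags any Delegate or Direct entry that lacks one.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Lane(Enum):
    DELEGATE = "delegate"
    DIRECT = "direct"
    OWN = "own"


@dataclass(frozen=True)
class ReviewPoint:
    reviewer: str       # who looks
    sample_rate: float  # fraction of outputs reviewed, 0 < rate <= 1
    cadence: str        # how often, e.g. "weekly" or "before send"
    rubric: str         # what the output is graded against


@dataclass(frozen=True)
class MapEntry:
    task: str
    lane: Lane
    review_point: Optional[ReviewPoint] = None


def missing_review_points(entries):
    """Return the Delegate and Direct tasks that have no review point."""
    return [
        e.task
        for e in entries
        if e.lane in (Lane.DELEGATE, Lane.DIRECT) and e.review_point is None
    ]


# Hypothetical entries, for illustration only.
delegation_map = [
    MapEntry("enrichment", Lane.DELEGATE,
             ReviewPoint("ops lead", 0.05, "weekly", "data accuracy")),
    MapEntry("outbound copy", Lane.DIRECT),  # review point not yet named
    MapEntry("qualification judgment", Lane.OWN),
]

print(missing_review_points(delegation_map))
```

Run against the hypothetical entries, the check names "outbound copy" as the entry that is still a wish.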

3. Construction protocol

The map is built from the task inventory of the actual motion, not from a generic template. For each task: attempt it with the system on ten real instances; grade the ten outputs against the relevant rubric; place the task by its measured failure pattern, not by its demo performance. Failures tend to cluster in a characteristic way: the system is confidently wrong on a recognizable subclass of instances (niche verticals, sparse data, novel formats). That subclass goes into the map as the task’s known weak edge, which the review point samples preferentially.
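The placement step can be sketched as a small function over the graded instances. The paper does not specify numeric cutoffs, so the two thresholds below (`delegate_min`, `direct_min`) are hypothetical, as are the function and field names. The sketch also covers only the measured part of placement: tasks that are too consequential to delegate stay in Own regardless of pass rate.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GradedInstance:
    passed: bool                    # did the output meet the rubric?
    subclass: Optional[str] = None  # e.g. "niche vertical", if one applies


def place_task(results, delegate_min=0.9, direct_min=0.6):
    """Suggest a lane and a weak edge from graded real instances.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    if not results:
        raise ValueError("need at least one graded instance")
    pass_rate = sum(r.passed for r in results) / len(results)
    if pass_rate >= delegate_min:
        lane = "delegate"
    elif pass_rate >= direct_min:
        lane = "direct"
    else:
        lane = "own"
    # The weak edge is the subclass where failures cluster, if any.
    failures = Counter(r.subclass for r in results if not r.passed and r.subclass)
    weak_edge = failures.most_common(1)[0][0] if failures else None
    return lane, pass_rate, weak_edge


# Ten hypothetical graded instances: seven passes, three failures.
results = (
    [GradedInstance(True)] * 7
    + [GradedInstance(False, "niche vertical")] * 2
    + [GradedInstance(False, "sparse data")]
)
print(place_task(results))
```

With these invented grades the task lands in the Direct lane, and "niche vertical" is recorded as the weak edge the review point should oversample.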

Three design rules for review points. Sample rates scale with blast radius: customer-visible output is reviewed before send, internal research is sampled after. Reviewers rotate, because a fixed reviewer habituates. And every review writes one line of record, so the audit has data.
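The three rules are simple enough to express as helpers. This is a minimal sketch under stated assumptions: the function names, the 10 percent internal sample rate, the reading of "reviewed before send" as a full review, the round-robin rotation scheme, and the tab-separated record format are all choices made for the example rather than prescriptions from the paper.

```python
from datetime import date


def review_policy(customer_visible, internal_rate=0.1):
    """Rule 1: review customer-visible output before send; sample internal work after."""
    if customer_visible:
        # Assumption: "reviewed before send" means every item is reviewed.
        return {"timing": "before send", "sample_rate": 1.0}
    return {"timing": "after", "sample_rate": internal_rate}


def reviewer_for(period_index, reviewers):
    """Rule 2: rotate reviewers round-robin, one per review period."""
    if not reviewers:
        raise ValueError("need at least one reviewer")
    return reviewers[period_index % len(reviewers)]


def record_line(day, task, reviewer, verdict, note=""):
    """Rule 3: one tab-separated line of record per review."""
    return "\t".join([day.isoformat(), task, reviewer, verdict, note])


# Hypothetical usage.
team = ["Ana", "Ben", "Cy"]
print(review_policy(customer_visible=True))
print(reviewer_for(4, team))
print(record_line(date(2026, 6, 11), "outbound copy", "Ben", "pass"))
```

Keeping the record to one line per review is what makes the quarterly audit cheap: the log can be filtered by task or reviewer without any further structure.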

4. The map as organizational document

The map is versioned, owned, and audited quarterly, because the frontier moves in both directions: model improvements pull tasks inward, and workflow changes push tasks out. Quarterly, the owner re-runs the ten-instance test on every boundary task and moves what the evidence says to move. New hires read the map on day one; it answers, in one page, what the system does here and what humans sign.
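The quarterly audit can be sketched as a pure function: re-test the boundary tasks, move what the evidence moves, and bump the version only if something changed. The function signature, the integer version scheme, and the `retest` callback (standing in for a fresh ten-instance test) are assumptions made for illustration.

```python
def quarterly_audit(version, lanes, boundary_tasks, retest):
    """Re-test boundary tasks and return (new_version, new_lanes, moves).

    `retest` is a caller-supplied function mapping a task name to the lane
    that the fresh test evidence supports (hypothetical interface).
    """
    new_lanes = dict(lanes)
    moves = []
    for task in boundary_tasks:
        evidence_lane = retest(task)
        if evidence_lane != lanes[task]:
            moves.append((task, lanes[task], evidence_lane))
            new_lanes[task] = evidence_lane
    # Bump the version only when something actually moved.
    new_version = version + 1 if moves else version
    return new_version, new_lanes, moves


# Hypothetical map state and re-test results.
lanes = {"outbound copy": "direct", "account tiering": "direct", "enrichment": "delegate"}
fresh_evidence = {"outbound copy": "delegate", "account tiering": "direct"}

new_version, new_lanes, moves = quarterly_audit(
    1, lanes, ["outbound copy", "account tiering"], fresh_evidence.get
)
print(new_version, moves)
```

Returning the list of moves alongside the new map gives the owner a change log for each version, which is what a new hire or an auditor would read first.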

The map is also the cleanest single artifact of AI-era competence, which is why the Institute scores it in certification: a candidate’s map shows, in one page, whether they understand both their motion and their tools.

5. Limitations

The frontier is task-specific and model-specific; published effect sizes transfer as expectations, not as guarantees, and each team’s ten-instance tests are the binding evidence. The framework also assumes a system worth delegating to: a team at Manual maturity has no execution layer to map, and builds one first.