Research / TB-R-2026-01 · v1.0 · June 11, 2026 · 14 min

The Pipeline Maturity Model: a measurement framework for sales development organizations

Tenbound Research

Abstract. Maturity models fail when they are badges rather than instruments. This paper specifies the Tenbound Pipeline Maturity Model as a measurement framework: four levels (Manual, Assisted, Orchestrated, Autonomous) scored independently across six pillars (Market, Signal, Message, Motion, Mastery, Measurement), with observable descriptors per cell, a scoring protocol, and known failure modes of self-assessment. The instrument is designed so two trained assessors reach the same placement from the same evidence.

Pillar: Measurement · Pillar: Mastery

1. Why a maturity model, and why most fail

Maturity models earned a bad reputation honestly. Most are marketing artifacts: five flattering levels, no scoring protocol, and a vendor’s product waiting at level five. Yet the underlying instrument class is sound. The software industry’s Capability Maturity Model (Paulk et al. 1993) worked because it did three things its imitators skip: it defined observable practices per level, it scored areas independently rather than averaging them into one number, and it treated the level as a diagnosis that prescribes the next improvement, not as an award.

The Pipeline Maturity Model follows that design. Its purpose is operational: place a sales development organization on a ladder precisely enough that the placement prescribes the one move that earns the next rung.

2. The structure: four levels, six pillars, twenty-four cells

The model scores six pillars independently. A team can run Orchestrated signal infrastructure on top of Manual measurement; averaging those into one number would hide exactly the fact that matters. The four levels describe who or what carries the work and how it learns.

Pillar	1 Manual	2 Assisted	3 Orchestrated	4 Autonomous
Market	Lists by title	ICP written	Tiers by fit and timing	Tiering continuously refreshed
Signal	No triggers	Ad hoc triggers	Signal set mapped to plays	Signals fire plays automatically
Message	Generic templates	Some personalization	Signal-led, tested	AI drafts inside QA gates
Motion	Rep memory	Sequencer used	Designed multichannel motion	System executes, humans handle exceptions
Mastery	Untrained habit	Tool training only	Coached craft + AI judgment	Delegation map governs the division
Measurement	Activity counts	Basic conversion	Stage math + quality scored	Continuous, tied to revenue

Figure 1. The measurement grid. Each pillar is scored independently against the four level definitions; the team's operating level is its lowest pillar score, not its average. Descriptors abbreviated here; the full rubric carries three observable descriptors and an evidence requirement per cell. The Tenbound Standard, measurement rubric v1.

The operating level is the lowest pillar score. This is deliberate and follows from the funnel structure of the work: pipeline flows through all six pillars in sequence, so the weakest one sets the system’s effective level, the way the narrowest section sets a pipe’s flow.
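The lowest-pillar rule can be sketched in a few lines. This is a minimal illustration rather than Tenbound's scoring tool: the `Level` enum, the `PILLARS` tuple, and the `operating_level` function are hypothetical names, and the sample scores are the illustrative profile from Figure 2.

```python
from enum import IntEnum


class Level(IntEnum):
    MANUAL = 1
    ASSISTED = 2
    ORCHESTRATED = 3
    AUTONOMOUS = 4


# Pillar names as listed in Figure 1.
PILLARS = ("Market", "Signal", "Message", "Motion", "Mastery", "Measurement")


def operating_level(scores: dict) -> Level:
    """Return the lowest pillar score; refuse to place a partially scored team."""
    missing = [p for p in PILLARS if p not in scores]
    if missing:
        raise ValueError(f"unscored pillars: {missing}")
    return Level(min(scores[p] for p in PILLARS))


# Illustrative profile from Figure 2.
profile = {"Market": 2, "Signal": 3, "Message": 2,
           "Motion": 3, "Mastery": 2, "Measurement": 1}

level = operating_level(profile)
average = sum(profile.values()) / len(profile)

print(level.name)         # MANUAL
print(round(average, 2))  # 2.17
```

The average lands above level 2 while the operating level is 1, which is the gap that a single averaged number would hide.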

3. The scoring protocol

Self-assessment inflates. The pattern is well documented (Kruger and Dunning 1999): the less developed a capability, the less able its operator is to see the gap. The protocol therefore requires evidence per cell, not opinion: a written ICP is read, not asked about; sequences are pulled from the system, not described; the qualification bar is checked against meetings that actually happened.

Three rules make placements reproducible. First, every score cites an artifact (a document, a system view, a number). Second, the assessor scores the descriptor actually met, not the one aspired to; a playbook drafted last week that nobody has run is level 1 evidence with level 3 intentions. Third, borderline cells score down, because the cost of overplacement (skipped foundations) exceeds the cost of underplacement (redundant verification).
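The three rules can be expressed as a small scoring function. The sketch below is one possible encoding under stated assumptions: the `Observation` record, its field names, and the idea of listing "candidate levels" for a borderline cell are hypothetical and are not taken from the rubric itself.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Observation:
    pillar: str
    artifact: str                        # rule 1: the document, system view, or number cited
    candidate_levels: Tuple[int, ...]    # levels the evidence plausibly supports (one entry if clear)
    aspired_level: Optional[int] = None  # rule 2: recorded for context, never scored


def score_cell(obs: Observation) -> int:
    """Score one cell from evidence, applying the three rules."""
    # Rule 1: no artifact, no score.
    if not obs.artifact.strip():
        raise ValueError(f"{obs.pillar}: a score must cite an artifact")
    if not obs.candidate_levels or any(
        lvl not in (1, 2, 3, 4) for lvl in obs.candidate_levels
    ):
        raise ValueError(f"{obs.pillar}: candidate levels must be between 1 and 4")
    # Rule 2: aspired_level is deliberately ignored here.
    # Rule 3: a borderline cell takes the lower candidate.
    return min(obs.candidate_levels)


# The unrun playbook from the text: level 3 intentions, level 1 evidence.
playbook = Observation("Motion", "playbook draft, not yet run", (1,), aspired_level=3)
# A hypothetical borderline cell between levels 2 and 3.
borderline = Observation("Market", "written ICP document", (2, 3))

print(score_cell(playbook))    # 1
print(score_cell(borderline))  # 2
```

Keeping the aspired level as a separate field lets an assessor record the intention without letting it leak into the score.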

4. The empirical pattern

Across diagnostic engagements, the recurring shape is a sawtooth: tooling pillars (Signal, Motion) score one to two levels above judgment pillars (Mastery, Measurement). Teams buy systems faster than they build the judgment to run them, which is precisely the misuse-of-automation pattern the human-factors literature predicts (Parasuraman and Riley 1997).

Pillar	Score
Market	2
Signal	3
Message	2
Motion	3
Mastery	2
Measurement	1 (sets the operating level)

Figure 2. The sawtooth pattern: a representative diagnostic profile, shown as a worked illustration of the most common shape we observe. Tooling pillars outrun judgment pillars; the operating level is set by Measurement at level 1. Calibrated distributions across companies will be published with the annual benchmark. Illustrative profile; pattern described from Tenbound diagnostic practice.
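The sawtooth can be checked mechanically by comparing each tooling pillar with each judgment pillar. The following is a rough sketch, not part of the published rubric: the function names and the `min_gap` threshold of one level are assumptions chosen to match the "one to two levels" description.

```python
# Pillar groupings as named in section 4.
TOOLING = ("Signal", "Motion")
JUDGMENT = ("Mastery", "Measurement")


def sawtooth_gaps(scores: dict) -> dict:
    """Gap (tooling score minus judgment score) for every tooling/judgment pair."""
    return {(t, j): scores[t] - scores[j] for t in TOOLING for j in JUDGMENT}


def is_sawtooth(scores: dict, min_gap: int = 1) -> bool:
    """True when every tooling pillar leads every judgment pillar by at least min_gap."""
    return all(gap >= min_gap for gap in sawtooth_gaps(scores).values())


# Illustrative profile from Figure 2.
profile = {"Market": 2, "Signal": 3, "Message": 2,
           "Motion": 3, "Mastery": 2, "Measurement": 1}

for pair, gap in sawtooth_gaps(profile).items():
    print(pair, gap)
# ('Signal', 'Mastery') 1
# ('Signal', 'Measurement') 2
# ('Motion', 'Mastery') 1
# ('Motion', 'Measurement') 2

print(is_sawtooth(profile))  # True
```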

5. Using the placement

A placement prescribes one move: raise the lowest pillar one level. Not the favorite pillar, not all six at once. The goal-setting literature (Locke and Latham 2002) is unambiguous that specific proximal targets outperform broad ambitions, and the model is built to produce exactly one specific proximal target per quarter. The level guide in the public maturity model states the canonical move per level; the diagnostic tailors it to the team’s evidence.
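Deriving the single move from a placement is a short computation. The sketch below is illustrative only: `next_move` is a hypothetical helper, and breaking ties by the pillar order of Figure 1 is an assumption, since the text does not say which pillar to raise when two share the lowest score.

```python
from typing import Optional, Tuple

# Pillar order as listed in Figure 1; used here as an assumed tie-break.
PILLARS = ("Market", "Signal", "Message", "Motion", "Mastery", "Measurement")
TOP_LEVEL = 4  # Autonomous


def next_move(scores: dict) -> Optional[Tuple[str, int]]:
    """Return (pillar, target_level): the lowest pillar raised by exactly one level.

    Returns None when every pillar is already at the top level.
    """
    lowest = min(scores[p] for p in PILLARS)
    if lowest >= TOP_LEVEL:
        return None
    # Assumed tie-break: the first pillar in Figure 1 order at the lowest score.
    pillar = next(p for p in PILLARS if scores[p] == lowest)
    return pillar, lowest + 1


# Illustrative profile from Figure 2.
profile = {"Market": 2, "Signal": 3, "Message": 2,
           "Motion": 3, "Mastery": 2, "Measurement": 1}

print(next_move(profile))  # ('Measurement', 2)
```

The function returns one target rather than a ranked list, mirroring the model's insistence on one specific proximal target per quarter.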

6. Limitations

The model measures the pipeline creation system, not product-market fit, pricing, or brand, any of which can cap results independently. Level definitions will shift as autonomous execution matures; the rubric is versioned for that reason. Finally, the sawtooth observation, while consistent in our practice, has not yet been tested against the calibrated cross-company distribution that will be published with the annual benchmark survey.