The Pipeline Maturity Model: a measurement framework for sales development organizations
Tenbound Research
Abstract. Maturity models fail when they are badges rather than instruments. This paper specifies the Tenbound Pipeline Maturity Model as a measurement framework: four levels (Manual, Assisted, Orchestrated, Autonomous) scored independently across six pillars (Market, Signal, Message, Motion, Mastery, Measurement), with observable descriptors per cell, a scoring protocol, and known failure modes of self-assessment. The instrument is designed so two trained assessors reach the same placement from the same evidence.
1. Why a maturity model, and why most fail
Maturity models earned a bad reputation honestly. Most are marketing artifacts: five flattering levels, no scoring protocol, and a vendor’s product waiting at level five. Yet the underlying instrument class is sound. The software industry’s Capability Maturity Model (Paulk et al. 1993) worked because it did three things its imitators skip: it defined observable practices per level, it scored areas independently rather than averaging them into one number, and it treated the level as a diagnosis that prescribes the next improvement, not as an award.
The Pipeline Maturity Model follows that design. Its purpose is operational: place a sales development organization on a ladder precisely enough that the placement prescribes the one move that earns the next rung.
2. The structure: four levels, six pillars, twenty-four cells
The model scores six pillars independently. A team can run Orchestrated signal infrastructure on top of Manual measurement; averaging those into one number would hide exactly the fact that matters. The four levels describe who or what carries the work and how it learns.
| Pillar | 1 Manual | 2 Assisted | 3 Orchestrated | 4 Autonomous |
|---|---|---|---|---|
| Market | Lists by title | ICP written | Tiers by fit and timing | Tiering continuously refreshed |
| Signal | No triggers | Ad hoc triggers | Signal set mapped to plays | Signals fire plays automatically |
| Message | Generic templates | Some personalization | Signal-led, tested | AI drafts inside QA gates |
| Motion | Rep memory | Sequencer used | Designed multichannel motion | System executes, humans handle exceptions |
| Mastery | Untrained habit | Tool training only | Coached craft + AI judgment | Delegation map governs the division |
| Measurement | Activity counts | Basic conversion | Stage math + quality scored | Continuous, tied to revenue |
The operating level is the lowest pillar score. This is deliberate and follows from the funnel structure of the work: pipeline flows through all six pillars in sequence, so the weakest one sets the system’s effective level, the way the narrowest section sets a pipe’s flow.
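The weakest-pillar rule can be sketched in a few lines of Python. This is a minimal illustration, not part of the published rubric: the `Level` enum, the `PILLARS` tuple, and the `operating_level` function are hypothetical names chosen here, and the example scores are invented.

```python
from enum import IntEnum
from typing import Mapping


class Level(IntEnum):
    """The four levels of the model, ordered so they can be compared."""
    MANUAL = 1
    ASSISTED = 2
    ORCHESTRATED = 3
    AUTONOMOUS = 4


# Pillar names as they appear in the table above.
PILLARS = ("Market", "Signal", "Message", "Motion", "Mastery", "Measurement")


def operating_level(scores: Mapping[str, Level]) -> Level:
    """Return the lowest pillar score; refuse to place a partially scored team."""
    missing = [p for p in PILLARS if p not in scores]
    if missing:
        raise ValueError(f"unscored pillars: {missing}")
    # No averaging: the minimum is the placement.
    return min(scores[p] for p in PILLARS)


# Invented example: Orchestrated signal work on top of Manual measurement.
example = {
    "Market": Level.ORCHESTRATED,
    "Signal": Level.ORCHESTRATED,
    "Message": Level.ASSISTED,
    "Motion": Level.ORCHESTRATED,
    "Mastery": Level.ASSISTED,
    "Measurement": Level.MANUAL,
}
print(operating_level(example).name)
```

For the invented scores above, the placement is Manual even though three pillars sit at Orchestrated, which is the behavior the pipe analogy describes.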
3. The scoring protocol
Self-assessment inflates. The pattern is well documented (Kruger and Dunning 1999): the less developed a capability, the less able its operator is to see the gap. The protocol therefore requires evidence per cell, not opinion: a written ICP is read, not asked about; sequences are pulled from the system, not described; the qualification bar is checked against meetings that actually happened.
Three rules make placements reproducible. First, every score cites an artifact (a document, a system view, a number). Second, the assessor scores the descriptor actually met, not the one aspired to; a playbook drafted last week that nobody has run is level 1 evidence with level 3 intentions. Third, borderline cells score down, because the cost of overplacement (skipped foundations) exceeds the cost of underplacement (redundant verification).
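One way to encode the three rules is shown below, as a sketch under stated assumptions. The `CellEvidence` record and `score_cell` function are hypothetical names, and two interpretations are ours rather than the protocol's: "scores down" is read as dropping one level, and a missing artifact is treated as an error instead of a low score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellEvidence:
    """What an assessor records for one pillar (hypothetical structure)."""
    pillar: str
    artifact: str             # Rule 1: the document, system view, or number cited
    level_in_use: int         # Rule 2: the descriptor actually met in practice
    level_aspired: int        # recorded for context, never scored
    borderline: bool = False  # Rule 3: evidence only partly meets level_in_use


def score_cell(evidence: CellEvidence) -> int:
    """Apply the three rules and return a level from 1 to 4."""
    # Rule 1: no artifact, no score (assumption: treated as an error here).
    if not evidence.artifact.strip():
        raise ValueError(f"{evidence.pillar}: score cites no artifact")
    if not 1 <= evidence.level_in_use <= 4:
        raise ValueError(f"{evidence.pillar}: level must be between 1 and 4")
    # Rule 2: level_aspired is deliberately ignored.
    score = evidence.level_in_use
    # Rule 3: borderline cells drop one level (assumption), never below 1.
    if evidence.borderline:
        score = max(1, score - 1)
    return score


# The unrun playbook from the text: level 3 intentions, level 1 evidence.
playbook = CellEvidence(
    pillar="Motion",
    artifact="playbook draft, not yet run",
    level_in_use=1,
    level_aspired=3,
)
print(score_cell(playbook))
```

Keeping `level_aspired` on the record while excluding it from the score keeps the gap between intention and practice visible without letting it affect the placement.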
4. The empirical pattern
Across diagnostic engagements, the recurring shape is a sawtooth: tooling pillars (Signal, Motion) score one to two levels above judgment pillars (Mastery, Measurement). Teams buy systems faster than they build the judgment to run them, which is precisely the misuse-of-automation pattern the human-factors literature predicts (Parasuraman and Riley 1997).
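The sawtooth can be expressed as a simple gap between the two pillar groups. The sketch below is illustrative only: `sawtooth_gap` is a hypothetical helper, the use of a mean per group is our assumption for summarizing the pattern, and the scores are invented. The gap describes a team's shape; it does not replace the lowest-pillar placement.

```python
from statistics import fmean
from typing import Mapping

# Groupings as named in the text.
TOOLING_PILLARS = ("Signal", "Motion")
JUDGMENT_PILLARS = ("Mastery", "Measurement")


def sawtooth_gap(scores: Mapping[str, int]) -> float:
    """Mean tooling score minus mean judgment score (positive = sawtooth)."""
    tooling = fmean(scores[p] for p in TOOLING_PILLARS)
    judgment = fmean(scores[p] for p in JUDGMENT_PILLARS)
    return tooling - judgment


# Invented example: systems bought ahead of the judgment to run them.
team = {
    "Market": 2,
    "Signal": 3,
    "Message": 2,
    "Motion": 3,
    "Mastery": 2,
    "Measurement": 1,
}
print(sawtooth_gap(team))
```

A gap between one and two for a given team would match the shape described above; a gap near zero would indicate tooling and judgment developing together.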
5. Using the placement
A placement prescribes one move: raise the lowest pillar one level. Not the favorite pillar, not all six at once. The goal-setting literature (Locke and Latham 2002) is unambiguous that specific proximal targets outperform broad ambitions, and the model is built to produce exactly one specific proximal target per quarter. The level guide in the public maturity model states the canonical move per level; the diagnostic tailors it to the team’s evidence.
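The prescription rule can be sketched as follows. `next_move` is a hypothetical helper, and the tie-break is an assumption the model does not state: when several pillars share the lowest score, this sketch picks the one listed first in the table.

```python
from typing import Mapping, Optional, Tuple

# Pillar order as listed in the table; used only to break ties (assumption).
PILLARS = ("Market", "Signal", "Message", "Motion", "Mastery", "Measurement")
TOP_LEVEL = 4


def next_move(scores: Mapping[str, int]) -> Optional[Tuple[str, int]]:
    """Return (pillar, target level) for the single move, or None at the top."""
    # min() keeps the first minimal element, so ties resolve in table order.
    pillar = min(PILLARS, key=lambda p: scores[p])
    current = scores[pillar]
    if current >= TOP_LEVEL:
        return None  # every pillar is already Autonomous
    return pillar, current + 1


# Invented example scores.
team = {
    "Market": 3,
    "Signal": 3,
    "Message": 2,
    "Motion": 3,
    "Mastery": 2,
    "Measurement": 1,
}
print(next_move(team))
```

For these scores the output is one target, Measurement to level 2, even though Message and Mastery also trail. That matches the one-move-per-quarter constraint.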
6. Limitations
The model measures the pipeline creation system, not product-market fit, pricing, or brand, any of which can cap results independently. Level definitions will shift as autonomous execution matures; the rubric is versioned for that reason. And the sawtooth observation, while consistent in our practice, has yet to be checked against the calibrated cross-company distribution that will be published with the annual benchmark survey.