The wrong unit of measurement

On hourly billing, fixed-fee projects, SaaS subscriptions, and what becomes possible once AI cost structures eliminate the bootstrap problem agencies have always tripped on.

5 May 2026 · 16 min read

Service-business pricing has three dominant shapes. We bill by the hour, we bill by the project, or we bill by the seat. Each shape solves a real operational problem — predictability for the buyer, predictability for the seller, predictability for the accounting close at month-end — and each, in solving its problem, structurally misaligns the seller's incentives with the buyer's outcomes. The misalignments are different in each case, and the literature has been pointing at them for forty years.

The literature has also been pointing at an alternative. Nassim Nicholas Taleb's Skin in the Game (Random House, 2018) is the surface-level statement of it; the agency-cost work of Michael Jensen and William Meckling (1976), the moral-hazard-and-observability paper Bengt Holmström published in 1979, George Akerlof and Robert Shiller's Phishing for Phools (Princeton, 2015), Barry Nalebuff's Split the Pie (Harper Business, 2022), and Eliyahu Goldratt's The Goal (North River Press, 1984) sit underneath it. The books and papers do not agree on everything, but they converge on a single structural diagnosis. Pay the agent for what the principal can verify and cares about. Do not pay them for inputs. Do not pay them for promises. Do not pay them for the right to use software whose surplus the buyer cannot measure. Pay them for the surplus, when it shows up, and split the surplus.

What is new is that this fourth shape, which the literature has supported since at least the seventies, has been impractical for small agencies the entire time. To run a workflow on a value-share basis, an agency had to carry the build cost upfront — engineering-team-weeks of pre-revenue delivery against a future stream that may or may not arrive. The standard answer was a refundable advance, recoverable from the customer's share. The advance is, in disguise, a fixed-fee implementation: the agency is paid for delivery, not outcome, and the misalignment the rest of the structure was supposed to fix is back in the room.

The cost structure of agentic AI workflows changes the arithmetic. The agency that runs the same orchestration loops it ships does not have a bootstrap problem — its existence already depends on those loops working. The marginal build cost of a customer-specific workflow is operator-hours, not engineering-team-weeks. The bootstrap problem the literature tripped on for forty years dissolves at the new cost structure, and the alternative the literature has been pointing at the whole time becomes available — at agency scale, without the advance, without the disguise.

Why hourly billing pays for the wrong thing

The most useful framing of what is wrong with hourly billing is Taleb's. Taleb's claim in Skin in the Game is structural, not moral: an advisor who gets paid the same whether the recommendation works or fails is selecting for plausible-sounding output, not for output that survives contact with reality. The advisor whose payoff does not depend on whether they were right has no skin in the game; the advisor with no skin in the game cannot, by construction, be in the game. Taleb spends most of the book on the political and philosophical consequences of this asymmetry, but the operational consequence in service work is direct.

The hourly-billed consultant is paid for inputs — for hours occupied, sometimes for hours present. The unit of measurement is the agent's time, and the agent's time is the agent's own resource; selling it more cheaply or in greater quantities is the agent's lever to pull, not the principal's. The principal who hired the consultant wanted an outcome; what they get to verify is occupancy. Jensen and Meckling, in their 1976 “Theory of the Firm: Managerial Behavior, Agency Costs and Ownership Structure” paper in the Journal of Financial Economics, named this gap agency cost — the wedge that opens between what the principal would pay for an output they could verify and what they have to pay when the only thing they can verify is the agent's effort. The wedge is borne by the principal because the agent has the better information.

The structural argument matters here, not the moral one. Many hourly-billed consultants are diligent, conscientious people who would never let a recommendation degrade unmonitored; the point is that the structure invites the failure mode and selects who stays in the industry over time. An agent paid the same whether the work survives is competing on a different axis from one paid only when the work survives, and over a long enough horizon the two axes pull in different directions. Holmström's 1979 paper “Moral Hazard and Observability” in the Bell Journal of Economics formalised the result: when effort is unobservable and outcomes are noisy, a contract that does not condition pay on outcomes will produce less effort than a contract that does, even when the people involved are the same. The result is more than forty years old.

Why fixed-fee implementations front-load risk on the wrong party

The standard alternative to hourly is to fix the fee. The buyer agrees to pay a defined sum for a defined deliverable; the seller commits to ship that deliverable for that sum. This shape solves Taleb's problem at the moment of contract — the seller is exposed to overrun — and reintroduces it the moment the deliverable ships. After delivery, no further exposure. After delivery, no further skin.

This is the failure mode Akerlof and Shiller catalogue in Phishing for Phools. Their thesis is that when sellers' incentives point away from buyers' outcomes, markets reliably produce products whose appearance optimises for the seller's payoff at the cost of the buyer's welfare. Their health-insurance chapter is the cleanest worked example: insurers compete on premium price at the moment of sale, where the buyer can comparison-shop, while the actual product — claims paid when claims are filed — is not visible until years after purchase, by which time the buyer is locked in and switching is costly. The market clears on the visible margin (premium) and degrades on the invisible margin (denial rates, network erosion, prior-authorisation friction); buyers reliably end up with policies that look good at signing and produce worse outcomes than the price would suggest. The structural pattern is the same one that recurs in technology consulting: a fixed-fee implementation followed by silent degradation. The workflow stops producing because nobody is paid to maintain it. The customer holds the entire operating-tail risk. The seller, who got paid in full at delivery, has no incentive to notice that the workflow has degraded; if the customer notices, the seller will quote a fresh fixed-fee for the maintenance work. This is not bad faith. This is what the structure rewards.

There is a genuine exception. Some implementations really are one-time — an integration build with a defined cutover, a migration with no operating tail, a piece of work whose value lands at delivery and not afterwards. Fixed-fee is the right shape for those engagements. The mistake is to extend the shape to engagements where the value does have an operating tail, where the workflow has to keep producing for the value to be real. In those cases the fixed fee is the seller's exit; the customer's exposure begins where the seller's ends.

Why subscription decouples revenue from delivery

The third shape is the SaaS subscription. The price is a function of seat count, or feature tier, or storage, or compute — not a function of the surplus the software creates. This is the shape that rules contemporary software, and it is the shape least visible as a misalignment because the buyer has agreed to it under the framing of low monthly cost rather than ongoing accountability.

The structural pathology is degradation that is invisible by construction. The customer who keeps paying for a workflow that has quietly degraded has no signal that anything is wrong; the seller who keeps collecting MRR has no incentive to notice. By the time the customer churns, the workflow has been failing for months. The seller, looking at retention curves, sees a churn event and runs a win-back campaign; the customer, looking at the workflow, sees a thing that stopped working without ever telling them. Neither party has a unit of measurement that surfaces the gap between paying for something and getting something.

The subscription model also dilutes Nalebuff's point about surplus. The price is set by what the market will bear at the seat level — usually some compromise between the customer's willingness to pay and the seller's competitive set — and it is fixed across customers regardless of how much surplus the software creates for any particular one. A buyer for whom the software creates a hundred dollars of surplus per month pays the same as a buyer for whom it creates a thousand. The first buyer is overpaying; the second is underpaying. Neither price is a function of the value created. This is not an accident. The model exists because pricing on surplus is hard and pricing on seats is easy, and the easy answer wins until the difficulty of the right answer comes down.
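The mismatch can be put in numbers. What follows is a minimal sketch, not anyone's actual pricing: the two surplus figures come from the paragraph above, while the flat seat price of 80 and the function name are hypothetical, chosen only for illustration. The comparison point is an equal split of each buyer's surplus.

```python
def overpayment_vs_equal_split(seat_price: float, monthly_surplus: float) -> float:
    """Seat price minus half the buyer's surplus.

    Positive: the buyer pays more than an equal split would charge.
    Negative: the buyer pays less.
    """
    return seat_price - monthly_surplus / 2


SEAT_PRICE = 80.0  # hypothetical flat monthly price, identical for every buyer

# Monthly surplus the software creates for each buyer.
small_buyer_gap = overpayment_vs_equal_split(SEAT_PRICE, monthly_surplus=100.0)
large_buyer_gap = overpayment_vs_equal_split(SEAT_PRICE, monthly_surplus=1_000.0)

print(f"small buyer gap: {small_buyer_gap:+.0f}")
print(f"large buyer gap: {large_buyer_gap:+.0f}")
```

Under these assumed figures the same price sits above an equal split for one buyer and far below it for the other, which is the sense in which neither price is a function of the value created.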

The fourth shape: split the surplus

Nalebuff's Split the Pie makes a simple, principled claim. In a two-party negotiation, the only equitable split is fifty-fifty, but not of the gross outcome. The split applies to the surplus: the portion of the joint outcome that exceeds the sum of each party's best alternative without the deal. Take the simplest case, where the agency's own best alternative is normalised to zero. If the customer's best alternative without us is some downstream profit of X, and the joint outcome with us is Y where Y exceeds X, then the pie is the difference, Y minus X, and an equal split of the pie means each party walks away with what they could already capture without the other, plus half of what the deal jointly creates.
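The arithmetic is short enough to write down. This is a minimal sketch under stated assumptions: the function name `split_the_pie`, the `Split` record, and every figure are hypothetical, and the agency's own best alternative is a parameter that defaults to zero so the pie reduces to Y minus X.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Split:
    pie: float              # surplus the deal creates beyond both fallbacks
    customer_payoff: float  # customer's fallback plus half the pie
    agency_payoff: float    # agency's fallback plus half the pie


def split_the_pie(joint_outcome: float,
                  customer_alternative: float,
                  agency_alternative: float = 0.0) -> Split:
    """Split the surplus equally, not the gross outcome."""
    pie = joint_outcome - (customer_alternative + agency_alternative)
    if pie <= 0:
        raise ValueError("no surplus: the deal creates nothing beyond the fallbacks")
    half = pie / 2
    return Split(pie=pie,
                 customer_payoff=customer_alternative + half,
                 agency_payoff=agency_alternative + half)


# Hypothetical figures: X = 100,000 without the deal, Y = 160,000 with it.
result = split_the_pie(joint_outcome=160_000, customer_alternative=100_000)
print(result)
```

The two payoffs always add up to the joint outcome, and each party ends at least as well off as its fallback, so the only thing left to dispute is the size of the inputs.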

The genius of the framing is that it dissolves the question “is the price fair” into a measurement question. Fair is not a number. Fair is the symmetry of the split given the surplus. The customer cannot argue that the price is too high without arguing that the surplus is smaller than agreed; the seller cannot argue that the price is too low without arguing that the surplus is larger than agreed. Disagreement collapses to measurement, and measurement is tractable in a way that disagreement about fairness never is.

This sits on top of Holmström's structural result: when outcomes are observable but effort is not, the contract that maximises joint surplus pegs the agent's pay to the outcome at the margin. In the simplest two-party case where the principal and agent are jointly working on a single observable surplus, the agent's share at the margin should be such that the agent's incentive to add a marginal unit of effort matches the principal's incentive to receive that unit of effort — which, when the surplus is bilateral and both parties have outside options, points toward an equal split. Holmström's 1979 result is a positive theorem about effort under unobservability; Nalebuff's split-the-pie is a normative claim about equitable bargaining in the bilateral case. The two are not the same theorem — they sit on different sides of the positive/normative line — but they point at the same answer when the surplus is bilateral and observable: peg the agent's pay to the surplus at the margin, split it equally.

And then Goldratt. The Goal is not a paper on pricing — Goldratt's 1984 management novel is about manufacturing throughput — but it is the source of the “throughput, not local efficiency” reframe that the rest of the argument rests on. Goldratt's point, restated for service work, is that efficiency at any individual workstation is irrelevant if the workstation is not the bottleneck. The unit of measurement that matters is the rate at which the system as a whole produces what the customer pays for. Local efficiency — how busy any particular agent looks — is, at best, decorative; at worst, it is what produces inventory pile-up at the next stage. The pricing question and the throughput question turn out to be the same question. Pay for throughput. Define throughput as a unit the customer can verify. Split the surplus that throughput produces.

The structure of the alternative is: identify the joint outcome with the workflow running, identify the customer's best alternative without it, take the difference as the per-unit surplus, split the surplus fifty-fifty per successful unit produced. The mechanism is not the essay's subject — mechanism details are an engagement-specific question and an essay that litigates the mechanism is doing a different job from the one this essay is doing — but the shape of the answer is what the literature has been pointing at the whole time. The agency's revenue is exactly half the surplus the workflow creates, summed across units produced. There is no upfront fee. There is no monthly retainer. There is no per-seat charge. The agency's revenue depends on the workflow continuing to produce; if it stops producing, the revenue stops with it.
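The shape described above can be sketched as a per-unit calculation. This is an illustration only, not the engagement mechanism the essay deliberately leaves out: the function names, the unit values, and the monthly volumes are all hypothetical.

```python
def per_unit_surplus(outcome_with_workflow: float, best_alternative: float) -> float:
    """Surplus per successful unit: joint outcome minus the customer's fallback."""
    return outcome_with_workflow - best_alternative


def agency_revenue(successful_units_by_month: list[int],
                   unit_surplus: float,
                   agency_share: float = 0.5) -> list[float]:
    """Agency revenue per month: its share of the surplus on units actually produced."""
    return [units * unit_surplus * agency_share
            for units in successful_units_by_month]


# Hypothetical engagement: each unit is worth 400 with the workflow, 250 without it.
surplus = per_unit_surplus(outcome_with_workflow=400.0, best_alternative=250.0)

# Hypothetical volumes: the workflow degrades, then stops producing entirely.
units = [40, 40, 30, 10, 0, 0]
monthly = agency_revenue(units, surplus)

print(monthly)       # one revenue figure per month
print(sum(monthly))  # total agency revenue over the period
```

With no upfront fee, retainer, or seat charge in the calculation, the months with zero successful units produce zero revenue, which is the property the paragraph above relies on.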

The bootstrap problem the literature has always tripped on, and what changed

The textbook objection to per-successful-output pricing is the bootstrap. Agencies carry delivery cost upfront, and weeks of team time at risk against a future revenue stream that may or may not arrive is real exposure. The standard answer used to be a refundable delivery advance, recoverable from the customer's share. The advance is, in disguise, a fixed-fee implementation: the agency is paid for delivery, not outcome, and the misalignment the rest of the structure was supposed to fix is back in the room. This is the trap every previous attempt at value-pricing in services has fallen into.

What changed is the cost structure of building. An agency that builds each engagement with the same AI operations product it runs on internally has a marginal build cost per engagement that is operator-hours, not engineering-team-weeks. The build is part of how the agency runs as an agency: the same orchestration loops the agency uses internally are the loops it ships externally. The agency's ongoing operational dependence on those loops working is the structural answer to Taleb's question of where the skin is. The agency that ships a workflow on a value-share basis and then watches it degrade is, in the AI-dogfooded case, an agency whose own internal operations have degraded; the failure is felt on both sides of the contract, by construction. As the operator put it on the day this was written:

My commitment is to dogfood my own product really. This repo is evidence of that.

That is not a positioning claim. It is the conclusion the cost structure forces. The advance was always a workaround for the fact that an agency's operating dependence on its own product was abstract — the agency could survive a customer's workflow degrading, because the agency's revenue came from many customers and its internal operations did not depend on any one workflow working. At the new cost structure that is no longer true. The same loops that produce the customer's deliverables produce the agency's; the agency cannot route around its own product's failure. The dogfooding is structural, not aspirational, and the bootstrap problem disappears with it.

Drucker's caveat: the wrong work done well

None of this dissolves the harder question that sits before the pricing question. Peter Drucker, in “Managing for Business Effectiveness” in the Harvard Business Review in May 1963, gave the field its sharpest line on this: there is nothing so useless as doing efficiently that which should not be done at all. The phrase is widely misattributed to The Effective Executive (1967), which restated the idea, but the canonical source is the 1963 article, and the surrounding paragraphs are a sustained argument that effectiveness — doing the right thing — categorically precedes efficiency — doing things right.

The point in the context of this essay is that no value-share mechanism can fix on its own the problem of a workflow that should not exist. A workflow can be successful by every per-unit metric — units produced, surplus calculated, share paid — while the underlying service line is the wrong thing. Revenue from a product the market should reject. Throughput of busywork that competitors are about to disrupt. A workflow that produces more documents nobody reads, faster. The mechanism prices what is being produced; it cannot price whether that thing should be produced.

The gate question must therefore come before the pricing question. If doubling the output moves the customer's profit and loss by a calculable multiple, the unit being produced is the customer's profit lever, and a value-share over surplus is a coherent thing to price. If doubling the output produces unused capacity, queued work, or noise, the unit is detached from the lever; no pricing mechanism, however well-aligned, will bridge the detachment. The engagement decision precedes the pricing decision; both halves have to be right.
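The gate can be phrased as a check against a model of the customer's profit. The sketch below is one hedged way to do that, assuming the customer can supply such a model: `passes_gate`, the two example profit functions, the margin of 300 per report, and the 1.5 threshold are all hypothetical choices, not figures from any engagement.

```python
from typing import Callable


def passes_gate(profit_at: Callable[[float], float],
                current_output: float,
                min_multiple: float = 1.5) -> bool:
    """Return True if doubling output raises profit by at least `min_multiple`.

    `profit_at` maps units produced per period to the customer's profit.
    """
    baseline = profit_at(current_output)
    doubled = profit_at(2 * current_output)
    if baseline <= 0:
        # A multiple is not meaningful from a zero or negative base,
        # so fall back to asking whether profit improves at all.
        return doubled > baseline
    return doubled / baseline >= min_multiple


def sold_reports(units: float) -> float:
    """Unit is the profit lever: every report produced is sold at a margin of 300."""
    return 300.0 * units


def capped_demand(units: float) -> float:
    """Unit is detached: demand caps sales at 20, extra output sits unused."""
    return 300.0 * min(units, 20)


lever_result = passes_gate(sold_reports, current_output=20)
detached_result = passes_gate(capped_demand, current_output=20)
print(lever_result, detached_result)
```

The second case is the one to watch for: the workflow can double its output and every per-unit metric can look healthy while the customer's profit does not move.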

The first time we faced this gate was in the engagement that became Chattel Valuations NZ. The customer arrived with a depreciation spreadsheet that needed correcting: a few days of agency time, a fixed fee, a cleanup. The narrow ask passed no part of the value-share gate: the spreadsheet did not recur, there was no per-unit surplus for the mechanism to bind to, and once the fix landed the agency would walk away. Accepting the narrow ask led directly to the fixed-fee pathology; declining it led to no engagement at all. The path that turned out to be available was a third one: surface a wider hypothesis where the throughput-doubles-profit-doubles relationship holds. The wider hypothesis we proposed was a workflow whose per-report production time would be the customer's profit lever, where doubling throughput would translate directly to revenue against fixed delivery capacity. At that level the unit of work, one report sold, was simultaneously the workflow's output and the customer's revenue lever. The gate answered yes, and the engagement proceeded as a value-share over surplus. Without the reframe we would not have engaged on a value-share basis. We would have declined, and the customer's underlying problem, which was not the spreadsheet but the production rate, would have remained unsolved by anyone.

The reframe is itself the first skin-in-the-game move of any engagement. Saying “no, your stated scope will not move your business; here is what would” is an exposed claim. It requires knowing which lever moves the customer's profit and publicly staking a position on it. Vendors who accept whatever scope they are handed are not agents in the Taleb sense; they are order-takers. The reframe is where we make the first exposed claim of any engagement, before the mechanism has anything to price.

A wider frame

The same misalignment pattern shows up well beyond agency-customer engagements. Legal billing by the hour selects for hours billed, not for case outcomes; the lawyer who finishes the brief in two hours instead of eight is paid, at the same hourly rate, a quarter of what the slower lawyer earns for the same outcome. The structure penalises the very efficiency the client is paying for. Management consulting on a fixed fee selects for the deliverable shape over the implementation; the strategy deck lands, the consultant leaves, and whether the strategy survives contact with the organisation is no longer the consultant's problem. Salaried senior engineers paid the same whether or not their judgement was right are insulated from the consequences of the architecture decisions they make; the architectural drift compounds across teams who do not bear the maintenance cost of it.

The diagnostic frame is portable. Wherever the seller is paid for inputs the buyer cannot verify, or for promises the buyer cannot enforce, or for the right to use a thing whose surplus is invisible to both parties, the structure is selecting for plausible-sounding output over output that survives. The literature has been pointing at this for forty years. The cost structure of AI-dogfooded agencies makes the alternative practical for one corner of the economy. There are other corners.