My good friend Paul Kinlan wrote a piece on what he calls The Token Salary, the idea that companies should give engineers a token budget alongside their compensation, sized at roughly 50% of their salary. He walks through two models: Steve Yegge’s version, where you cut headcount to fund tokens for the remaining engineers, and Jensen Huang’s version, where the token budget is additive. Paul’s key insight is that neither model accounts for regional salary differences. A Bay Area engineer earning $400K gets a $200K token budget. A London engineer earning $150K gets $75K. Same company, same 50% rule, wildly different AI throughput per dollar. His argument: a strong engineer in a lower-cost region paired with a larger token budget is where the market is heading.
The Same Disparity, a Different Lens
I found this article particularly interesting because of who wrote it. Paul used to work for me at Google. I was based in the Bay Area, managing engineering teams across multiple regions, and I was acutely aware of both the salary disparity and each engineer’s performance. Honestly, the math never quite made sense: the cost of an engineer rarely tracked their output, but that is how most companies operate. Paul now does my old role at Google, still based in London. He is looking at the same disparity I once managed, but from the other side, as someone whose own compensation is shaped by it.
Value per Token, Not Volume per Dollar
This raises an old question: what are the unit economics of engineering? Do you measure cost per feature, cost per customer served, or the cost of defects? Salary, overhead, benefits, tooling, time spent, all weighed against what gets delivered. We have always struggled to quantify that. Now we are adding a new variable: the cost of tokens.
Paul’s framework optimizes for token volume per dollar. But volume is not value. The real question, for the same $600K of total spend, is: does one Bay Area engineer ($400K salary + $200K tokens) outperform 2.67 London engineers at $225K each ($150K salary + $75K tokens)?
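For concreteness, here is the back-of-the-envelope version of that comparison, using only the illustrative figures above. One side effect of a flat 50% rule falls out of the arithmetic: tokens are always one third of total cost, so both options buy exactly the same token volume; what changes is how many humans direct it.

```python
# Back-of-the-envelope comparison, using the illustrative figures from the text.
bay_area = {"salary": 400_000, "tokens": 200_000}
london = {"salary": 150_000, "tokens": 75_000}

def total_cost(engineer):
    return engineer["salary"] + engineer["tokens"]

budget = total_cost(bay_area)              # $600K total spend
headcount = budget / total_cost(london)    # ~2.67 London engineers for the same money

# Under a flat 50% rule, tokens are salary / 2, i.e. one third of total cost,
# so the total token spend is identical on both sides of the trade.
print(f"{headcount:.2f} London engineers per Bay Area engineer")
print(f"Token spend: ${bay_area['tokens']:,} vs ${headcount * london['tokens']:,.0f}")
```

Same $200K of tokens either way. The trade is one expensive pair of hands versus almost three cheaper ones, all drawing on the same pool of compute.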
And a related question that nobody seems to be asking: is the relationship between token spend and output quality even linear? If you give an engineer twice the token budget, do you get twice the output? My instinct says no. A skilled engineer who knows exactly what to ask, how to decompose a problem, and when to reject a bad completion will extract disproportionately more value from every token. At some point, more tokens just means more noise to evaluate. On the other hand, a less skilled engineer could burn the same number of tokens and simply accelerate the production of bad code, deepening your company’s technical debt faster than ever.
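To make that instinct concrete, here is a toy model. It is entirely my assumption, not anything from Paul’s piece or any measured data: the `output` function and its logarithmic curve are illustrative stand-ins for whatever the real relationship turns out to be.

```python
import math

def output(skill: float, token_budget: float) -> float:
    """Toy model: value produced grows with the log of token spend,
    scaled by how well the engineer directs it. Purely illustrative;
    the true curve is unknown."""
    return skill * math.log1p(token_budget / 1_000)  # budget in $1K units

# Doubling the token budget for the same engineer:
base = output(skill=1.0, token_budget=75_000)
doubled = output(skill=1.0, token_budget=150_000)
print(f"2x tokens -> {doubled / base:.2f}x output")   # ~1.16x, not 2x

# A more skilled engineer on the smaller budget:
skilled = output(skill=1.5, token_budget=75_000)
print(f"1.5x skill -> {skilled / base:.2f}x output")  # 1.50x
```

If anything like this shape holds, doubling the budget buys a fraction of the gain, while skill multiplies every dollar of it. And the model is generous: it assumes extra tokens are merely less useful, not actively harmful the way accelerated bad code is.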
The metric that matters is not tokens consumed but value produced per token, and that depends almost entirely on the human directing the spend. Which brings us back to San Francisco. Is the Bay Area premium paying for better judgment, or just more expensive judgment?
I think the companies that win will be the ones that learn to hire for value per token. Engineers who make AI more productive, not engineers whose mistakes AI accelerates. No LeetCode problem will tell you which one you are sitting across from.