What Your Billing System Can't Do (Hint: Entitle)

This blog post was initially published at the Financial Engineer, a substack written by our CTO Anton Zagrebelny.

Three weeks ago, Anthropic launched Claude Cowork and wiped $285 billion off software stocks in a single day.

Thomson Reuters dropped 18%. LegalZoom cratered. Salesforce, Atlassian, HubSpot. All down. The market had a name for it within hours: the SaaSpocalypse.

The thesis was simple and brutal: if an AI agent can do the work, why pay for the software? And if a competitor can ship better software at the same price and the same margins, why not switch? Seat-based SaaS is dead. Cancel the subscriptions. Sell the stocks.

As someone who works on how software companies monetize, I didn’t see an extinction event. I saw the market finally catching up to what engineers have known for a while. The subscription model was already breaking. It just took a stock crash for everyone else to notice.

And if you follow that thread to its logical conclusion, you land on a question that almost nobody in the industry is asking clearly enough: if the unit of value is no longer a seat or a subscription, what system actually controls who can do what, how much, and under what terms?

The answer is entitlements.

In last week’s post, I argued that financial engineering is becoming a first-class discipline. Entitlements are where that discipline starts. This post covers what they are, what they’re not, how to build them, and why OpenAI just published a technical blueprint that proves the point.

Why entitlements are the system that matters now

The SaaSpocalypse wasn’t an isolated panic. It was a symptom of something structural.

The smartest AI companies aren’t starting with a pricing page and three static tiers. They’re starting with services. Embedding with customers. Adapting their pricing model and contract structure continuously as they learn what value actually looks like. VCs aren’t just tolerating this anymore. They’re funding it aggressively. General Catalyst has committed $1.5 billion to acquiring service businesses and transforming them with AI. The bet is that the most defensible companies will price against outcomes and adapt contracts on the fly.

We’re watching the unit of value shift in real time. From seats to usage. From usage to credits. From credits to outcomes. Intercom ditched per-seat pricing and started charging $0.99 per AI-resolved conversation. Cursor moved from request limits to a compute credit pool. Replit went usage-based and climbed from single-digit gross margins to 20-30%. Every company that touches AI is rethinking what customers actually pay for.

Static subscriptions can’t express any of this. A subscription says “you’re on the Pro plan.” It doesn’t say “you have 50,000 credits remaining, your team burned through 12,000 yesterday, and your contract allows overage at 1.5x the base rate.” That’s an entitlement. And it’s the only part of your stack that can model the full spectrum of what a customer is allowed to do.
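To make the contrast concrete, here is a minimal sketch of what that entitlement might look like as data, and how an overage charge could be derived from it. The field names and the per-credit price are my own illustrative assumptions, not any vendor's schema. Only the 1.5x multiplier and the 50,000 credits come from the example above.

```javascript
// Hypothetical credit entitlement. Field names and the per-credit price are
// assumptions for illustration; only the 1.5x multiplier comes from the text.
const creditEntitlement = {
  featureId: 'ai-credits',
  remaining: 50000,        // credits left in the current period
  overageAllowed: true,    // contract term: may usage exceed the pool?
  overageMultiplier: 1.5,  // overage is billed at 1.5x the base rate
  baseCentsPerCredit: 2,   // assumed base price, in cents
};

// Work out how a request for `requested` credits would be covered.
function priceCreditRequest(entitlement, requested) {
  const fromPool = Math.min(requested, entitlement.remaining);
  const overage = requested - fromPool;
  if (overage > 0 && !entitlement.overageAllowed) {
    return { allowed: false, fromPool: 0, overage: 0, overageCents: 0 };
  }
  const overageCents =
    overage * entitlement.baseCentsPerCredit * entitlement.overageMultiplier;
  return { allowed: true, fromPool, overage, overageCents };
}
```

A plan identifier like 'pro' can't answer "how much of this request is covered, and what does the rest cost?" A record shaped like this can.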

Not your billing system. Not your feature flags. Not your plan identifiers. Entitlements are the contract-aware access layer between your product and your revenue. The system that most companies still haven’t built properly. And the one that matters more now than it ever has.

What is an entitlement, actually?

An entitlement is a rule that defines what a customer is allowed to do in your product based on their plan, contract, or usage.

It answers questions like:

Can this customer access the AI Chatbot feature?
How many team members are they allowed to add?
Is their data retention capped at 3 days or 30?
Have they used 80% of their monthly API calls?

Think of it as a contract-aware access layer. Not just “is this feature on or off” but “what did this customer pay for, and does their current usage still fall within those terms?”

Entitlements sit between your billing system and your application. Billing handles money. Your product handles features. Entitlements are the bridge that connects the two.

Here’s a simple data structure that covers the three main types:

const entitlements = {
  // Boolean: feature is on or off
  'advanced-dashboards': {},
  // Configuration: feature behaves differently per plan
  'data-retention-days': {
    value: 14,
  },
  // Metered: feature has a usage limit
  'file-storage-mb': {
    usage: 128,
    limit: 1024,
  },
};

Simple enough. But the devil is in how you check and enforce these at runtime.

The three types of entitlements

1. Boolean entitlements (feature gates)

The simplest kind. Does the customer have this feature or not?

function canAccessFeature(entitlements, featureId) {
  // The presence of the key means the feature is included
  return !!entitlements[featureId];
}

if (!canAccessFeature(entitlements, 'advanced-dashboards')) {
  // show upgrade prompt
}

Example: “Does this account have access to Audit Logs?” Yes or no. No in-between.

Use these for features that exist in a plan or don’t. They’re the easiest to implement and the hardest to mess up.

2. Configuration entitlements

Same feature, different behavior depending on the plan.

function getFeatureValue(entitlements, featureId, defaultValue) {
  // Fall back to the default when the entitlement or its value is missing
  return entitlements[featureId]?.value ?? defaultValue;
}

const retentionDays = getFeatureValue(entitlements, 'data-retention-days', 0);

if (retentionDays > 0) {
  // set TTL on incoming data
}

Example: Free plan gets 3 days of data retention. Pro gets 30 days. Enterprise gets 90. The feature doesn’t turn on or off. It behaves differently.

These come up constantly in products where tiers offer “more” of something rather than enabling or disabling it entirely.

3. Metered entitlements (usage limits)

This is where things get interesting. And where most homegrown systems start to break.

function checkUsageLimit(entitlements, featureId, newUsage = 0) {
  const entitlement = entitlements[featureId];
  // A missing entitlement means no access; always return a strict boolean
  return Boolean(entitlement) && entitlement.usage + newUsage <= entitlement.limit;
}

const fileSizeMb = 256;

if (checkUsageLimit(entitlements, 'file-storage-mb', fileSizeMb)) {
  // allow the upload
}

Metered entitlements come in two flavors:

Hard limits block the action completely. You have 5 seats. You can’t invite a 6th. Period. This is your best upsell moment, by the way.

Soft limits warn or throttle but don’t block. You’ve hit 1,000 API calls this month. We’ll send you an email and slow your rate down, but we won’t cut you off.

The tricky part isn’t the check. It’s the metering. You need to track usage in real time with enough accuracy to enforce limits, and enough speed to not add latency to every request. For hard limits, stale data means lost revenue. For soft limits, you have a bit more room to work with eventually consistent numbers.
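Here is one way the hard/soft distinction could be sketched: the check returns a decision instead of a bare boolean, so the caller knows whether to block, throttle, or warn. The `enforcement` field, the decision names, and the 80% warning threshold are assumptions I'm making for illustration.

```javascript
// Sketch: one check that returns a decision rather than true/false.
// The `enforcement` field and the 80% warning threshold are assumptions.
function evaluateMeteredEntitlement(entitlement, requested = 1) {
  if (!entitlement) return { decision: 'deny', reason: 'not-entitled' };
  const projected = entitlement.usage + requested;
  if (projected <= entitlement.limit) {
    // Integer math avoids floating-point surprises at the threshold.
    const nearLimit = projected * 100 >= entitlement.limit * 80;
    return { decision: 'allow', reason: nearLimit ? 'near-limit' : 'within-limit' };
  }
  // Over the limit: hard limits block, soft limits let the request through
  // but flag it so the caller can warn or throttle.
  return entitlement.enforcement === 'hard'
    ? { decision: 'deny', reason: 'limit-reached' }
    : { decision: 'throttle', reason: 'soft-limit-exceeded' };
}

const seats = { usage: 5, limit: 5, enforcement: 'hard' };
const apiCalls = { usage: 1000, limit: 1000, enforcement: 'soft' };

console.log(evaluateMeteredEntitlement(seats).decision);    // deny
console.log(evaluateMeteredEntitlement(apiCalls).decision); // throttle
```

The check itself stays trivial. What this sketch leaves out is the hard part described above: keeping `usage` fresh enough for the decision to be trustworthy.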

What entitlements are NOT

This is where I see the most confusion. Teams reach for familiar tools and try to make them do entitlement work. It kind of works. Until it really doesn’t.

Not plan identifiers

This is the most common mistake, especially at early-stage companies.

// This looks simple. It’s a trap.
if (customer.plan === 'enterprise') {
  // grant access
}

The moment you ship your second plan version, you’re in trouble. Now you have enterprise-v1 and enterprise-v2 customers who get different features but both show up as “enterprise.” Add one custom deal and you’re checking plan === 'enterprise-custom-acme-2024' somewhere deep in your codebase.

I’ve seen companies with 47 plan variants scattered across their code. Nobody knows which ones are still active. Nobody wants to touch them. One company I work with at Stigg had over 100 entitlements spread across roughly 10 different systems when they started consolidating. Every new capability had become a separately metered, separately enforced limit. That’s entitlement sprawl. And it’s far more common than people realize.

The plan is how you sell. The entitlement is how your product behaves. Your code should never confuse the two.

The fix: treat plans as bundles of entitlements. Your code should never know or care what plan a customer is on. It should only ask “does this customer have access to this feature?”
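A minimal sketch of that idea, assuming a hypothetical in-memory catalog (the plan names and the `planCatalog` shape are mine; the retention values reuse the Free and Pro numbers from earlier):

```javascript
// Hypothetical plan catalog: each plan is just a named bundle of entitlements.
const planCatalog = {
  free: {
    'data-retention-days': { value: 3 },
  },
  pro: {
    'advanced-dashboards': {},
    'data-retention-days': { value: 30 },
  },
};

// The only place in the codebase that knows about plan identifiers.
function resolveEntitlements(customer) {
  return planCatalog[customer.planId] ?? {};
}

// Application code asks about features; it never inspects `planId`.
function hasFeature(customer, featureId) {
  return featureId in resolveEntitlements(customer);
}

console.log(hasFeature({ planId: 'pro' }, 'advanced-dashboards'));  // true
console.log(hasFeature({ planId: 'free' }, 'advanced-dashboards')); // false
```

Moving a feature from Pro to Free then becomes a catalog edit. No call site changes.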

Not authorization (RBAC/PBAC/ABAC)

Authorization answers: “Is this user allowed to perform this action in this system?”

Entitlements answer: “Is this customer allowed to use this capability based on what they paid for?”

Authorization is about security. Entitlements are about monetization. They operate at different levels. Authorization is per-user. Entitlements are typically per-account or per-organization.

You need both. Forcing one to do the other’s job always creates a mess. I’ve seen teams model entitlements as RBAC roles. It works until you need usage limits, plan versioning, or grandfathering. Then it falls apart fast.

Not feature flags

Feature flags are for rollout and experimentation. “Show this new button to 10% of users.” They’re meant to be temporary. You ship the feature, you clean up the flag.

Entitlements are permanent (or at least long-lived). They’re tied to contracts and money. You can’t just flip an entitlement off without breaking a customer agreement.

Some teams use LaunchDarkly or similar tools to manage entitlements. It works for simple boolean gates. But feature flag systems don’t natively support usage limits, plan versioning, metering, or the kind of deterministic contract-based behavior that entitlements require.

If your feature flagging system has become your de facto entitlement system, you’re accumulating debt that will be expensive to repay.

How these three systems work together

In practice, a request in your application might need to pass through all three checks. Here’s the order:

// 1. Feature flag: is this feature enabled for this user?
if (!isFeatureFlagEnabled(user, 'members-page-v2')) {
  return <Redirect to="/legacy-members-page" />;
}

// 2. Entitlement: does this organization’s plan include this?
if (!isEntitledTo(user.orgId, 'feature-teams')) {
  return <Redirect to="/pricing" />;
}

// 3. Authorization: does this user have permission?
if (!hasPermission(user, 'members:view')) {
  return <Redirect to="/request-admin-access" />;
}

Feature flags control delivery. Entitlements control monetization. Authorization controls security. Each is owned by a different team, serves a different purpose, and is evaluated in sequence.

The practical tip: build a centralized access check layer (some teams put this in an API gateway) that evaluates all three in one pass and attaches the results to the request context. Downstream services shouldn’t have to fetch this information again.
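A rough sketch of what that layer could look like. The three checker functions are injected and hypothetical, and so are the `route` shape and the `req.access` field. In a real system they would wrap your flag, entitlement, and authorization services.

```javascript
// Sketch of a single access-check pass. The three checker functions are
// injected and hypothetical stand-ins for real services.
function buildAccessContext(user, route, checks) {
  const access = {
    flagEnabled: checks.isFeatureFlagEnabled(user, route.flag),
    entitled: checks.isEntitledTo(user.orgId, route.feature),
    permitted: checks.hasPermission(user, route.permission),
  };
  // The first failing check, in evaluation order, explains the denial.
  let deniedBy = null;
  if (!access.flagEnabled) deniedBy = 'flag';
  else if (!access.entitled) deniedBy = 'entitlement';
  else if (!access.permitted) deniedBy = 'authorization';
  return { ...access, allowed: deniedBy === null, deniedBy };
}

// Gateway-style usage: compute once, attach to the request, pass downstream.
function attachAccess(req, route, checks) {
  req.access = buildAccessContext(req.user, route, checks);
  return req;
}
```

Downstream services read `req.access.deniedBy` to decide between the legacy page, the pricing page, and the request-access page, without calling any of the three systems again.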

OpenAI just showed you the blueprint

A few weeks ago, OpenAI published a technical post called “Beyond Rate Limits” describing how they scaled access to Codex and Sora. If you haven’t read it, you should. It’s one of the most detailed public descriptions of entitlement architecture from any company at this scale.

The core insight is what they call a “decision waterfall.” Instead of asking “is this request allowed?”, their system asks “how much is allowed, and from where?” Every request passes through a single evaluation path that checks rate limits, free tier allocations, credit balances, promotions, and enterprise entitlements in sequence. One path. One answer.

Figure: Access system combining real-time rate limits and asynchronous credit and balance tracking. (Source: OpenAI Blog)

This is exactly the architecture pattern I’ve been advocating for years. Rate limits, free tiers, credits, and enterprise agreements are not separate systems. They’re layers in the same entitlement stack. When the system is unified, the answer is deterministic and explainable. When it’s fragmented, different systems disagree about whether usage is valid. And that’s when trust erodes.
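To illustrate the "how much is allowed, and from where?" framing, here is a toy waterfall. This is not OpenAI's implementation. The layer names, their order, and the balances are assumptions, and rate limiting is left out entirely to keep the sketch short.

```javascript
// Each layer is asked how much of the request it can cover, in order.
// Layer names, ordering, and balances are illustrative assumptions.
function runWaterfall(layers, requested) {
  let remaining = requested;
  const sources = [];
  for (const layer of layers) {
    if (remaining === 0) break;
    const granted = Math.min(remaining, layer.available);
    if (granted > 0) {
      sources.push({ source: layer.name, amount: granted });
      remaining -= granted;
    }
  }
  // One path, one answer: the total granted plus where it came from.
  return { allowed: remaining === 0, granted: requested - remaining, sources };
}

const layers = [
  { name: 'free-tier', available: 10 },
  { name: 'promo-credits', available: 0 },
  { name: 'paid-credits', available: 100 },
];

console.log(runWaterfall(layers, 25));
// { allowed: true, granted: 25, sources: [
//   { source: 'free-tier', amount: 10 }, { source: 'paid-credits', amount: 15 } ] }
```

The function only evaluates. It does not debit anything. The point is the shape of the answer: every grant is traceable to a layer, which is what makes the result explainable.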

OpenAI made another choice worth noting. They built a distributed usage and balance system designed specifically for synchronous access decisions. They could have bolted credit handling onto an existing billing platform. Instead, they embedded it directly into the entitlement evaluation path. Credits are consumed in real time at the point of access, with billing reconciliation happening asynchronously. They explicitly chose to prioritize provable correctness, even tolerating slight balance update delays to preserve auditability and user trust.
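The split between a synchronous access decision and asynchronous reconciliation can be sketched in a few lines. Again, this is my own simplification and not OpenAI's design: the `Map` and the array stand in for durable, distributed storage, and the names are hypothetical.

```javascript
// In-memory stand-ins for a balance store and an append-only usage log.
// A real system would use durable, distributed storage; this is a sketch.
const balances = new Map([['org_1', 100]]);
const usageLog = [];

// Synchronous path: decide and debit at the point of access.
function consumeCredits(orgId, amount, requestId) {
  const balance = balances.get(orgId) ?? 0;
  if (amount > balance) return { allowed: false, balance };
  balances.set(orgId, balance - amount);
  usageLog.push({ orgId, amount, requestId }); // audit trail for billing
  return { allowed: true, balance: balance - amount };
}

// Asynchronous path: billing later derives totals from the log,
// never from the live balance.
function reconcile(orgId) {
  return usageLog
    .filter((event) => event.orgId === orgId)
    .reduce((total, event) => total + event.amount, 0);
}
```

Because every debit leaves a log entry, the billed total can always be re-derived and checked against the balance, even if the two are briefly out of step.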

As our CEO, Dor Sasson, noted when analyzing the OpenAI post, this architecture is significant because it articulates why entitlement-first design is not just elegant but necessary for AI products. Per-request evaluation. Deterministic enforcement. Real-time credit handling. Full auditability. These are not optimizations. They are requirements.

Not every company can build this from scratch. OpenAI has the resources to invest in a custom distributed system. But the pattern itself is universal. If you’re building AI products, your entitlement architecture should look like this. A single decision path. Credits inside the evaluation, not bolted on after. Usage events that drive an auditable chain from consumption to billing.

Lessons from the trenches

These come from years of working with teams who’ve learned them the hard way. Almost every team builds their own entitlement logic in the early days. Two plans, three features, maybe a free trial. Simple if-else logic. Ship it and move on. Here’s what goes wrong, and what to do about it.

Centralize your entitlement logic. This is the lesson OpenAI’s architecture drives home. Build a shared module or service that all your entitlement checks go through. Don’t let individual microservices implement their own version of canAccessFeature(). One decision path. One answer.

I see the opposite constantly. Each microservice does its own limit checking. Some check the database. Some check a cache. Some check nothing. Enforcement is inconsistent and nobody has a full picture.

import { useMemo } from 'react';

// A React hook that wraps all entitlement checks
export const useEntitlements = (entitlements) => {
  return useMemo(() => ({
    canAccessAdvancedDashboards() {
      return !!entitlements['advanced-dashboards'];
    },
    getDataRetentionDays() {
      // A missing entitlement falls back to 0 days
      return entitlements['data-retention-days']?.value ?? 0;
    },
    canInviteMembers(count) {
      const e = entitlements['seats'];
      return e ? e.usage + count <= e.limit : false;
    },
  }), [entitlements]);
};

This gives you a single place to change entitlement logic, a consistent API across your codebase, and business-readable function names that make code reviews easier.

Don’t nest entitlement checks. If you need to check multiple entitlements together, wrap them in a single function with a clear name. The next person reading your code should understand the business intent, not the implementation plumbing.

// Bad: nested checks create invisible coupling
if (canAccess('dashboards')) {
  if (canAccess('revenue-management')) {
    // grant access
  }
}

// Better: single purpose-built check
if (canAccessRevenueDashboard(tenant.id)) {
  // grant access
}

Never assume an entitlement exists. Entitlements move between plans. Plans get deprecated. Migrations happen mid-flight. Your code should always handle the case where an entitlement is simply absent. I’ve seen production outages caused by a deploy that added a new entitlement check before the entitlement data was migrated to all customers. The check returned undefined. The feature broke for everyone.
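One defensive pattern, sketched under my own assumptions about the result shape: normalize a possibly missing entitlement into an explicit record, so no caller ever branches on `undefined`.

```javascript
// Sketch: turn a possibly missing entitlement into an explicit result.
// The `{ exists, usage, limit }` shape is an assumption for illustration.
function getMeteredEntitlement(entitlements, featureId) {
  const e = entitlements?.[featureId];
  if (!e || typeof e.limit !== 'number') {
    return { exists: false, usage: 0, limit: 0 };
  }
  return { exists: true, usage: e.usage ?? 0, limit: e.limit };
}

// Callers get a strict boolean even when the data was never migrated.
function canConsume(entitlements, featureId, amount) {
  const e = getMeteredEntitlement(entitlements, featureId);
  return e.exists && e.usage + amount <= e.limit;
}

console.log(canConsume({}, 'seats', 1));                            // false
console.log(canConsume({ seats: { usage: 1, limit: 5 } }, 'seats', 2)); // true
```

The `exists` flag also gives you something to log and alert on, which is how you find the unmigrated customers before they find you.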

Fail open, not closed. If your entitlement service is unreachable, what happens? If the answer is “every customer loses access to everything,” you have a problem. The safer default in most cases is to grant access when the system is uncertain. A customer getting one free hour of a feature they haven’t paid for is better than a paying customer getting locked out of their own product. Some features have direct cost implications where fail-open is dangerous. But as a general principle, err on the side of not breaking the customer experience.
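A sketch of fail-open with an explicit exception list for cost-sensitive features. The `fetchEntitlement` function is an injected, hypothetical lookup, and the feature name in the fail-closed set is made up.

```javascript
// Features where failing open is dangerous because each use has a direct cost.
// This list is a hypothetical example.
const FAIL_CLOSED_FEATURES = new Set(['gpu-inference']);

// `fetchEntitlement` is an injected, hypothetical async lookup that may throw
// when the entitlement service is unreachable.
async function isEntitledWithFallback(fetchEntitlement, orgId, featureId) {
  try {
    const entitlement = await fetchEntitlement(orgId, featureId);
    return { allowed: Boolean(entitlement), degraded: false };
  } catch (err) {
    // The system is uncertain: grant access unless the feature is cost-sensitive.
    return { allowed: !FAIL_CLOSED_FEATURES.has(featureId), degraded: true };
  }
}
```

The `degraded` flag matters as much as the decision. It lets you count how often you granted access blind, and reconcile afterwards.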

Decouple entitlements from pricing changes. This is where I see the most pain. Every pricing experiment, every enterprise deal, every grandfathered plan creates a new variant. You want to change what Pro includes, but 500 existing Pro customers have the old version. Sales promises “unlimited seats plus 50,000 API calls but with Enterprise data retention.” If your entitlement logic is hardcoded, every one of these becomes a code change. Every code change becomes a deploy. A simple packaging update that should take a product manager ten minutes takes an engineering sprint. One team I work with was spending over 500 engineering hours per year on this. That’s what I call the velocity tax.
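Here is roughly what "packaging as data" could look like: versioned plans plus per-contract overrides. The catalog shape, the version keys, and the seat numbers are assumptions. The custom deal mirrors the sales promise above, using the 90-day Enterprise retention from earlier.

```javascript
// Hypothetical versioned catalog: publishing v2 does not touch v1 subscribers.
const planVersions = {
  'pro@1': { seats: { limit: 5 }, 'data-retention-days': { value: 14 } },
  'pro@2': { seats: { limit: 10 }, 'data-retention-days': { value: 30 } },
};

// Each subscription pins a plan version and may carry contract overrides.
function entitlementsFor(subscription) {
  const key = `${subscription.plan}@${subscription.version}`;
  const base = planVersions[key] ?? {};
  return { ...base, ...(subscription.overrides ?? {}) };
}

// Grandfathered customer: still on the old packaging, no code change needed.
const grandfathered = { plan: 'pro', version: 1 };

// Custom deal: expressed as data on the subscription, not as a new branch.
const customDeal = {
  plan: 'pro',
  version: 2,
  overrides: {
    seats: { limit: Infinity },
    'api-calls': { usage: 0, limit: 50000 },
    'data-retention-days': { value: 90 },
  },
};
```

Changing what Pro includes means publishing 'pro@3'. Existing customers stay pinned until someone migrates them on purpose.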

Billing systems process what happened. Entitlements decide what’s allowed to happen. That’s a fundamentally different problem.

Clean up dead entitlements. Just like feature flags, entitlements accumulate. If a feature is no longer gated behind a plan, remove the check. If a plan version is no longer active, archive it. I’ve seen codebases where half the entitlement checks reference plans that were discontinued two years ago. Nobody removes them because nobody is sure if they’re still needed. This is how you end up with code that nobody understands and everyone is afraid to touch.

Why this is only getting harder

Let me bring this back to where we started.

The SaaSpocalypse scared a lot of people. But the real story isn’t that subscriptions are dying. It’s that the unit of value is shifting underneath every software company at the same time, and most of their infrastructure can’t keep up.

We’re moving from subscriptions to usage. From usage to credits. From credits to outcomes. VCs are betting on service-based companies priced on work completed, not seats occupied. Agentic workflows trigger 2, 10, or 50 underlying operations from a single user action. Your entitlement system needs to support all of these models. Most don’t.

Your customer might be an algorithm. When an AI agent consumes your product on behalf of a human user, traditional entitlement models break. The agent doesn’t have a seat. It doesn’t have a role. But someone needs to control what it’s consuming and who pays for it. OpenAI built an entire decision waterfall to solve this. Most companies are still checking plan === 'pro'.
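One possible way to model this, sketched with made-up field names: every usage event names the agent that acted and the account it acted for, and the limit is enforced against the account.

```javascript
// Sketch: usage is attributed to the account that delegated to the agent.
// The `actor` / `onBehalfOf` event shape is an assumption for illustration.
function recordAgentUsage(ledger, event) {
  const account = event.onBehalfOf.orgId; // who pays
  const entry = ledger[account] ?? { usage: 0, limit: 0, byActor: {} };
  const projected = entry.usage + event.operations;
  if (projected > entry.limit) return { allowed: false, account };
  entry.usage = projected;
  // Keep a per-actor breakdown so the account can see what each agent consumed.
  entry.byActor[event.actor] = (entry.byActor[event.actor] ?? 0) + event.operations;
  ledger[account] = entry;
  return { allowed: true, account };
}

const ledger = { org_1: { usage: 0, limit: 60, byActor: {} } };

// One user action fans out into 50 underlying operations.
const fanOut = { actor: 'agent:research-bot', onBehalfOf: { orgId: 'org_1' }, operations: 50 };
```

The agent has no seat and no role here. It is just a label on consumption that the account is entitled to and pays for.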

Enterprise buyers demand governance. They don’t just want to know how much they spent. They want to allocate budgets by team, enforce limits by department, and get alerts before limits are hit. Every AI-native company is heading toward the same place: they need to build something like AWS Cost Explorer for their own customers. That’s an entitlement problem at its core.
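A small sketch of team-level budgets with an early-warning threshold. The team names, amounts, and the 80% alert level are assumptions.

```javascript
// Sketch: an org-level allowance split into team budgets with alert thresholds.
// Team names, amounts, and the 80% threshold are assumptions.
const teamBudgets = {
  research: { budget: 30000, used: 25000, alertAtPercent: 80 },
  support: { budget: 20000, used: 4000, alertAtPercent: 80 },
};

function checkTeamBudget(budgets, team, requested) {
  const b = budgets[team];
  if (!b) return { allowed: false, alert: false };
  const projected = b.used + requested;
  return {
    allowed: projected <= b.budget,
    // Integer comparison: warn once projected usage crosses the threshold.
    alert: projected * 100 >= b.budget * b.alertAtPercent,
  };
}

console.log(checkTeamBudget(teamBudgets, 'research', 1000)); // { allowed: true, alert: true }
console.log(checkTeamBudget(teamBudgets, 'support', 1000));  // { allowed: true, alert: false }
```

Same check as any metered entitlement, just scoped one level below the account. That is why I call governance an entitlement problem.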

The companies that figure this out will be the ones that can change pricing without changing code. The ones that can model a credit-based trial, a usage-based contract, and an outcome-based enterprise deal in the same system. The ones whose financial engineers own this layer as a first-class discipline.

Entitlements are the foundation. Everything else builds on top.

What’s next

Next week I’m going to break down credit systems for AI products. OpenAI’s decision waterfall treats credits as a first-class layer in the entitlement stack. Most companies treat them as an afterthought. If you’ve ever tried to build a credit system, you know it sounds simple and turns out to be an entire economy. Ledgers, wallets, expiration policies, concurrent deductions, revenue recognition. It’s one of the hardest infrastructure problems in modern SaaS and almost nobody talks about it openly.

If this post was useful, subscribe so you don’t miss it.

