Four years ago, we started building Stigg because we believed the way software companies manage pricing and entitlements was fundamentally broken. Engineers were burning months building homegrown billing logic instead of shipping product. Every pricing change was a deployment. Every enterprise deal was a custom integration.
We were right about the problem. But the world changed faster than anyone expected - and the change revealed something we didn't fully appreciate at the start: entitlements aren't just a billing convenience. They are becoming the most critical infrastructure abstraction in the AI economy.
Today, we're announcing Stigg 2.0 - the most significant release in our company's history. Before I walk through what we built, I want to explain why we built it. Because the "why" is the whole story.
The Smartest Companies in AI Are Building In-House. That Should Terrify You.
There's a trend happening right now that almost nobody is talking about publicly. The most technically sophisticated companies in the world - OpenAI, Anthropic, the frontier labs, the companies defining what AI products look like - are building their own billing and access control infrastructure from scratch.
The Head of Financial Engineering at one of the largest frontier AI labs told us directly: "What we really needed was something that was close to real time, if not real time, that could tell us - do you have credits or not?" She said they evaluated every third-party metering and billing platform on the market. None of them could make synchronous access decisions. Most were built for a different era - aggregate usage over the month, send an invoice at the end. That model doesn't work when a single API call can cost dollars, when agents spawn sub-agents in milliseconds, and when a fraudulent user can burn through thousands of credits before your batch job runs.
In February 2026, OpenAI published "Beyond Rate Limits" - the most architecturally revealing piece of writing any AI company has released about its billing internals. The key concept is a "decision waterfall": instead of asking "is this request allowed?", their system asks "how much is allowed, and from where?" Every request passes through a single evaluation path that synchronously checks rate limits, verifies credits, and returns one definitive outcome. Credit debits settle asynchronously. Rate limits, free tiers, credits, promotions, and enterprise entitlements are all layers in the same decision stack.
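To make the "how much is allowed, and from where?" question concrete, here is a minimal Python sketch of a decision waterfall. It is an illustration under stated assumptions, not OpenAI's or Stigg's implementation: the `Layer`, `Decision`, and `decide` names, the layer ordering, and the numbers are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    """One funding source in the waterfall (hypothetical shape)."""
    name: str
    available: int  # units this layer can still cover


@dataclass
class Decision:
    allowed: int                                  # how much of the request may proceed
    sources: list = field(default_factory=list)   # (layer name, units) pairs


def decide(requested: int, rate_limit_remaining: int, layers: list) -> Decision:
    """Answer "how much is allowed, and from where?" in a single pass."""
    # The rate limit caps the total before any funding source is consulted.
    remaining = min(requested, rate_limit_remaining)
    decision = Decision(allowed=0)
    for layer in layers:  # layers are listed in the order they should be drawn from
        if remaining == 0:
            break
        take = min(remaining, layer.available)
        if take > 0:
            decision.sources.append((layer.name, take))
            decision.allowed += take
            remaining -= take
    # Nothing is debited here: the decision is synchronous, and the actual
    # debits are assumed to settle asynchronously afterwards.
    return decision


layers = [Layer("free_tier", 20), Layer("promotion", 0), Layer("credits", 500)]
result = decide(requested=100, rate_limit_remaining=60, layers=layers)
print(result.allowed, result.sources)
```

In this sketch a request for 100 units is capped at 60 by the rate limit, and those 60 units are drawn first from the free tier and then from credits. The caller gets one answer that includes both the amount and the sources.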
We read that post and recognized our own architecture.
But here's what should concern every engineering team reading this: OpenAI can afford to build this. They have hundreds of engineers and the revenue to justify a dedicated team. Most companies don't. And the "we can build a credit counter in a weekend" pitch is one of the most dangerous lies in software. Yes, you can build a counter in a weekend. Two months later, you'll have a system that handles 30% of the edge cases - and the other 70% will show up as billing disputes, revenue leakage, angry enterprise customers, and 3am pages.
The build-in-house trend isn't happening because companies want to build. It's happening because existing billing platforms failed them. Incumbent solutions were designed for a world where subscriptions renew monthly, usage gets aggregated into line items, and the invoice is the moment of truth. In AI, the moment of truth is the API call. The request. The inference. The agent action. If you can't decide in real time whether that request should proceed, you've already lost - either to fraud, to budget overruns, or to a customer experience that breaks trust.
Stigg 2.0 exists to make the build-in-house path unnecessary. The architecture OpenAI built internally - as a product. For every AI company.
Why Entitlements Are the Abstraction That Changes Everything
If "billing" is the system of record for what was sold, entitlements are the system of record for what was fulfilled. That distinction is about to become the most important one in enterprise software. Here's why.
AI companies ship features faster than commerce can keep up. In 2025 alone, we tracked 1,800 pricing changes across 500 companies. OpenAI, Anthropic, Cursor, Figma, Clay, Monday.com, HubSpot, Notion, Slack - one by one, they moved to credits, tokens, and outcome-based pricing. The seat is dying. But the speed at which AI products evolve means pricing can't be a deployment anymore. When your team ships a new model, a new agent capability, or a new feature every week, the entitlements layer - what each customer is allowed to do with your product - has to move just as fast. Hardcoding that logic into your application is a choice to make every product launch a billing project. An entitlements layer that's programmable, externalized, and instant turns pricing changes from multi-week engineering projects into configuration changes that take minutes.
Agentic usage demands milliseconds-latency enforcement. When a human clicks a button, you have hundreds of milliseconds to check their permissions. When an AI agent spawns a sub-agent that makes 50 API calls in parallel, each consuming tokens from a shared credit pool, you have milliseconds - and the answer has to be correct. Not eventually consistent. Not "we'll reconcile later." Correct, right now, before the expensive thing happens. Because in the age of agents, "we'll reconcile it later" is just another way of saying "we'll eat the cost or surprise the customer." Neither is a strategy. This is why we obsess over millisecond enforcement in the request path. It's not a performance flex. It's the only architecture that works when agents are the users.
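A small Python sketch shows why the check and the debit have to be a single atomic step when many agent calls hit a shared credit pool at once. This is an in-memory illustration only: `CreditPool` and `try_debit` are hypothetical names, and a real system would need the same guarantee from its datastore rather than from a process-local lock.

```python
import threading


class CreditPool:
    """Shared in-memory pool; a lock makes check-and-debit one atomic step."""

    def __init__(self, balance: int) -> None:
        self._balance = balance
        self._lock = threading.Lock()

    def try_debit(self, cost: int) -> bool:
        # Checking and debiting under the same lock means no two callers
        # can both pass the check against the same remaining credits.
        with self._lock:
            if self._balance < cost:
                return False
            self._balance -= cost
            return True

    @property
    def balance(self) -> int:
        with self._lock:
            return self._balance


pool = CreditPool(balance=100)
outcomes = []
outcomes_lock = threading.Lock()


def agent_call() -> None:
    ok = pool.try_debit(3)  # each call costs 3 credits
    with outcomes_lock:
        outcomes.append(ok)


# 50 parallel calls against a pool that can only pay for 33 of them.
threads = [threading.Thread(target=agent_call) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()

approved = sum(outcomes)
print(approved, pool.balance)
```

However the threads interleave, exactly 33 calls are approved, 17 are denied, and the balance never goes below zero. Without the lock, two calls could both read the same balance and both proceed.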
Selling to the enterprise demands governance and controls. We hear it in every conversation. A large enterprise told us they had to set entitlement limits higher than what customers actually purchased - just to avoid breaking the customer relationship - because they had no system to enforce the real limits dynamically. Cursor had a user at an enterprise customer who burned down the entire organization's AI credit allocation in a single day. Enterprise buyers aren't asking "do you have a nice dashboard?" They're asking: "Can my teams set their own budgets? Can I attribute AI costs to departments? Can I cap a runaway agent before it drains $50,000 in inference?" Every one of those questions is an entitlements question, not a billing question. And increasingly, these aren't nice-to-haves - they're deal requirements. Enterprises won't sign six-figure AI contracts without governance controls. As one of our customers' product leads put it: "More and more enterprises today are looking for the ability to self-govern how what they purchased is actually being used."
SOX compliance demands auditability on what was fulfilled, not just what was sold. This is the one nobody is talking about yet, but it's coming fast. SOX requires that companies demonstrate controls over financial reporting - and when your revenue depends on usage-based consumption, the gap between "what the contract says" and "what the customer actually received" becomes an audit surface. When another team evaluated us, the question wasn't about invoices - it was about entitlement change tracking: "From a revrec perspective, we need to know when customers' feature sets were changed to recognize revenue properly." Traditional billing systems track what was billed. Entitlements systems track what was delivered. As AI companies mature, go public, and face audit scrutiny, the entitlements layer becomes the provenance chain that connects the contract to the product experience to the revenue line. If you can't prove what was fulfilled, you can't recognize the revenue. Revenue-impacting entitlements need the same level of auditability that SOC 1 Type 2 demands - which is exactly why Stigg holds that certification.
These four forces - the failure of existing billing platforms, the speed of AI product evolution, the real-time demands of agentic architectures, and the governance and compliance requirements of enterprise buyers - are converging right now. They all point to the same conclusion: the entitlements layer is no longer a feature of your billing system. It's a separate, foundational piece of infrastructure. And it needs to be built for the world we're entering, not the world we're leaving behind.
What Stigg 2.0 Is
Stigg 2.0 is the usage runtime for AI products. Not a billing platform. Not a metering tool. The real-time enforcement and governance layer that sits between your application and your billing stack, deciding what every customer, user, team, and agent is allowed to do - at the moment they try to do it.
One API call. One real-time decision. One definitive answer. In milliseconds. Every time.
We enforce. Others just report.
Incumbents tell you what happened after the billing cycle closes. Stigg decides what's allowed to happen before the request completes. That's the difference between a dashboard and a runtime.
Here's everything we're shipping today.
1. Credits Engine - Rebuilt from the Ground Up
Every AI product runs on credits now. But credits are deceptively hard. Wallets, ledgers, burn-downs, roll-overs, top-ups, expirations, priority-based consumption, multi-currency pools, promotional grants, recurring allocations - the edge cases multiply fast.
We rebuilt our entire credits infrastructure on top of a financial transactions database designed for exactly this kind of work. The result:
Real-time balances. Not eventually consistent. Not updated on a cron job. When your user spends a credit, their balance updates before the API response returns.
Zero overdraft. Hard and soft enforcement at the wallet level. If a user has 12 credits left and the action costs 15, the request is denied - not billed retroactively.
Priority-based consumption. Promotional credits burn first. Paid credits burn last. Expiring credits burn before non-expiring ones. You define the rules. We execute them atomically.
Wallet-per-resource isolation. Separate credit pools per subscription, per product, per team. No cross-resource bleed. One customer's AI assistant doesn't drain another product's budget.
ASC 606-compliant ledger. Every credit transaction is recorded with full provenance. Filter by time, event type, actor, or dimension. Your finance team gets the audit trail they need without building it.
Standalone grants. Give credits to a customer without requiring a subscription. Promotional credits, partner allocations, manual top-ups - all through the API, all with the same ledger integrity.
Reserve & settle. Solves credit monetization for long-running agents, where the consumption rate is unknown in advance.
Credits visibility. Alerting and usage breakdowns at the lowest granularity, by any dimension.
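The priority and zero-overdraft rules above can be sketched in a few lines of Python. This is an illustration, not Stigg's engine: the `Grant` shape, the `burn_order` and `spend` names, and the tie-break (credit kind first, then expiry) are assumptions, and the sketch is single-threaded, so it does not show how atomicity would be achieved in a real ledger.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Grant:
    grant_id: str
    kind: str                 # "promotional" or "paid"
    remaining: int
    expires: Optional[date]   # None means the grant never expires


def burn_order(grant: Grant):
    # Promotional before paid; within a kind, expiring grants go first,
    # soonest expiry first. This ordering is an assumption of the sketch.
    kind_rank = 0 if grant.kind == "promotional" else 1
    return (kind_rank, grant.expires is None, grant.expires or date.max)


def spend(grants: list, cost: int) -> Optional[list]:
    """Debit `cost` across grants in burn order, or deny without changing anything."""
    if sum(g.remaining for g in grants) < cost:
        return None  # hard enforcement: no partial debit, no overdraft
    debits = []
    for grant in sorted(grants, key=burn_order):
        if cost == 0:
            break
        take = min(cost, grant.remaining)
        if take:
            grant.remaining -= take
            debits.append((grant.grant_id, take))
            cost -= take
    return debits


wallet = [
    Grant("paid-2", "paid", 10, None),
    Grant("paid-1", "paid", 5, date(2030, 1, 31)),
    Grant("promo-1", "promotional", 4, date(2030, 6, 30)),
]

first = spend(wallet, 7)     # drains the promotional grant, then the expiring paid grant
denied = spend(wallet, 15)   # only 12 credits are left, so the request is denied
print(first, denied)
```

The second call mirrors the example above: 12 credits remain, the action costs 15, and the wallet is left untouched rather than overdrawn.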
2. Governance - AI Usage Budgets and Allocations That Actually Enforce
This is the feature set we're most proud of, and the one that doesn't exist anywhere else.
Enterprise customers buying AI products today have a problem nobody talks about openly: a single power user can consume an entire organization's AI allocation in days. By the time anyone notices, the budget is gone.
Stigg Governance is the first milliseconds-latency, high-cardinality usage control layer for AI products. It lets your enterprise customers set budgets, limits, and alerts across every dimension of their organization - and enforces them at the moment of consumption, not on the invoice.
Any dimension. Users. Teams. Departments. Organizations. Sites. Regions. Applications. Workspaces. Agents. Multiple agents running in parallel. You define the hierarchy. We enforce it.
Real-time decisioning. When a user or agent makes a request, Stigg evaluates their entitlements, checks their budget, verifies their limits, and returns a decision - all in under 5 milliseconds. The same evaluation that OpenAI built into their decision waterfall, available to every AI company through a single API.
Cost attribution. Know exactly which team, product, feature, and model is consuming AI budget. Not as a monthly report - as a live, queryable signal.
Model-level controls. Cap GPT-4 usage while leaving GPT-3.5 unlimited. Throttle expensive inference without blocking lightweight operations.
Embedded governance portal. Budget dashboards, alert configuration, limit management - all self-service. Turns "we need usage controls" from a sales blocker into a selling point.
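As a rough illustration of enforcement across a hierarchy of dimensions, the Python sketch below checks a request against every budget from the requesting agent up to the organization and consumes only if all of them have room. The `BudgetTree` class, the scope names, and the limits are hypothetical, and the sketch ignores concurrency and persistence.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    limit: int
    used: int = 0


class BudgetTree:
    """Budgets arranged in a hierarchy, for example agent -> team -> organization."""

    def __init__(self) -> None:
        self.budgets = {}  # scope name -> Budget
        self.parent = {}   # scope name -> parent scope name, or None at the root

    def add(self, scope: str, limit: int, parent: str = None) -> None:
        self.budgets[scope] = Budget(limit)
        self.parent[scope] = parent

    def _chain(self, scope: str):
        # Walk from the requesting scope up to the root.
        while scope is not None:
            yield scope
            scope = self.parent[scope]

    def check_and_consume(self, scope: str, cost: int):
        """Return (allowed, blocking_scope); consume only if every level has room."""
        chain = list(self._chain(scope))
        for name in chain:
            budget = self.budgets[name]
            if budget.used + cost > budget.limit:
                return False, name  # report the first level that would be exceeded
        for name in chain:
            self.budgets[name].used += cost
        return True, None


tree = BudgetTree()
tree.add("org:acme", limit=1000)
tree.add("team:research", limit=300, parent="org:acme")
tree.add("agent:summarizer", limit=100, parent="team:research")
tree.add("agent:crawler", limit=250, parent="team:research")

a = tree.check_and_consume("agent:summarizer", 80)   # fits at every level
b = tree.check_and_consume("agent:summarizer", 30)   # would exceed the agent's own budget
c = tree.check_and_consume("agent:crawler", 230)     # agent has room, but the team does not
print(a, b, c)
```

Returning the blocking scope alongside the decision is a design choice in this sketch: it lets the caller tell a user whether they hit their own limit or their team's.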
3. Usage Metering - Built for Scale, Built for Correctness
Metering is the foundation everything else sits on. If your event counts are wrong, your credits are wrong, your invoices are wrong, and your customers lose trust.
We built a managed metering pipeline from scratch. It is designed for scale, available as cloud, self-hosted, and BYOC deployments, and capable of processing over 1 million events per second.
99.9% of events go from ingestion to enforcement in seconds - not minutes, not hours.
Exactly-once semantics where it matters. Checkpointed. Recoverable. Reconcilable.
Dimensional aggregation. Slice usage by any combination of reported dimensions - model, feature, region, team, time window. Powers the analytics your finance and product teams need.
External bucket ingestion. Already generating usage data in your data warehouse? Point us at the bucket. We'll ingest, aggregate, and enforce.
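Two of the ideas above, counting each event once and slicing usage by reported dimensions, can be sketched together in Python. Deduplicating on an event ID is one common way to get effectively-once counting when events are redelivered; it is used here as an assumption, not as a description of Stigg's pipeline. The `Meter` class and the event fields (`id`, `model`, `team`, `quantity`) are hypothetical.

```python
from collections import defaultdict


class Meter:
    """Counts each event ID once and aggregates quantity by the chosen dimensions."""

    def __init__(self, dimensions) -> None:
        self.dimensions = tuple(dimensions)
        self.seen = set()               # event IDs already counted
        self.totals = defaultdict(int)  # dimension values -> summed quantity

    def ingest(self, event: dict) -> bool:
        # A redelivered event carries the same ID and is dropped, so retries
        # upstream cannot inflate usage.
        if event["id"] in self.seen:
            return False
        self.seen.add(event["id"])
        key = tuple(event.get(d) for d in self.dimensions)
        self.totals[key] += event["quantity"]
        return True


meter = Meter(dimensions=("model", "team"))
events = [
    {"id": "e1", "model": "large", "team": "search", "quantity": 120},
    {"id": "e2", "model": "small", "team": "search", "quantity": 40},
    {"id": "e1", "model": "large", "team": "search", "quantity": 120},  # redelivery
    {"id": "e3", "model": "large", "team": "search", "quantity": 30},
]
accepted = [meter.ingest(e) for e in events]
print(accepted, dict(meter.totals))
```

The redelivered `e1` is rejected, so the `("large", "search")` total is 150 rather than 270. A production pipeline would also have to bound the set of seen IDs and persist it across restarts, which this sketch leaves out.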
4. BYOC - The First Architecture That Scales to Infinity at Milliseconds Latency
Every AI conversation eventually lands on the same question: "Can this run in our VPC?"
The volume is too large. The data is too sensitive. The latency budget is too tight.
Stigg 2.0 is the only solution to ship a modular BYOC architecture where each component deploys independently into your cloud - and the whole system delivers the same milliseconds-latency, infinite-scale guarantees whether you run one module or all four.
Pick what you need. Deploy where you need it.
You can bring any single module into your VPC on its own - or combine them:
Metering - high-throughput event ingestion and aggregation running inside your VPC, processing tens of millions of events per second without a single byte leaving your network.
Credits Engine - TigerBeetle-backed wallets with real-time balance enforcement, deployed as a standalone service in your account. Provably correct. ASC 606-ready.
Governance - the milliseconds-latency decision engine that evaluates entitlements, budgets, and limits on every request. Sub-10ms P99. Runs as a sidecar or a dedicated service alongside your application.
Entitlements - the configuration and enforcement layer that defines what every customer, user, and agent is allowed to do. Programmable. Instant propagation.
All four together - the complete usage runtime. Metering feeds Credits, Credits inform Governance, Governance enforces Entitlements. One integrated system, fully deployed in your infrastructure, with the same APIs and SDKs as our managed cloud.
This isn't a "lite version" of Stigg that happens to run on-prem. It's the same production system. The architecture is EKS-native, with topology-aware scheduling to minimize cross-AZ costs. We're not abstracting away the infrastructure - we're giving you production-grade ops from day one.
For the technical teams evaluating this: the modular boundary is real. There's no hidden coupling that forces you to deploy everything. You can run our Governance engine in your VPC while keeping Credits on our managed cloud - or vice versa. The trust boundaries are clean, the data ownership is explicit, and the deployment topology is yours to decide.
No other vendor in this space offers this. Not modular. Not milliseconds-latency. Not at this scale. And certainly not with the option to bring each piece independently into your own infrastructure.
5. From Prompt to Production - Stigg Goes Agentic
We killed our old UI-first philosophy. If a capability doesn't ship as an API endpoint and CLI command first, it doesn't ship. The dashboard is documentation for the API, not the other way around.
Agent-ready. Configure your credit system, manage your product catalog, and deploy to production through natural language, directly from your terminal. With Stigg Skills and the Stigg MCP server, engineers can go from “I need a credit system” to production in a single session. AI agents can also interact directly with billing infrastructure through Stigg MCP, querying customers, checking balances and subscriptions, and turning billing data into actionable insights, summaries, and account-level analysis.
CLI-first operations. Every operation you used to do in our dashboard, you can now do from stigg-cli. Automate deployments. Script pricing changes. Integrate with your CI/CD pipeline.
Data import from anywhere. Migrating from a homegrown system? We built a native integration for bulk data import - subscriptions, customers, usage history, credit balances. Connect your database, map your schema, and migrate safely. No more months-long CSV wrangling.
6. Invoicing & Contract Management - Powered by Received.ai
We acquired Received.ai and integrated their invoicing and contract management engine directly into Stigg.
For AI startups, this means you can go from meters and entitlements to a professional invoice without adding another vendor. Set up your pricing, connect your payment processor, and Stigg handles the rest.
For enterprise SaaS, it means bespoke multi-year contracts with custom pricing formulas, tiered commitments, and complex billing schedules - managed in one system alongside your entitlements and credits.
This is a toggle-on module, not a separate product. If you're happy with Stripe or Zuora for invoicing, keep them. But if you want the entire stack in one place - especially for sales-led deals with non-standard terms - it's there.
7. New Pricing - Built for the World We're Entering
We're retiring seats and subscriptions as our billing model. Stigg 2.0 prices on two things: managed entities (the customers, users, agents, and teams that Stigg enforces for) and usage events ingested for metering. That's it.
No per-seat fees. No subscription surcharges. No pricing dimensions that punish you for caching, optimizing, or building efficient architecture. The full credits engine, entitlements engine, governance, and invoicing are available on every plan, including free.
But the real pricing disruption is BYOC. Every usage-based billing vendor today charges you per event over the cloud - your most sensitive usage data leaving your network, hitting their servers, and coming back as an invoice. You pay for the transit, you accept the latency, you inherit their compliance posture, and when you hit scale, you hit their cost curve too. With Stigg BYOC, your metering runs on your infrastructure. Events never leave your VPC. There is no per-event billing. Stigg charges a flat deployment fee and committed entity pricing. That means the more usage events you process, the more value you extract from Stigg, and your bill doesn't change. We believe this is how all infrastructure metering will work within two years. We're shipping it now.
Our pricing starts at free (Build, 10,000 entities), scales through self-serve (Pro, $399/mo), and extends to committed contracts (Scale) and full BYOC deployment (BYOC). Every tier, every volume discount, and every rate is published at stigg.io/pricing. Our new pricing is available today in early preview. GA is at the end of September 2025.
Building the Future of AI, Together
We didn’t build this infrastructure in a vacuum. It was shaped by every engineer, designer, and operator on our team, and battle-tested by fast-growing customers who trusted us with their production traffic. We built it because we saw firsthand how traditional systems were breaking under the weight of the AI era, leaving builders to stitch together solutions that should just work.
If you are an engineer, product leader, or founder, you are likely navigating how to monetize, scale, and govern your AI products right now. You shouldn't have to build low-latency entitlements and credit infrastructure from scratch. We built this platform so you can focus on building your core AI innovation.
We’re launching this at the AI World Fair because we want the builders in the room to judge our work on its merits. Come find us, break our demos, read our docs, and deploy the SDK. We built this for you.