Are You Paying Too Much for Your Generative AI Solutions?
By John Carney
Should you build or buy GenAI? Here's a simplified formula.
One of the most common questions we hear from companies embarking on their Artificial Intelligence (AI) journey is simply, “Do I build one or buy one?” While that answer always comes down to the particulars of your business, we decided to see if we could streamline this decision process and give companies a clearer perspective. To illustrate the point, we settled on a single question: “Should businesses develop their own chat app, or just buy a subscription from one of the major vendors?”
Comparing costs is one of the most difficult aspects of the AI arms race: each vendor offers a different pricing structure, enterprise agreements are available to some (but not all) customers, and risk tolerances vary. This is further complicated by the fact that many cloud vendors now offer by-the-token pricing if you’re willing to build your own frontend, which reduces AI costs but requires more up-front investment on the part of the business.
To cut through this noise, we wanted an apples-to-apples comparison of what you can expect to invest in AI solutions. We developed a pricing exercise using publicly available pricing information from well-known AI services to answer whether it is more cost-effective to build a Generative AI chat interface on your own or to buy one from a well-known AI provider. To make this useful, we compared three example sizes:
10 Seats: Represents a small startup
100 Seats: Represents a growing small business
500 Seats: Represents a small enterprise (larger enterprises will find similar conclusions)
How did you make these comparable?
For full-service paid subscriptions, we used the available per-user pricing information and factored in annual billing discounts to ensure no money was left on the table. For by-the-token models, we made the following assumptions:
Each “seat” inputs approximately 2,000 words per week into Generative AI, or the length of a typical blog post.
Generative AI returns approximately the same number of words per week to the user (this matters because cloud services also charge for output tokens).
We used publicly available pricing to make annualized cost estimates.
We exclude models such as Claude Opus that were not available for chat-based inference via API at the time of writing.
We include the costs of setting up a secure cloud host, including load balancing and domain registration, in our estimates.
To keep the analysis simple, we account for both initial setup costs and ongoing maintenance, but include no budget for extra features.
We evaluated text inputs and outputs only; stay tuned for a multi-modal update!
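The usage assumptions above can be turned into a rough annual token-cost estimate. The following is a minimal sketch, not our actual pricing model: the tokens-per-word ratio (about 4/3) is a common rule of thumb we are assuming here, the per-million-token prices are placeholders you would replace with your vendor's published rates, and the function name is hypothetical. It covers token usage only, not hosting, setup, or maintenance.

```python
# Rough annual token-cost estimate from the per-seat usage assumptions.
# All prices passed in are placeholders; substitute your vendor's published rates.

WORDS_PER_WEEK_IN = 2_000    # from the assumptions above: words each seat sends per week
WORDS_PER_WEEK_OUT = 2_000   # from the assumptions above: roughly the same amount comes back
WEEKS_PER_YEAR = 52


def annual_token_cost(seats, input_price_per_million, output_price_per_million,
                      tokens_per_word=4 / 3):
    """Estimate yearly by-the-token spend for a given number of seats.

    tokens_per_word is an assumed rule of thumb, not a vendor-published figure;
    real tokenizers vary by model and language.
    """
    words_in = seats * WORDS_PER_WEEK_IN * WEEKS_PER_YEAR
    words_out = seats * WORDS_PER_WEEK_OUT * WEEKS_PER_YEAR

    # Convert words to millions of tokens, since prices are quoted per million.
    millions_in = words_in * tokens_per_word / 1_000_000
    millions_out = words_out * tokens_per_word / 1_000_000

    return (millions_in * input_price_per_million
            + millions_out * output_price_per_million)


if __name__ == "__main__":
    # Hypothetical prices (USD per million tokens), for illustration only.
    for seats in (10, 100, 500):
        cost = annual_token_cost(seats, input_price_per_million=3.0,
                                 output_price_per_million=15.0)
        print(f"{seats} seats: ${cost:,.2f} per year in token charges")
```

Because token charges scale linearly with seats under these assumptions, the fixed costs of hosting and development are what drive the break-even point.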
What did you find?
The major cloud providers charge similar per-token prices within each model tier, so you can use your provider of choice.
AI platforms such as OpenAI or Microsoft Copilot may be more cost-effective for small businesses and enterprises with limited use cases. They also save on startup time and its associated opportunity cost.
For medium-sized businesses, it would take three years to break even on cost by building their own, not counting development time.
For startups and small businesses, there is no time horizon where it costs more to purchase subscriptions than to build their own.
Enterprises with broad use cases can break even in as little as six months and net a possible 73% cost reduction in 18 months by developing their own frontend and paying by-the-token.
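The break-even reasoning behind these findings can be sketched as a comparison of cumulative costs over time. This is an illustrative sketch only: the function name and every dollar figure are hypothetical, and it assumes a one-time build cost plus flat monthly costs on both sides, which is simpler than real pricing.

```python
# Find the first month at which building becomes no more expensive than buying.
# All dollar amounts used with this function are hypothetical examples.

def break_even_month(build_upfront, build_monthly, buy_monthly, horizon_months=60):
    """Return the first month where cumulative build cost <= cumulative buy cost.

    build_upfront: one-time development and setup cost
    build_monthly: hosting, maintenance, and token charges per month
    buy_monthly:   total subscription cost per month (seats x per-seat price)
    Returns None if building never catches up within the horizon.
    """
    for month in range(1, horizon_months + 1):
        build_total = build_upfront + build_monthly * month
        buy_total = buy_monthly * month
        if build_total <= buy_total:
            return month
    return None


if __name__ == "__main__":
    # Hypothetical large team: subscriptions cost far more per month than self-hosting.
    print(break_even_month(build_upfront=50_000, build_monthly=1_000, buy_monthly=6_000))
    # Hypothetical small team: subscriptions stay cheaper, so there is no break-even.
    print(break_even_month(build_upfront=50_000, build_monthly=1_000, buy_monthly=300))
```

When the monthly subscription bill is lower than the monthly cost of running your own solution, the function returns None, which corresponds to the "no time horizon" case for small teams.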
You can see our results, compared to two popular GenAI full-service offerings, in the table below (the winner at each time point is highlighted in green):
We can see the trends and break-even points in the graphs below:
What if I run on-premises instead of in the cloud?
If you run on-premises services, the picture gets more complicated. While there are LLMs you can deploy on-premises, you may need to weigh more than just the cost of the equipment to run them. To get you started, here are a couple of questions to guide your evaluation:
Would using a cloud or a hybrid cloud approach work with your privacy, security, and regulatory requirements?
Are alternatives, such as self-hosting freely available models like Meta’s Llama (it’s worth noting that this may require a license to use), viable in your on-premises model?
So, what do you recommend?
Our recommendations split into three groups:
0-100 Users: From a pure cost standpoint, it may not be worth it to develop your own interface, assuming using a vended tool does not pose regulatory, privacy, security, or resale concerns.
100-500 Users: This is the most nebulous category; while it will take up to three years to recoup development cost, the flexibility and privacy of a self-rolled cloud may make it worth the effort, especially if your use case isn’t exactly “out of the box.”
Custom feature adds may increase the time to break even, depending on the overhead required to maintain them. Managed services may help ameliorate these costs.
For more generic implementations, “out of the box” may be the better choice because it avoids that overhead.
500+ Users: Based on the retail cost of seats alone, you will likely see savings within six months, even if you have a substantial enterprise discount on subscriptions, along with the flexibility, privacy, and security benefits outlined above.
Managed maintenance services may help offset long-term maintenance and feature add costs.
Custom use-cases will allow for workflow acceleration at scale, but not develop ongoing.
Looking for an AI Strategy Roadmap that will actually deliver on ROI? Contact Concord!