Case Study: The $90,000-a-Year Azure Bill

March 06, 2026
How a healthcare company inherited a microservice architecture, API gateway, content delivery network, and a cloud bill that didn't match the application — because the previous team had built for a scale that never arrived.


What They Were Experiencing

A healthcare technology company asked us to review their Azure infrastructure and DevOps practices. They built internal software for managing insurance claims — tracking obligations, processing documents, coordinating workflow across teams. The application was the operational backbone of the business.

The company had gone through a leadership transition. The CTO was about three months into the role. The lead developer had been there about a month. Both had inherited an application and Azure environment originally built and managed by an offshore development shop — a team that was no longer involved but whose fingerprints were on every architectural decision.

The new leadership knew something was wrong. The Azure configuration was confusing. Deployments were complicated. The monthly cloud bill seemed high for what the application actually did. But they couldn't tell where the real problems were because they were still trying to understand what they'd inherited.

They thought they needed someone to review their Azure setup and help them streamline their DevOps pipelines. What they actually needed was someone to tell them that most of their infrastructure was solving problems they didn't have.


What They Thought Was Wrong

Leadership believed the core problems were:

  • Their Azure configuration was messy and hard to understand
  • Their deployment pipelines needed to be improved
  • Their cloud costs might be higher than necessary
  • They needed to understand what they'd inherited before they could move forward

They were right about all of this. But the problems ran much deeper than a messy configuration. The entire architecture had been designed for a company that didn't exist — one with massive scale, hundreds of developers, and enterprise-grade traffic demands. The real company had a handful of developers and fewer than a hundred simultaneous users.


What I Actually Found

My colleague and I spent roughly two weeks reviewing the Azure environment, the codebase, the version control repositories, the build and release pipelines, and the database infrastructure. We interviewed the CTO, the lead developer, and other team members. Here's what emerged.

Finding #1: The Microservice Architecture That Wasn't a Microservice Architecture

The application had been built in the spirit of microservices. There were separate services for correspondence, document management, importing, reporting, user management, and general application management — each deployed as its own Azure App Service. It looked like a microservice architecture. It had the folder structure of a microservice architecture. The team talked about it as a microservice architecture.

But it wasn't actually a microservice architecture.

What microservices actually require

The most fundamental rule of microservices is independence. True independence means that each service should have its own database. Each service should communicate with other services only through messages at their service endpoints — no shared code, no compiled references between projects. That's the entire point. It's what makes the complexity of microservices worth it for the organizations that genuinely need them.
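As an illustration only (the service names, images, and passwords below are hypothetical, not from the actual system), genuine independence looks something like this docker-compose sketch: each service owns its own database, and the only way one service reaches another is over HTTP at its endpoint.

```yaml
# Hypothetical sketch of real service independence. Local-dev values only;
# nothing here reflects the client's actual system.
services:
  claims-api:
    image: example/claims-api:latest
    environment:
      # claims-api owns this database; no other service connects to it
      ConnectionStrings__Claims: "Server=claims-db;Database=Claims;User Id=sa;Password=Example_Passw0rd;TrustServerCertificate=true"
    depends_on: [claims-db]

  claims-db:
    image: mcr.microsoft.com/mssql/server:2022-latest

  documents-api:
    image: example/documents-api:latest
    environment:
      ConnectionStrings__Documents: "Server=documents-db;Database=Documents;User Id=sa;Password=Example_Passw0rd;TrustServerCertificate=true"
      # The ONLY link to claims: an HTTP endpoint, not a compiled project reference
      Services__ClaimsBaseUrl: "http://claims-api:8080"
    depends_on: [documents-db, claims-api]

  documents-db:
    image: mcr.microsoft.com/mssql/server:2022-latest
```

Notice what is absent: no shared database, no shared domain library. If `documents-api` needs something from claims, it asks the claims endpoint.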

What this application actually had

This application had a single shared SQL Server database that all the "services" talked to. The services referenced each other through compiled code dependencies — when I mapped the project references in Visual Studio, it looked like a plate of spaghetti. The domain library, the common library, the repositories — they were wired together at the code level in ways that made independent deployment not just difficult but conceptually impossible.

With that kind of dependency graph, there was no such thing as changing one service. A modification to shared domain code or a common library rippled across every pseudo-microservice — they all had to be rebuilt and redeployed together. The entire premise of independently schedulable, independently deployable services was a fiction. Every change was a full-system change.

Put simply: if your services share a database, they're probably not microservices. If your services reference each other through compiled code, they're almost definitely not microservices. This application had both.

Why this was actually good news

And here's the thing — that was actually good news. Because when I looked at the projected user load, they didn't need microservices. Their simultaneous user count was in the low hundreds and unlikely to exceed four thousand anytime soon. That's not a microservice problem.

That's a single well-built ASP.NET Web API with room to spare.

The "microservice" architecture wasn't giving them any of the benefits — no independent scaling, no independent deployment, no loose coupling. It was giving them all of the costs — more complex deployments, more infrastructure to manage, more things to break, more things to pay for.

I told them to stop pretending it was a microservice architecture and to not feel bad about it. Join the service code into a single project. Simplify everything downstream. This was genuinely good news — the path forward was simpler, not harder.

Finding #2: An API Gateway to Nowhere

The application used Azure API Management — a service designed for organizations that need to expose APIs to external consumers, manage versioning, handle rate limiting, provide developer documentation, and control access at scale.

None of that was happening here. The company owned both the client application and the backend services. There were no external API consumers. The API Management instance wasn't configured for rate limiting, versioning, logging, or API mocking. It was essentially acting as a passthrough — the JavaScript client called API Management, and API Management turned around and called the public endpoint of the App Services. An extra network hop that added latency to every single request, complexity to every deployment, and cost to every monthly bill.

Worse, the API Management gateway wasn't even connected to a virtual network. It was calling the App Services at their public URLs — the same URLs anyone on the internet could use directly. The gateway that was supposedly providing a controlled access layer was just making a public call to a public endpoint with nothing in between.

When I dug into the broader Azure environment, I found the same pattern repeated. They were using content delivery networks for an application that was almost entirely dynamic content — CDNs optimize delivery of static files, not dynamically generated pages. Every environment had its own CDN endpoint, adding complexity to every release pipeline. They had a Redis cache running at $67 a month that barely showed any usage. They had a VPN gateway that had transferred a grand total of 8.82 kilobytes outbound in the previous thirty days — essentially nothing.

Every one of these services was something to deploy, something to maintain, something to pay for, something to break, something to fix, and something to understand. For an application with fewer than a hundred users.

Finding #3: Two Data Centers, One Application

The production SQL Server was provisioned in Azure's "US East 2" region. The production App Services were provisioned in "US East." These are two different data centers. Microsoft doesn't publicize exact locations, but even though both are in Virginia, they're almost certainly not in the same facility.

This meant that every database call from the application was crossing a data center boundary. Every page load, every query, every save operation — all of it traveling between two physical locations when it should have been happening inside the same building. The performance penalty was real and measurable. The company was also paying for inter-region data transfer costs that simply shouldn't have existed.

This is the kind of problem that's invisible until someone looks for it. Nobody wakes up and says "I think we should put our database in a different data center from our application." It happens incrementally — someone provisions a resource and picks a region, someone else provisions another resource and picks a different region, and nobody ever reconciles them.

Finding #4: Deployment Pipelines That Weren't Pipelines

The goal of a DevOps pipeline is straightforward: build your code once, create the application artifacts, and then deploy those exact same artifacts to multiple environments with increasing confidence. Dev, then test, then staging, then production — compile once and deploy the exact same bits every time. Between each deployment, a human verifies and approves. If something fails, you stop and fix it. By the time you reach production, you know exactly what you built, exactly what you tested, and exactly what you're deploying.

That's not what was happening here.

The team had a separate build for each Git branch and a corresponding release pipeline for each branch. A build from the staging branch created a release to the staging environment. A build from the production branch created a release to the production environment. Each environment got its own freshly compiled artifacts. Nothing ever flowed from one environment to the next.

This meant the team never deployed the same artifact to more than one environment. They never built the "pipeline of confidence" where an artifact gains credibility as it passes through successive gates. There were no human approvals between environment deployments. The thing they tested in staging was literally not the same thing they deployed to production — it was recompiled from a different branch, which meant it was a different artifact entirely.

And because of how the pipelines were structured, the team couldn't take advantage of Azure App Service Deployment Slots — a feature that would have simplified their deployments and saved money. The pipeline problems, the branching problems, and the microservice problems were all tangled together, each one making the others harder to fix.
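For contrast, "build once, deploy many" is straightforward to express in Azure Pipelines YAML. This is a sketch under stated assumptions, not the client's pipeline: the service connection and app names are placeholders, and the human approval gates are configured on the `staging` and `production` environments in Azure DevOps rather than in the YAML itself.

```yaml
# Sketch only: service connection and app names are placeholders.
trigger:
  branches:
    include: [main]

stages:
- stage: Build
  jobs:
  - job: Build
    steps:
    - task: DotNetCoreCLI@2
      inputs:
        command: publish
        publishWebProjects: true
        arguments: '--configuration Release --output $(Build.ArtifactStagingDirectory)'
    # Publish the compiled artifact once; every later stage deploys these exact bits.
    - publish: $(Build.ArtifactStagingDirectory)
      artifact: webapp

- stage: Staging
  dependsOn: Build
  jobs:
  - deployment: DeployStaging
    environment: staging            # approval gate lives on the environment
    strategy:
      runOnce:
        deploy:
          steps:
          # Deployment jobs download pipeline artifacts automatically.
          - task: AzureWebApp@1
            inputs:
              azureSubscription: 'example-service-connection'
              appName: 'example-app-staging'
              package: '$(Pipeline.Workspace)/webapp/*.zip'

- stage: Production
  dependsOn: Staging
  jobs:
  - deployment: DeployProduction
    environment: production         # approval required before this stage runs
    strategy:
      runOnce:
        deploy:
          steps:
          - task: AzureWebApp@1
            inputs:
              azureSubscription: 'example-service-connection'
              appName: 'example-app-production'
              package: '$(Pipeline.Workspace)/webapp/*.zip'
```

The artifact published in the Build stage is the only thing that moves through the pipeline; staging and production deploy the same bits, which is what makes staging approval meaningful.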

Finding #5: Five Hundred Branches and Every Password in Plain Text

The version control situation was a reflection of everything else — sprawling, confusing, and carrying hidden risks.

The client application repository had 279 branches. The backend "microservices" repository had over 260. The vast majority were old, abandoned, and never cleaned up. The team was using long-lived topic branches — some living on indefinitely — and creating branches to represent deployment environments, which is exactly the practice that was undermining their release pipelines.

But the version control problem that made me stop and underline my notes was this: every username and password for every environment — including production — was checked into Git in a shared configuration file. Every developer on the team, every contractor, every person with repository access could see the production database credentials, the API keys, the connection strings for every environment.

In Git, once something is committed, it's in the history forever. Even if you delete the file, the credentials are recoverable from the repository's history. The only real fix was to change every password and stop storing them in version control — either by starting a fresh repository or by rotating all credentials and handling environment configuration through the deployment pipeline where it belongs.
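One common replacement, sketched here with placeholder names, is to keep secrets in Azure Key Vault and surface them to the pipeline through a linked variable group, so credentials are resolved at deploy time and never touch the repository.

```yaml
# Sketch only: the variable group, secret name, and app names are placeholders.
variables:
- group: prod-secrets        # variable group linked to Azure Key Vault

steps:
- task: AzureWebApp@1
  inputs:
    azureSubscription: 'example-service-connection'
    appName: 'example-app-production'
    package: '$(Pipeline.Workspace)/webapp/*.zip'
    # The secret is injected as an app setting at deploy time;
    # the repository never contains a real connection string.
    appSettings: '-ConnectionStrings:Default "$(SqlConnectionString)"'
```

With this in place, rotating a credential means updating one Key Vault secret, not hunting through config files and Git history.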

There were also no branch policies. Anyone could merge anything to any branch without code review, without approval, without a passing build. The lead developer mentioned an accidental pull request approval that had already caused problems. There were no guardrails preventing the next accident.

Finding #6: No Automated Tests = No Safety Net

There were no automated tests. None. No unit tests, no integration tests, no UI automation tests. The build pipeline had a step labeled "Automated Testing" that spun up a Docker container, compiled the code, and stopped. No tests were actually executed. Nothing was published.

Without automated tests, every deployment was an act of faith. There was no way to know whether the application was working after a code change without a human manually verifying it. This made automated CI/CD not just impractical but actively dangerous — you'd be deploying code to production with no automated verification that it did what it was supposed to do.

The codebase hadn't been designed for testability either. The dependency injection framework in ASP.NET Core was barely being utilized. The Unit of Work implementation dragged around every repository in the system as a single massive object — making it impossible to isolate and test individual pieces. You couldn't easily write tests for this code even if you wanted to.

Compounding this, the database code wasn't in version control in any reliable way. There were some Entity Framework migrations in the project, but they were marked as "do not compile." When I tried to create a SQL Server Data Tools project to reverse-engineer the database schema, the database itself turned out to be invalid — it wouldn't even import cleanly. The team couldn't reliably tell you which version of the database schema was supposed to go with which version of the application code.

Without the database in version control, you can't do comprehensive automated deployments. Without automated tests, you can't validate that your deployments work. Without either one, DevOps is a word on a slide deck, not a practice.

Finding #7: Ninety Thousand Dollars a Year to Azure (and Climbing)

Across two Azure subscriptions — confusingly named "Pay-as-you-go" and "Pay-As-You-Go" with only capitalization to distinguish them — the company was spending approximately $7,500 per month. That's $90,000 a year.

The biggest single cost was SQL Server at roughly $2,600 per month. The production database alone was on an elastic pool costing about $1,500 monthly — likely over-provisioned for actual usage. There were numerous other databases scattered across the environment, many of unclear purpose, most on pricing tiers far more expensive than they needed to be.

Azure Data Factory was the second-largest expense at $1,400 per month. App Services came third at $1,340 — inflated by the "microservice" architecture that required multiple App Service Plans for services that could have lived on a single plan.

Every unnecessary service — the API Management gateway, the CDN endpoints, the Redis cache, the VPN gateway, the extra App Service Plans — was contributing to a monthly bill that was significantly higher than it needed to be. A simpler architecture wouldn't just be easier to manage. It would be measurably cheaper.


The Diagnosis

I told them something that I think came as a relief: most of what they'd inherited was unnecessary. The architecture had been built for a company that didn't exist — one with massive scale requirements, hundreds of developers, and enterprise traffic patterns. The actual company had a small team, a modest user base, and straightforward requirements.

This wasn't a criticism of the people who'd built it. Architecture decisions reflect the assumptions and ambitions of the moment. Somewhere along the way, someone had decided that this application should be built for the future — a future of explosive growth, massive concurrency, and the kind of scale that justifies microservices, API gateways, CDNs, and distributed caching. Those decisions might even have been made in good faith by smart people who genuinely believed that scale was coming.

But the scale hadn't come. And in the meantime, every one of those anticipatory decisions had created real, present-day costs: more complexity, more deployment friction, more infrastructure to maintain, more monthly spending, more things for a small team to understand and manage. The company was paying a tax on a future that hadn't arrived.

The first rule of software architecture is the same as the first rule of performance tuning: don't solve problems you don't have. Only build what you know you need. The acronym is YAGNI — You Ain't Gonna Need It — and it exists because the most common architectural mistake isn't building too little. It's building too much, too early, for scenarios that may never materialize.

The path forward was subtraction, not addition. Remove the API Management gateway. Remove the CDNs. Consolidate the services into a single application. Clean up the Azure resources. Simplify the Git branching. Fix the deployment pipelines. Then — and only then — start adding the things they actually needed: automated tests, database version control, proper configuration management, and branch policies.


What I Recommended

The recommendations were sequenced to build momentum, starting with the changes that would have the most immediate impact.

  1. Stop pretending it's a microservice architecture. Consolidate all the service code into a single ASP.NET Web API project. This one change would simplify the codebase, the project structure, the build pipeline, the release pipeline, and the Azure infrastructure. It would also make it possible to fix most of the other problems.

  2. Remove Azure API Management. Have the client application call the Web API directly. There was no functional reason for the gateway to exist. Removing it would eliminate an unnecessary network hop, simplify deployments, and reduce costs.

  3. Evaluate removing the CDN endpoints. Most of the application was dynamically generated content. The CDNs were adding deployment complexity without meaningful performance benefit for an application of this scale.

  4. Move the SQL Server and App Services into the same data center. This was the single easiest performance improvement available — eliminate the inter-region latency and the unnecessary data transfer costs.

  5. Rebuild the deployment pipelines around "build once, deploy many." One build from the main branch. Deploy the same artifact through dev, test, staging, and production with human approval gates between each environment. This would give the team actual confidence in what they were deploying.

  6. Clean up version control. Either start a fresh Git repository (the easier path) or aggressively clean up the existing branches. Either way: stop creating branches for environments, adopt a lightweight feature-branching strategy, add branch policies, and immediately rotate every password that had been committed to the repository.

  7. Get database code into version control. Choose a method — Entity Framework migrations or SQL Server Data Tools projects — and make the database schema a versioned, deployable artifact alongside the application code.

  8. Start writing automated tests. Not a retroactive testing blitz — the codebase wasn't designed for it. Instead, adopt a new rule: no code gets written or modified without a failing automated test. Over time, test coverage would grow organically in exactly the places that changed most often — which are the places that need tests most.

  9. Right-size the Azure spend. Move non-production databases to cheaper tiers. Delete resources that weren't being used. Consolidate App Service Plans. Review Azure Data Factory usage to determine whether it could be optimized or replaced with Azure Functions.


The Lesson

The previous development team had looked at a menu of Azure services and modern architecture patterns and ordered everything on it. Microservices, API gateways, content delivery networks, distributed caching, VPN gateways, elastic database pools — all the ingredients of a large-scale enterprise system, assembled for an application that served fewer people than a mid-sized restaurant on a Tuesday night.

Every one of those technology choices solves a real problem. Just not the problems this company actually had. Microservices exist to manage the organizational and technical complexity of systems with hundreds of developers and enormous traffic — not applications with a handful of developers and a few hundred users. API Management exists to govern external API access at scale — not to sit between a client and server that you own both halves of. CDNs exist to accelerate delivery of static content to globally distributed users — not to serve dynamically generated pages to a concentrated user base.

When you adopt a technology to solve a problem you don't have, you don't get the benefit — but you still pay the cost. And the cost isn't just the monthly bill. It's the cognitive load on a small team trying to understand an environment that's more complex than it needs to be. It's the deployment friction that makes every release harder than it should be. It's the debugging complexity when something goes wrong and you have to trace a request through layers of infrastructure that don't need to exist.

The new leadership at this company had inherited an architecture built on ambition rather than evidence. The hardest part of the assessment wasn't identifying what was wrong — the problems were abundant and interconnected. The hardest part was reassuring them that the right answer was to simplify. In an industry that rewards complexity and treats "enterprise-grade" as a compliment, telling a team that they should do less — fewer services, fewer layers, fewer tools — can feel counterintuitive. But for this company, at this scale, with this team, simpler was better in every dimension: faster to deploy, easier to understand, cheaper to run, and more reliable to operate.

The decisions that created this situation weren't malicious. They were aspirational. Someone looked at the future they hoped the company would have and built for that future instead of the present they were actually living in. It's a natural mistake — and a common one. The antidote is a question that should be asked before every architectural decision: do we have this problem today, or are we solving it in advance? And if we're solving it in advance, what's the cost of being wrong?


If your architecture was designed for a scale you haven't reached — if your cloud bill seems high for what the application actually does — if your deployments are complex because of infrastructure layers that don't seem to serve a clear purpose — if your team spends more time managing the environment than building the product — you might not have a technology problem. You might have a complexity problem.

That's the kind of problem I help companies see clearly.

Categories: case-studies
Tags: case-studies