Cosmos DB: Querying Cosmos — LINQ, SQL, and Knowing What Runs

Software Delivery Rescue & Technical Leadership

Cosmos DB: Querying Cosmos — LINQ, SQL, and Knowing What Runs

April 22, 2026

This is Chapter 11 of Azure Cosmos DB for .NET Developers. Previous: Chapter 10: Designing a Multi-Entity App.

Chapter 7 showed you what queries cost. This chapter shows you what queries are.

In Chapter 7 we ran a test harness, measured RU charges for every operation, and came back with hard numbers. Point reads cost 1 RU. A cross-partition scan of 1,000 documents costs... well, it depends. We also introduced the idea of LINQ's IQueryable and deferred execution. Let's apply some of that thinking to a real app.

Now that you've seen the cocktail app's design in Chapter 10 — three aggregate roots, one container, hierarchical partition keys — we can talk about the query layer that runs on top of it. Specifically: LINQ vs. raw Cosmos SQL, when to reach for which, how the Cosmos LINQ provider actually translates your code into SQL, what "deferred execution" means when there's a real database on the other end of the wire, the category of bug that happens when your LINQ bypasses the library, and how ICosmosQueryLogSink gives you structured observability over the whole thing.

What's ICosmosQueryLogSink?

Before we get into the actual content of this chapter, I should probably talk about how I'm getting my timing numbers, RU counts, and SQL on each of my calls to Cosmos DB. In my Benday.CosmosDb library, I have an optional diagnostic setting that allows you to log details of the queries that run through the library. If you run your data operations through a CosmosRepository<T> repository, under the surface, that repository takes a parameter of type Benday.CosmosDb.Diagnostics.ICosmosQueryLogSink.

When the repository makes calls against Cosmos DB, it captures information about the operation and then outputs it to Microsoft.Extensions.Logging.ILogger and also to ICosmosQueryLogSink. Keen-eyed observers will notice that I'm logging two different ways and wonder why.

Logging to ILogger means that the information ends up getting buried in the log stream for your app. Pretty much everything in .NET is writing Debug, Info, Warning, and Error messages to ILogger. It's great for getting a unified log stream for the app, but it's kind of a pain when you want to do specific performance analysis things. For example, having a nice, structured way to know what your query's timing numbers, RU counts, and SQL are.

By default, the Benday.CosmosDb library turns off the ICosmosQueryLogSink. More precisely, it registers a "no op" implementation of the sink that does nothing. This is what you'll want to use in production.

namespace Benday.CosmosDb.Diagnostics;

public sealed class NoOpCosmosQueryLogSink : ICosmosQueryLogSink
{
    public static readonly NoOpCosmosQueryLogSink Instance = new();

    private NoOpCosmosQueryLogSink() { }

    public void Record(CosmosQueryDiagnostics diagnostics)
    {
    }
}

But when you're in development and you want to know EXACTLY what your application is doing, you can configure your application to log all this data to a file by using the WithQueryLogSink<FileCosmosQueryLogSink>() option on CosmosRegistrationHelper.

var helper = new CosmosRegistrationHelper(
    builder.Services, cosmosConfig);

builder.Services.Configure<CosmosFileLogSinkOptions>(options =>
{
    options.FilePath = "logs/cosmos-queries.ndjson";
    options.QueueCapacity = 100;
});

helper.WithQueryLogSink<FileCosmosQueryLogSink>();

That writes the operations to an NDJSON (Newline Delimited JSON) log file so you can easily access all that query diagnostic info. We'll dig into what the data actually looks like and how to use it later in the chapter, but throughout the examples that follow, when I say "I looked at the sink output," this is what I mean.

LINQ vs. Raw SQL

When I wrote the cocktail app (https://www.drinkymcdrinkface.com), I used a combination of queries that use LINQ and some that use raw Cosmos SQL. Why I did it that way is a mix of preference, expediency, and also sometimes technical limitations. Plus, a bunch of logging features that exist in the Benday.CosmosDb library that I wanted to standardize on.

Preference-wise, when I'm coding C# against Cosmos DB (or also SQL Server) I tend to use LINQ queries — under the surface, that's typically some kind of IQueryable.

For example, here's SearchByTitleAsync — a title search. LINQ.

public async Task<List<CocktailRecipe>> SearchByTitleAsync(
    string searchText, string tenantId)
{
    var context = await GetQueryContextAsync(tenantId);

    var query = context.Queryable
        .Where(x => x.Name.Contains(
            searchText, StringComparison.OrdinalIgnoreCase));

    return await GetResultsAsync(
        query, nameof(SearchByTitleAsync), context.PartitionKey);
}

Clean. Readable. One LINQ Where clause, one call to GetResultsAsync. The library handles the partition key, logs the RU charge, captures the generated SQL. Everything gets taken care of in a standardized way without a whole lot of extra, duplicated code.

But then there's the SearchBySingleIngredientAsync method that finds recipes that contain a specific ingredient. When I originally wrote this, I reached for raw SQL because I needed to get into the nested Ingredients array on each recipe. My thinking was: LINQ can't express JOIN i IN c.ingredients directly, so I need to write the SQL myself. Here's what I wrote:

public async Task<List<CocktailRecipe>> SearchBySingleIngredientAsyncUsingSql(
    string ingredientName, string tenantId)
{
    var partitionKey = GetPartitionKey(tenantId);

    var queryText = @"
        SELECT DISTINCT VALUE c FROM c
        JOIN i IN c.ingredients
        WHERE c.entityType = 'CocktailRecipe'
        AND c.tenantId = @tenantId
        AND CONTAINS(i.name, @ingredientName, true)";

    var queryDef = new QueryDefinition(queryText)
        .WithParameter("@tenantId", tenantId)
        .WithParameter("@ingredientName", ingredientName.Trim());

    var iterator = GetContainer().GetItemQueryIterator<CocktailRecipe>(
        queryDef,
        requestOptions: new QueryRequestOptions { PartitionKey = partitionKey });

    return await GetResultsAsync(iterator, nameof(SearchBySingleIngredientAsync));
}

JOIN i IN c.ingredients iterates the nested ingredients array — that's a Cosmos SQL cross-apply, not a relational JOIN — and DISTINCT VALUE c deduplicates because a recipe might match the ingredient twice. Both queries end up calling GetResultsAsync, which means both get instrumented — RU charge, duration, partition key, cross-partition detection.

I was happy with this. It worked. It was readable. I moved on.

But Wait — LINQ Can Do This Too

But a little later, I started wondering if I could just do a LINQ query against the Ingredients collection. So I went back and tried a LINQ version just to see what would happen. It turns out that LINQ could get to the same result through a different "door": LINQ's Any() method.

public async Task<List<CocktailRecipe>> SearchBySingleIngredientAsyncUsingLinq(
    string ingredientName, string tenantId)
{
    var trimmed = ingredientName.Trim();

    var queryContext = await GetQueryContextAsync(tenantId);

    var query = queryContext.Queryable
        .Where(x => x.Ingredients.Any(
            i => i.Name.Contains(trimmed, StringComparison.OrdinalIgnoreCase)))
        .OrderBy(x => x.Name);

    return await GetResultsAsync(
        query,
        GetQueryDescription(nameof(SearchBySingleIngredientAsyncUsingLinq)),
        queryContext.PartitionKey);
}

That compiles. That runs. And when I looked at the ICosmosQueryLogSink output, here's the SQL it generated:

SELECT VALUE root FROM root
JOIN (
    SELECT VALUE EXISTS(
        SELECT VALUE i0 FROM root
        JOIN i0 IN root["ingredients"]
        WHERE CONTAINS(i0["name"], "bourbon", true)
    )
) AS v0
WHERE v0
ORDER BY root["name"] ASC

That is definitely not what I would have written by hand. It's wrapping an EXISTS inside a JOIN subquery, aliasing the boolean result as v0, and then filtering WHERE v0. To me it looks a little circuitous and it's not really how I probably ever would have thought about doing it but it works.

Both versions return the same 463 results for a "bourbon" search. So I rigged up a head-to-head test — run both versions back-to-back against the same data and compare the diagnostics from the sink:

Version	Generated SQL Strategy	RUs	Duration	Results
LINQ (`Any()`)	EXISTS subquery	41.48	562ms	463
Raw SQL	JOIN + DISTINCT	79.19	414ms	463

The LINQ version costs almost half the RUs. The slightly ugly generated SQL wins.

Why? EXISTS short-circuits. It walks the ingredients array and the moment it finds a match, it stops. The JOIN approach cross-applies every ingredient in every recipe, produces a row for every match, and then DISTINCT has to collapse all those duplicates back down. A recipe with three bourbon ingredients generates three intermediate rows in the JOIN version and one boolean in the EXISTS version.

I only discovered this because I had the query log sink running and could compare the RU charges side by side. The developer instinct — "I'll write the SQL myself, I know what I'm doing" — actually produced a worse query.

But that raises one those really important questions that has been hiding in plain sight the whole time: how did LINQ turn x.Ingredients.Any(...) into that SQL? What's actually going on between the C# LINQ code and the SQL that Cosmos ran?

How LINQ-to-Cosmos Actually Works

Here's where I'm going to nerd out for a minute, because understanding this mechanism is what makes the rest of the chapter click.

In Chapter 7, I did a sidebar on how LINQ works — how it rides on top of IEnumerable<T> and the Iterator Pattern, and how IQueryable<T> lets you define a potential query that doesn't run until you need results. What I didn't get into was why those two interfaces behave so differently, even when the code you write looks identical.

We Need the Func. Gotta Have That Func.

As noted software architect and funk legend George Clinton once wrote, we need the Func and gotta have that Func. He was talking about something else entirely but the principle applies.

These two lines of C# look darned similar:

// Against a List<T> (IEnumerable)
var results = myList.Where(x => x.Name.Contains("bourbon"));

// Against a Cosmos queryable (IQueryable)
var results = context.Queryable.Where(x => x.Name.Contains("bourbon"));

Same Where. Same lambda. Same syntax. But the compiler does something fundamentally different with each one.

For the IEnumerable version, the lambda compiles into a Func<T, bool> — executable code. A function. .NET iterates through your list, calls that function on each item, and keeps the ones where it returns true. The lambda runs on every element in memory. It's just code calling code. (Gotta have that Func<T, bool>.)

For the IQueryable version, the compiler doesn't produce a Func at all. Instead, it produces an Expression<Func<T, bool>> — a data structure that describes the lambda. That x => x.Name.Contains("bourbon") gets turned into a tree of objects: a MethodCallExpression wrapping a LambdaExpression wrapping a MemberExpression (the property access on .Name) wrapping a ConstantExpression (the "bourbon" value). It's objects all the way down.

That's the trick. Same syntax, completely different compilation. IEnumerable gets a function to execute. IQueryable gets a description of a function — an expression tree — that it hands off to a provider for translation. The lambda becomes data instead of code.

The Provider Walks the Tree

The Cosmos LINQ provider — a key part of the Microsoft.Azure.Cosmos SDK — walks that expression tree, node by node, and emits a string of Cosmos SQL. Where(x => x.Name == "Manhattan") becomes SELECT VALUE root FROM root WHERE root["name"] = "Manhattan". The LINQ syntax on the C# side is just the authoring surface. What actually hits the database is always SQL.

That's what happened with the ingredient search. I wrote .Where(x => x.Ingredients.Any(i => i.Name.Contains(...))). The Cosmos LINQ provider's expression tree walker hit the Any() node, recognized it had a handler for Enumerable.Any, and translated it into the EXISTS subquery wrapped in that JOIN ... AS v0 structure. Circuitous-looking SQL, yes — but the provider chose EXISTS because it's the semantically correct translation of "does any element in this collection match?", and that EXISTS is why it outperformed my hand-written JOIN + DISTINCT.

Why does this matter beyond the cocktail app? Because the translation has limits.

That expression tree walker was written by developers at Microsoft. They had to write a handler for every .NET method they wanted to support. String.Contains? Yes, they wrote a translation for that — it maps to Cosmos's CONTAINS function. String.StartsWith? Yes. Enumerable.Any? Yes. Enumerable.All? No. String.IsNullOrEmpty? No. Those throw at runtime. Not at compile time. At runtime. You write what looks like perfectly valid C#, the compiler is happy, and then you run it and the Cosmos SDK says "I don't know how to translate this to SQL" and hands you an exception.

Here's what the All() error actually looks like:

Microsoft.Azure.Cosmos.Linq.DocumentQueryException:
'Method 'All' is not supported.'
   at Microsoft.Azure.Cosmos.Linq.ArrayBuiltinFunctions.Visit(
       MethodCallExpression methodCallExpression,
       TranslationContext context)

You can see exactly where the translation failed: ArrayBuiltinFunctions.Visit. The SDK's expression tree walker hit a MethodCallExpression for All() and simply didn't have a handler for it.

This is why abstractions leak. It's not a bug. It's a fundamental consequence of the design: someone has to write a translation for every operation, and the surface area of .NET is enormous. The Cosmos SDK team can't cover all of it, so they cover the common stuff and leave the rest unimplemented — ya know — just to keep us on our toes. Joking aside, supporting every single possible thing is a lot of work and there are probably zillions of edge cases and piles of things that just simply don't have a good option for translation. Even for Microsoft: there are only so many minutes in the day.

The practical takeaway: if you're writing a LINQ query and it compiles, that does not guarantee that it will run. LINQ-to-Cosmos is a runtime translation, not a compile-time one. The first time you find this out is usually when production throws an exception you've never seen in local testing because you never hit that code path with the right data.

Multi-Ingredient Search: And Then It Gets Weirder

When I saw the single-ingredient results, I decided to try an experiment for another query in the app. The cocktail app also has a multi-ingredient search — "find recipes that contain rum AND bourbon." The raw SQL version builds dynamic EXISTS subqueries, one per ingredient:

SELECT * FROM c
WHERE c.entityType = 'CocktailRecipe' AND c.tenantId = @tenantId
AND EXISTS(SELECT VALUE i FROM i IN c.ingredients
           WHERE CONTAINS(i.name, @ingredient0, true))
AND EXISTS(SELECT VALUE i FROM i IN c.ingredients
           WHERE CONTAINS(i.name, @ingredient1, true))
ORDER BY c.name

Can LINQ do this too? My first instinct was to try LINQ's All() method:

var query = context.Queryable
    .Where(x => x.Ingredients.All(i =>
        cleaned.Any(ing =>
            i.Name.Contains(ing, StringComparison.OrdinalIgnoreCase))));

And there's the speed bump: Method 'All' is not supported. — the exact exception I just described. Any() has a handler in the expression tree walker. All() doesn't.

But you don't need All(). You already know Any() works. And chaining multiple .Where() calls AND-s them together — that's just how LINQ composition works. So the trick is to loop over the search terms and add one .Where() per ingredient:

public async Task<List<CocktailRecipe>> SearchByIngredientsAsyncUsingLinq(
    params string[] ingredientNames)
{
    var cleaned = (ingredientNames ?? Array.Empty<string>())
        .Where(x => string.IsNullOrWhiteSpace(x) == false)
        .Select(x => x.Trim().ToLowerInvariant())
        .Distinct().ToArray();

    if (cleaned.Length == 0)
        return new List<CocktailRecipe>();

    var queryContext = await GetQueryContextAsync(ApiConstants.DefaultTenantId);

    IQueryable<CocktailRecipe> query = queryContext.Queryable;

    foreach (var term in cleaned)
    {
        var captured = term; // capture for closure
        query = query.Where(x =>
            x.Ingredients.Any(i =>
                i.Name.Contains(captured, StringComparison.OrdinalIgnoreCase)));
    }

    query = query.OrderBy(x => x.Name);

    return await GetResultsAsync(
        query,
        GetQueryDescription(nameof(SearchByIngredientsAsyncUsingLinq)),
        queryContext.PartitionKey);
}

Each .Where() adds another EXISTS subquery. Nothing hits the database until GetResultsAsync — you're just stacking filters on the IQueryable. (We'll talk about that "nothing hits the database until" part in the deferred execution section coming up.)

The var captured = term line matters, by the way. Without it, the closure captures the loop variable itself, and by the time the query executes, every EXISTS clause would use whatever term was on the last iteration. Classic C# closure-over-loop-variable gotcha.

So how did the head-to-head come out? "rum AND bourbon":

Version	RUs	Duration	Results
LINQ (chained `Where` → EXISTS)	18.29	93ms	49
Raw SQL (dynamic EXISTS)	18.48	79ms	49

Basically a dead heat. Both approaches ended up generating EXISTS subqueries — they converged on the same strategy. The LINQ version produces WHERE (v0 AND v1) with JOINed boolean subqueries; the raw SQL does AND EXISTS(...) AND EXISTS(...) directly. Different SQL syntax, same execution cost.

So Which One Do You Pick?

There can be performance differences between the SQL approach and the LINQ approach. Sometimes one can be surprisingly faster or use different amounts of RUs. It really depends on how well the "expressionizer" logic is in the SDK provider — the thing that walks your expression tree and emits SQL — does with your particular query scenario. Sometimes Microsoft's translation produces better SQL than what you'd write by hand (the single-ingredient EXISTS story). Sometimes it produces equivalent SQL (the multi-ingredient story). Sometimes it can't translate at all (the All() story).

But then again, there's the "heat death of the universe" problem.

Remember Chapter 7? We spent a whole section talking about how some optimizations matter at a universal scale and are completely irrelevant at the scale most of us mere humans actually operate at. The difference between 41 RUs and 79 RUs matters if you're running that query ten million times a day. But for most apps, both numbers are completely fine and nobody will ever notice.

So sometimes you'll pick one approach over the other not because it's faster, but because writing it the other way ends up yielding incomprehensible code. The chained-WHERE LINQ version for multi-ingredient search is readable. The dynamically-built EXISTS SQL is also readable. But some queries — conditional aggregation, vector distance, projections to different shapes — only exist in raw SQL, and trying to force them through LINQ would produce something nobody could maintain.

My advice: use LINQ when it's clear. Drop to raw SQL when LINQ can't do it or when the LINQ version turns into spaghetti. And either way, look at the sink output so you know what actually ran and how it performs.

A Note on Query Plan Caching

One more thing worth noting about the LINQ vs. raw SQL choice, and this one's a little subtle. The raw SQL versions use parameterized queries (@ingredientName, @ingredient0, @ingredient1). The LINQ-generated SQL inlines the values directly ("bourbon", "rum").

That matters because of how Cosmos handles query plans. Before Cosmos can execute a query, it has to build a query plan — and building that plan requires a network round-trip to the Cosmos DB gateway. Cosmos caches these plans, keyed by the query text string. A parameterized query like CONTAINS(i.name, @ingredientName, true) is the same query text regardless of whether you're searching for "bourbon" or "rye" or "mezcal" — one plan, cached, reused for every search. A LINQ-generated query with inlined values produces a different query text for every search term — CONTAINS(i0["name"], "bourbon", true) is a different cache key than CONTAINS(i0["name"], "rye", true). Each unique search term misses the cache and triggers another gateway round-trip.

At the scale where the RU difference between EXISTS and JOIN+DISTINCT matters — millions of queries a day — the plan-caching difference matters too. At normal scale, probably not. But it's one more thing the sink output can help you watch for.

The LINQ provider could parameterize the generated SQL — emit @p0 instead of "bourbon" and attach the value as a query parameter. The parameters field in the sink output is already there, it's just empty for LINQ queries. EF Core already does exactly this for SQL Server. The Cosmos LINQ provider just... doesn't, at least not yet.

Deferred Execution

Chapter 7 introduced this, but it was a sidebar. Now we're going to live in it for a minute because it drives the whole query-composition pattern in the library.

Deferred execution means: constructing a LINQ query in code doesn't actually cause it to run. Nothing hits the database until something forces execution — ToListAsync(), CountAsync(), FirstOrDefaultAsync(), foreach, anything that actually needs the results.

var context = await GetQueryContextAsync(tenantId);

// This doesn't run a query. This builds a description of one.
var query = context.Queryable
    .Where(x => x.Name.Contains(searchText, StringComparison.OrdinalIgnoreCase))
    .OrderByDescending(x => x.Timestamp);

// THIS runs the query. Right here. Not before.
return await GetResultsAsync(query, nameof(SearchByTitleAsync), context.PartitionKey);

The first three lines create an IQueryable<T>. No RUs consumed. No network call. The query is just a data structure sitting in memory — an expression tree, waiting. GetResultsAsync is the trigger. That's where the expression tree gets walked, the SQL gets generated, the HTTP request goes to Cosmos, and you start paying.

For in-memory IEnumerable — like a List<T> — deferred execution is invisible because the "database" is just RAM and there's a less obvious cost. The Func runs against each element as the iterator pulls them. For IQueryable against a real database, the Expression sits there doing nothing until something forces the provider to translate and execute it. That's what makes composition cheap — you can build a query across multiple methods, add filters conditionally, stack Where clauses, and pay nothing until you pull the trigger.

The multi-ingredient search from earlier is deferred execution in action. That foreach loop that stacks .Where() calls? Each iteration adds a filter to the expression tree. No database call. No RUs. Just building up the description, one Any() clause at a time. The query doesn't fire until GetResultsAsync at the end — and when it does, Cosmos gets one SQL statement with all the EXISTS subqueries already composed.

But here's the flip side: deferred execution also means you need to know exactly where the trigger is. And sometimes it's not where you think.

Two Ways LINQ Goes Sideways

The cocktail app taught me two things about how LINQ can not-entirely-succeed in the context of a library like Benday.CosmosDb. Both are real bugs I hit. Both are the kind of thing where you stare at the code for ten minutes saying "that looks right" before you realize what's happening.

The Library Can't See It

CountAsync() is a LINQ extension method. You call it on an IQueryable<T> and it returns an int. Sounds great. Here's the problem: CountAsync() executes the query itself. It already has everything it needs under the surface so it calls into the Cosmos SDK directly, gets the results, counts them, and returns the number.

The library — Benday.CosmosDb — never sees the execution. It never gets to log the RU charge. It never captures the query text. It never detects whether the query went cross-partition. As far as the library is concerned, nothing happened.

// The library sees this — it goes through GetResultsAsync
var results = await GetResultsAsync(query, "SearchByTitle", partitionKey);

// The library does NOT see this — CountAsync runs the query itself
var count = await query.CountAsync();

Same IQueryable. Same database. One goes through the library's instrumentation pipeline; the other doesn't. This is what I mean by "knowing when your query runs." The query runs. You just don't know about it.

The fix in Benday.CosmosDb is ExecuteScalarAsync — a method on the repository that wraps scalar operations (count, first-or-default, any) and runs them through the same diagnostics pipeline as everything else. But you have to know to use it instead of reaching for the LINQ extension directly.

The SQL Isn't What You'd Necessarily Write

The second issue is subtler. When LINQ does translate your code to SQL, the resulting SQL might not be what you'd write by hand. It's equivalent — it returns the same results — but it's not identical. And sometimes the difference matters.

I hit this with GetByPrimaryAsync when I needed to find ingredient classifications where the Secondary field was either null, empty, or missing entirely. My first attempt used LINQ:

var query = context.Queryable
    .Where(x => string.IsNullOrEmpty(x.Secondary));

This threw at runtime. String.IsNullOrEmpty doesn't have a Cosmos LINQ translation. Fair enough. So I rewrote it:

var query = context.Queryable
    .Where(x => x.Secondary == null || x.Secondary == "");

This worked, but the generated SQL was:

WHERE (root["secondary"] = null OR root["secondary"] = "")

The SQL I would've written by hand was:

WHERE (NOT IS_DEFINED(root["secondary"]) OR root["secondary"] = '')

Those aren't the same thing. root["secondary"] = null matches documents where the field exists and has an explicit JSON null value. NOT IS_DEFINED(root["secondary"]) matches documents where the field doesn't exist at all. In a Cosmos container with documents that evolved over time — some imported early before a field was added, some imported later — you might have both flavors. The LINQ version misses the "field doesn't exist" case.

This didn't cause a production bug for me because all my documents happened to have the field. But it could have. And the only reason I caught it was that I was staring at the generated SQL in the log sink output and comparing it to what I expected.

Inside the Sink: What the Data Looks Like

I introduced ICosmosQueryLogSink at the top of this chapter and I've been referencing the sink output all along. Now let's look at what's actually in the data.

Here's the interface:

public interface ICosmosQueryLogSink
{
    void Record(CosmosQueryDiagnostics diagnostics);
}

One method. Synchronous. Fire-and-forget. The library calls Record on the thread that executed the query and moves on. If the sink throws, the library catches it and logs a warning. A broken sink never prevents a query from completing.

And here's what CosmosQueryDiagnostics carries:

public sealed class CosmosQueryDiagnostics
{
    public CosmosQueryEventKind EventKind { get; init; }
    public DateTimeOffset Timestamp { get; init; }
    public string RepositoryName { get; init; }
    public string QueryDescription { get; init; }
    public string? QueryText { get; init; }
    public IReadOnlyDictionary<string, object?>? Parameters { get; init; }
    public PartitionKey PartitionKey { get; init; }
    public double RequestCharge { get; init; }
    public TimeSpan Duration { get; init; }
    public int ResultCount { get; init; }
    public bool IsCrossPartition { get; init; }
    public string? IndexMetrics { get; init; }
}

Repository name, query description (typically "RepositoryName - MethodName"), the generated SQL text, query parameters, partition key, RU charge, wall-clock duration, result count, whether it was a cross-partition execution, and — when enabled — the raw index utilization metrics from Cosmos. Everything you need to understand what happened, packaged as data, not as a log line.

Event Kinds

The EventKind field distinguishes three types of events:

PointOperation — a single-document operation: save, delete, or point-read. No query text (there's no SQL involved — it's just an ID and a partition key).

FeedResponsePage — one page of results from a multi-page query. Cosmos paginates large result sets, and the library iterates through the pages. Each page gets its own event with that page's RU charge and result count.

QueryTotal — the summary for the completed query. Emitted exactly once per query execution, with the total RU charge, total result count, and total duration across all pages. For scalar operations (CountAsync, FirstOrDefaultAsync via ExecuteScalarAsync), this is the only event emitted — there are no pages.

For most analysis, you filter to QueryTotal events. That gives you one row per query execution with the full cost. The FeedResponsePage events are there if you need to understand pagination behavior or identify queries with unexpectedly many pages.

What the File Output Looks Like

FileCosmosQueryLogSink writes each event as a single line of JSON (NDJSON format). Events go into an in-memory queue from Record() and get written to disk by a background thread, so the library's query-execution path is never blocked on file I/O. If the queue fills up (default capacity is 10,000), new events are dropped rather than blocking — a stuck disk shouldn't grow memory without bound.

WARNING: Don't Use This in Production!

This file-based logging is only for development use. It's not optimized for running under load or on a server. For example, I run almost everything that I do on Azure App Services using the "run from zip" option. I suspect that FileCosmosQueryLogSink would throw endless exceptions in that configuration.

If you think you'd like something that runs in production, drop a feature request in the Benday.CosmosDb repo on GitHub.

Here's a real QueryTotal entry from an actual session:

{
  "eventKind": "QueryTotal",
  "timestamp": "2026-04-20T19:27:57.220Z",
  "repositoryName": "CosmosDbPersonRepository",
  "queryDescription": "CosmosDbPersonRepository - GetAllAsync",
  "queryText": "SELECT VALUE root FROM root ORDER BY root[\"timestamp\"] DESC",
  "parameters": {},
  "partitionKey": "[\"SYSTEM\",\"Person\"]",
  "requestCharge": 1,
  "durationMs": 5.131,
  "resultCount": 1,
  "isCrossPartition": false,
  "indexMetrics": null
}

Look at what you can see: this was a GetAllAsync call on the Person repository. It generated a SELECT VALUE root query ordered by timestamp descending. It ran against the ["SYSTEM","Person"] partition — both levels of the hierarchical partition key specified, so no cross-partition scan. It cost 1 RU, took 5 milliseconds, and returned 1 document.

Compare that to what ILogger gave you: a line that said "Request Charge (GetAllAsync): 1 RUs". Same information, technically. But the structured version is data you can query, sort, filter, and aggregate. Export the file into a spreadsheet, sort by requestCharge descending, and your most expensive queries are right there at the top. Filter by isCrossPartition == true and you've found the queries that need partition key work. Filter by repositoryName and you've isolated a single entity type's behavior.

Point-read entries look slightly different:

{
  "eventKind": "PointOperation",
  "timestamp": "2026-04-20T19:27:57.179Z",
  "repositoryName": "CosmosTenantItemRepository`1",
  "queryDescription": "CosmosTenantItemRepository`1 - SaveAsync",
  "queryText": null,
  "parameters": null,
  "partitionKey": "[null]",
  "requestCharge": 1,
  "durationMs": 9.988,
  "resultCount": 0,
  "isCrossPartition": false,
  "indexMetrics": null
}

No queryText — point operations don't have a SQL query, just an ID lookup. The SaveAsync operation cost 1 RU and took about 10 milliseconds.

Writing Your Own Sink

The interface is intentionally simple — one method, synchronous, fire-and-forget — so you can build whatever you need. Want to push diagnostics into Application Insights as custom events? Write a sink that calls TelemetryClient.TrackEvent(). Want to aggregate in memory and dump a summary at shutdown? Write a sink that accumulates into a ConcurrentDictionary keyed by query description. Want to fail tests if any query exceeds 50 RUs? Write a sink that throws in your test project.

The contract is:

Record is called synchronously from the query thread. Don't do slow I/O inline — buffer and flush from a background worker (like FileCosmosQueryLogSink does).
Exceptions from Record are caught, logged as warnings, and suppressed. A broken sink must not prevent queries from completing.
The library registers NoOpCosmosQueryLogSink by default. You override it with helper.WithQueryLogSink<YourSink>().

Index Metrics: Which Indexes Did My Query Actually Use?

Everything we've looked at so far tells you what ran and what it cost. But there's a question the sink couldn't answer until recently: did the query actually use the indexes I set up?

Back in Chapter 7, we talked about custom indexing policies — how excluding unqueried properties from your index can cut write costs by 80%. The advice was "index what you query on, exclude everything else." Good advice. But how do you verify that your queries are hitting the indexes you think they're hitting? And how do you find out if Cosmos wishes you had an index that you don't?

That's what IndexMetrics is for. The Cosmos SDK has a feature called PopulateIndexMetrics that you can set on QueryRequestOptions. When it's on, Cosmos returns a blob of data with every query response that tells you exactly which indexes were utilized and which indexes it recommends you add. The Benday.CosmosDb library now captures that data and flows it through the sink.

It's off by default — and it should be. Turning on PopulateIndexMetrics has measurable RU and latency overhead. This is a development and investigation tool, not a production setting. You turn it on, run your app through its paces, look at the output, and turn it off.

To enable it, use ConfigureDiagnosticsDefault on CosmosRegistrationHelper:

helper.ConfigureDiagnosticsDefault(options =>
{
    options.CaptureIndexMetrics = true;
});

That turns it on for every entity type. If you only want it for a specific entity — maybe you're investigating one slow query and don't want the overhead everywhere — use ConfigureDiagnostics<TEntity> instead.

Once it's on, the indexMetrics field in the sink output goes from null to a JSON string with two sections: UtilizedIndexes and PotentialIndexes.

The examples in this section come from the Benday.CosmosDb sample app (Person, Note, Comment) rather than the cocktail app. I added this feature to the library after the cocktail app work was done, and the sample app was the faster way to exercise it. The data is different but the diagnostics work exactly the same way — same sink, same NDJSON format, same fields.

Here's a GetAllAsync on the Note repository, ordered by timestamp descending:

{
  "eventKind": "QueryTotal",
  "repositoryName": "CosmosTenantItemRepository`1",
  "queryDescription": "CosmosTenantItemRepository`1 - GetAllAsync",
  "queryText": "SELECT VALUE root FROM root ORDER BY root[\"timestamp\"] DESC",
  "partitionKey": "[\"SYSTEM\",\"Note\"]",
  "requestCharge": 2.92,
  "isCrossPartition": false,
  "indexMetrics": {
    "UtilizedIndexes": {
      "SingleIndexes": [
        { "IndexSpec": "/timestamp/?" },
        { "IndexSpec": "/tenantId/?" },
        { "IndexSpec": "/entityType/?" }
      ],
      "CompositeIndexes": []
    },
    "PotentialIndexes": {
      "SingleIndexes": [],
      "CompositeIndexes": [
        {
          "IndexSpecs": ["/tenantId ASC", "/entityType ASC", "/timestamp ASC"],
          "IndexImpactScore": "High"
        }
      ]
    }
  }
}

(I've pretty-printed the indexMetrics value here for readability. In the actual NDJSON file, it's a JSON string on one line.)

Read this from top to bottom. UtilizedIndexes tells you the query used three single-property indexes: /timestamp, /tenantId, and /entityType. That's Cosmos saying "I had indexes on these properties and I used them." Good.

PotentialIndexes is where it gets interesting. Cosmos is saying: "You don't have a composite index on /tenantId ASC, /entityType ASC, /timestamp ASC, and if you added one, it would have a High impact on this query's performance." A composite index lets Cosmos satisfy the filter and the sort in a single index scan instead of using three separate indexes and combining the results. For a query that runs frequently, that's worth paying attention to.

Now compare that to a GetAllByParentIdAsync on the Comment repository from the same session:

{
  "UtilizedIndexes": {
    "SingleIndexes": [
      { "IndexSpec": "/tenantId/?" },
      { "IndexSpec": "/parentId/?" },
      { "IndexSpec": "/entityType/?" },
      { "IndexSpec": "/parentEntityType/?" }
    ],
    "CompositeIndexes": []
  },
  "PotentialIndexes": {
    "SingleIndexes": [],
    "CompositeIndexes": []
  }
}

PotentialIndexes is empty. Cosmos used the indexes it had and didn't wish for anything else. That's the "you got it right" signal.

So the workflow is: turn on CaptureIndexMetrics, exercise your application through its normal query patterns, look at the sink output, and pay attention to any PotentialIndexes entries with "High" impact scores. Those are Cosmos telling you where you're leaving performance on the table. Fix the ones that matter, turn off CaptureIndexMetrics, and move on.

When to Drop to Raw SQL

Earlier in this chapter I thought I had a clean list of "use raw SQL when..." rules. Then I ran the head-to-head comparisons and the list got shorter. Turns out LINQ can do more than I would have initially guessed — Any() handles nested array queries, and chaining .Where() handles dynamic composition. But there are still things that only raw SQL can do.

You need VectorDistance. The vector similarity search uses Cosmos's VectorDistance() SQL function. No LINQ mapping exists. This is always raw SQL. (We'll see this in Chapter 12.)

You need conditional aggregation. GetEmbeddingStatsAsync uses SUM(c.embedding != null ? 1 : 0) — the standard Cosmos idiom for conditional counting. LINQ can't express a ternary inside a SUM.

You need NOT IS_DEFINED. As we saw earlier, LINQ's == null check and Cosmos SQL's NOT IS_DEFINED are different things. If you need to distinguish between "field is null" and "field doesn't exist," you need raw SQL.

You need projections to a different shape. If the query returns something other than the repository's entity type — a scalar, a custom projection, a count — raw SQL with QueryDefinition is usually cleaner than fighting with LINQ's Select.

The LINQ version becomes incomprehensible. Sometimes you can express something in LINQ, but the result is so convoluted that nobody — including you in six months — will be able to read it. Readability counts. Additionally, if that LINQ starts looking convoluted, think about what is going to happen when that expression tree gets converted into SQL. It might work but end up creating something that is completely bonkers once it turns into SQL.

When none of those apply — simple filters, ordering, nested array searches with Any(), even dynamic multi-condition composition — LINQ is the better default. It's more readable, it stays inside the library's GetQueryContextAsync / GetResultsAsync pipeline automatically, and as we saw, sometimes the SDK's translation actually outperforms hand-written SQL.

The Practical War Stories

Let me close with two specific bugs from the cocktail app that I think are worth knowing about, because they're the kind of thing that'll bite you if you're not looking for them.

The `GetCountAsync` Story

I needed a simple count of how many recipes existed. Should be easy, right? I wrote something like this:

var context = await GetQueryContextAsync(tenantId);
var count = await context.Queryable.CountAsync();

This worked. But as we discussed, it bypasses the library's instrumentation — CountAsync() is a LINQ extension that runs the query directly. So I switched to ExecuteScalarAsync, which routes the execution through the library's pipeline:

var context = await GetQueryContextAsync(tenantId);
var count = await ExecuteScalarAsync(
    context.Queryable, q => q.CountAsync(),
    nameof(GetCountAsync), context.PartitionKey);

And it blew up. Null reference exception inside ToQueryDefinition(). Not from my code — from deep inside the Cosmos SDK.

The issue: ExecuteScalarAsync internally calls ToQueryDefinition() to capture the SQL text for diagnostics. But CountAsync() on a bare queryable — one with no Where clause, no OrderBy, nothing composed on top — produces a queryable that ToQueryDefinition() doesn't know how to handle. It's a degenerate case: a queryable that's too simple.

The fix was to add a trivially-true Where clause to make the queryable non-trivial:

var context = await GetQueryContextAsync(tenantId);
var count = await ExecuteScalarAsync(
    context.Queryable.Where(x => x.EntityType == context.Queryable.First().EntityType),
    q => q.CountAsync(),
    nameof(GetCountAsync), context.PartitionKey);

Ugly? A little. But it works, it goes through the library's diagnostic pipeline, and the Where clause doesn't change the result because the partition key already scopes to the right entity type. The lesson: when the SDK's internal plumbing expects a "real" query and you hand it a bare queryable, things break in surprising ways.

The `GetEmbeddingStatsAsync` Story

I wanted to know: how many recipes have embeddings, and how many don't? This is useful during the batch embedding process — run the stats, see how many are left, decide whether to kick off another batch.

The query I needed in SQL was:

SELECT
    COUNT(1) AS totalCount,
    SUM(c.embedding != null ? 1 : 0) AS withEmbedding,
    SUM(c.embedding = null ? 1 : 0) AS withoutEmbedding
FROM c
WHERE c.entityType = 'CocktailRecipe'
AND c.tenantId = 'COCKTAILS'

That SUM(c.embedding != null ? 1 : 0) is the standard Cosmos idiom for conditional aggregation. It's the equivalent of SQL Server's SUM(CASE WHEN ... THEN 1 ELSE 0 END). LINQ can't express this — there's no Sum overload that takes a ternary conditional. This is the kind of query that LINQ will never be able to do because the expressiveness gap between "what .NET methods exist" and "what Cosmos SQL can do" is just too wide.

Raw SQL. Parameterized. Through the library's diagnostics. Done.

What You Now Know

Three chapters ago, you knew how Cosmos DB is structured. Two chapters ago, you knew what queries cost. One chapter ago, you knew how the cocktail app was designed. Now you know what's actually running underneath.

LINQ is great until it isn't. Raw SQL is ugly until you need it. And sometimes the LINQ version that generates weird-looking SQL actually outperforms the hand-crafted version you were so proud of. The Cosmos LINQ provider is doing heroic work translating expression trees to SQL, but it can't translate everything, and the things it can translate don't always map one-to-one. Deferred execution is what makes query composition cheap, but it also means you have to track when the query fires — because if it fires outside the library's pipeline, nobody's watching.

ICosmosQueryLogSink is the answer to "what happened?" It captures every query as structured data — method name, SQL text, parameters, partition key, RU charge, duration, cross-partition flag, and index utilization — and gives you a single place to look when something is slow, expensive, or wrong. Without it, I'd never have discovered that my "better" hand-written SQL was costing twice the RUs.

Next up: Chapter 12, where we add vector search to the same container. The cocktail app gets natural-language search — "something dark and smoky with mezcal" — and the whole AI integration costs about a penny. Same partition layout. Same query patterns. One new concept: embeddings. That's the chapter you demo at conferences.

Categories: cosmos-db-for-dotnet-developers

Tags: cosmosdb azure csharp linq query-performance observability

Azure Cosmos DB for .NET Developers book cover

Cosmos DB for .NET Developers: The Book