Using SelectMany on IQueryable vs IEnumerable
I wanted to run a small experiment to compare the cost of two approaches: processing the data in CosmosDB and returning only the result, vs. reading the whole documents and processing them in-memory.
Suppose you have a collection of States in CosmosDB. Each State is one document. The State has some info like Capital, Chief Minister and a list of cities. One state will have multiple cities under it. The requirement is to fetch the list of all cities in our collection.
Scenario 1
Created an IQueryable over all documents in the container, applied SelectMany on the queryable, and fed the projected result to a FeedIterator:
    var states = container.GetItemLinqQueryable<State>();
    var iterator = states.SelectMany(s => s.Cities).ToFeedIterator();
    while (iterator.HasMoreResults)
    {
        var cityList = await iterator.ReadNextAsync();
        RUs += cityList.RequestCharge;
        foreach (var city in cityList)
        {
            response.Add(city);
        }
    }
Result:
Scenario 2
Read all State documents from the collection into memory, then applied SelectMany for the cities in-memory.
    public async Task<List<State>> GetAllItemsAsync()
    {
        var response = new List<State>();
        var iterator = container.GetItemLinqQueryable<State>().ToFeedIterator();
        while (iterator.HasMoreResults)
        {
            var states = await iterator.ReadNextAsync();
            RUs += states.RequestCharge;
            foreach (var state in states)
            {
                response.Add(state);
            }
        }
        return response;
    }

    // Flatten the cities in-memory, after all documents have been fetched.
    var states = await GetAllItemsAsync();
    var cities = states.SelectMany(w => w.Cities).ToList();
Result:
This, understandably, consumes far fewer RUs because CosmosDB has no processing to do.
In this scenario, is it better to perform SelectMany on the IEnumerable instead of having CosmosDB do the processing? Is avoiding unnecessary data transfer the only reason not to fetch all the documents?
Scenario 3
I tried another method, where I fetched the list of cities from each document and then merged the lists:
    var cityLists = container.GetItemLinqQueryable<State>()
        .Select(w => w.Cities);
    var iterator = cityLists.ToFeedIterator();
    while (iterator.HasMoreResults)
    {
        var citiesList = await iterator.ReadNextAsync();
        RUs += citiesList.RequestCharge;
        // Each item is one state's list of cities; merge it into the response.
        foreach (var stateCities in citiesList)
        {
            response.AddRange(stateCities);
        }
    }
Result:
This operation consumed the most RUs. Can someone explain this?
I would assume that SelectMany would require more processing than simply selecting the value of the property from each document.
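To make the difference in result shape concrete, here is a minimal in-memory sketch using LINQ to Objects. The `State` and `City` records, the sample data, and the helper names are assumptions for illustration, since the question does not show the real model classes. It only shows that SelectMany returns one flat sequence of cities, while Select returns one list per state that the client then has to merge. It says nothing about how CosmosDB charges RUs for either query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model types; the real State/City classes are not shown in the question.
public record City(string Name);
public record State(string Name, List<City> Cities);

public static class Program
{
    // Made-up sample data standing in for the documents in the container.
    public static List<State> SampleStates() => new List<State>
    {
        new State("A", new List<City> { new City("A1"), new City("A2") }),
        new State("B", new List<City> { new City("B1") }),
    };

    // Shape used in Scenarios 1 and 2: SelectMany yields one flat sequence of cities.
    public static List<City> FlattenWithSelectMany(IEnumerable<State> states) =>
        states.SelectMany(s => s.Cities).ToList();

    // Shape used in Scenario 3: Select yields one list per state, merged by the caller.
    public static List<City> FlattenWithSelect(IEnumerable<State> states)
    {
        var result = new List<City>();
        foreach (var stateCities in states.Select(s => s.Cities))
        {
            result.AddRange(stateCities);
        }
        return result;
    }

    public static void Main()
    {
        var states = SampleStates();
        var viaSelectMany = FlattenWithSelectMany(states);
        var viaSelect = FlattenWithSelect(states);

        // Both approaches end with the same cities in the same order.
        Console.WriteLine(viaSelectMany.SequenceEqual(viaSelect));
        Console.WriteLine(string.Join(", ", viaSelectMany.Select(c => c.Name)));
    }
}
```

Both helpers return the same list, so any difference between the scenarios comes from where the flattening runs and how much data crosses the wire, not from the final result.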
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow