Avoiding In-Memory Aggregation and Overfetching in MongoDB with Prisma
How we stopped loading entire collections into Node.js and let the database do its job.
Most backend engineers learn early about the N+1 query problem.
One query loads a list. Then N additional queries fetch related data.
It's inefficient and easy to detect in query logs.
But in many MongoDB applications using Prisma, a different anti-pattern is often more damaging and more common:
Overfetching combined with in-memory aggregation.
Instead of letting the database aggregate data, the application loads thousands of records into Node.js and processes them with JavaScript.
At small scale it works fine.
At a few thousand records it becomes a bottleneck.
The pattern usually looks like this:
```typescript
const orders = await prisma.order.findMany({ where: { eventId } })

const confirmed = orders.filter(o => o.status === "CONFIRMED").length
const pending = orders.filter(o => o.status === "PENDING").length
```
This means:
- MongoDB sends every document over the network
- Node.js loads them into memory
- JavaScript loops over them multiple times
All to compute something the database can do instantly.
The Problem: A Real Dashboard
Our event management platform is built with:
- Next.js 15
- tRPC
- Prisma 6
- MongoDB
The admin dashboard shows metrics like:
- total registrations
- pending payments
- check-ins
- revenue breakdown
The original implementation looked like this:
```typescript
// ❌ BEFORE — loads ALL orders into memory
const allOrders = await ctx.db.order.findMany({
  where: { eventId: input.eventId },
  include: { ticketType: true, coupon: true },
})

const totalConfirmed = allOrders.filter(o => o.status === "CONFIRMED").length
const totalPending = allOrders.filter(o => o.status === "PENDING").length
const totalCancelled = allOrders.filter(o => o.status === "CANCELLED").length
const totalRefunded = allOrders.filter(o => o.status === "REFUNDED").length

let revenue = 0
for (const order of allOrders) {
  if (order.status === "CONFIRMED") {
    revenue += order.organizerReceivedCents
  }
}
```
This worked fine during development.
But once events reached thousands of orders, the dashboard became slow.
Why This Is Expensive
1. Memory pressure
An event with 10,000 orders loads 10,000 documents into Node.js memory.
Most of those fields are never used.
2. Network overhead
MongoDB sends documents serialized as BSON.
Even if each document is ~1 KB:
10,000 orders ≈ 10 MB transferred
All that data travels over the network just to compute a few counts.
3. CPU usage in Node.js
Each `.filter()` scans the entire array.

```typescript
orders.filter(...) // pass 1
orders.filter(...) // pass 2
orders.filter(...) // pass 3
```

That means multiple full passes over the dataset.
4. Poor index utilization
MongoDB indexes are optimized for query filtering and counting.
But when you load every document and aggregate in JavaScript, the database cannot leverage those indexes effectively.
Why This Happens More Often With MongoDB
MongoDB does support aggregation pipelines, including $group.
Example:
```javascript
db.orders.aggregate([
  { $match: { eventId } },
  {
    $group: {
      _id: "$status",
      total: { $sum: 1 }
    }
  }
])
```
However, Prisma's MongoDB adapter does not expose a full groupBy API like the SQL connectors.
Because of this limitation, developers often default to:
`findMany()` → aggregate in JavaScript
But Prisma does support count() queries, which MongoDB can execute very efficiently — especially with indexes.
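When a grouped breakdown is genuinely needed, Prisma's MongoDB connector also exposes `aggregateRaw` as an escape hatch for running the pipeline above directly. A sketch — the `statusCountPipeline` helper is illustrative, and it assumes `eventId` is stored as an ObjectId under `event_id` (matching the `@map` used later in the schema):

```typescript
// Builds the status-count pipeline from the aggregation example above.
// Assumption: eventId is an ObjectId stored under "event_id"; adjust the
// field name and type to your actual schema.
function statusCountPipeline(eventId: string) {
  return [
    { $match: { event_id: { $oid: eventId } } },
    { $group: { _id: "$status", total: { $sum: 1 } } },
  ];
}

// Usage (assumes a Prisma client instance `prisma`):
// const rows = await prisma.order.aggregateRaw({
//   pipeline: statusCountPipeline(eventId),
// });
```

This returns one document per status in a single round trip, at the cost of working with untyped JSON results.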
The Solution: Database-Level Counting
Instead of loading all documents, let the database count them.
```typescript
// ✅ AFTER — parallel count queries
const [
  totalConfirmed,
  totalPending,
  totalCancelled,
  totalRefunded,
] = await Promise.all([
  ctx.db.order.count({
    where: { eventId: input.eventId, status: "CONFIRMED" },
  }),
  ctx.db.order.count({
    where: { eventId: input.eventId, status: "PENDING" },
  }),
  ctx.db.order.count({
    where: { eventId: input.eventId, status: "CANCELLED" },
  }),
  ctx.db.order.count({
    where: { eventId: input.eventId, status: "REFUNDED" },
  }),
])
```
Each query returns a single integer.
Instead of transferring megabytes of documents, the response is just a few bytes.
With Promise.all, the queries run concurrently.
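The four near-identical queries can also be folded into a map over the statuses. A sketch, where the `count` callback stands in for `ctx.db.order.count` so the snippet is self-contained:

```typescript
const STATUSES = ["CONFIRMED", "PENDING", "CANCELLED", "REFUNDED"] as const;

type OrderStatus = (typeof STATUSES)[number];
type CountFn = (args: {
  where: { eventId: string; status: OrderStatus };
}) => Promise<number>;

// Runs one indexed count per status concurrently and returns a record
// like { CONFIRMED: 182, PENDING: 21, ... }. In real code, pass a bound
// ctx.db.order.count as `count`.
async function countByStatus(count: CountFn, eventId: string) {
  const totals = await Promise.all(
    STATUSES.map((status) => count({ where: { eventId, status } })),
  );
  return Object.fromEntries(
    STATUSES.map((s, i) => [s, totals[i]] as const),
  ) as Record<OrderStatus, number>;
}
```

Adding a status then means touching one array instead of copy-pasting another query.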
Why This Is Faster
With a compound index like:
`[eventId, status]`
MongoDB can satisfy the query directly from the index, without scanning the full collection.
This dramatically reduces:
- disk access
- CPU usage
- network payload
Adding the Right Indexes
Indexes must match the query pattern.
Example Prisma schema:
```prisma
model Order {
  id        String   @id @default(auto()) @map("_id") @db.ObjectId
  eventId   String   @db.ObjectId @map("event_id")
  status    String
  createdAt DateTime @default(now())

  @@index([eventId, status])
  @@index([status, createdAt])
  @@index([createdAt])
}
```
Apply with:

```bash
npx prisma db push
```

For MongoDB, Prisma uses `db push` rather than generated migrations, since Prisma Migrate does not support the MongoDB connector.
Pattern: Filtering in the Database
Another common anti-pattern is filtering in JavaScript.
❌ Bad

```typescript
const orders = await prisma.order.findMany()
const confirmed = orders.filter(o => o.status === "CONFIRMED")
```

✅ Better

```typescript
const confirmed = await prisma.order.findMany({
  where: {
    status: "CONFIRMED"
  }
})
```
This allows MongoDB to use indexes.
Pattern: Pagination in the Database
Loading everything and paginating in memory is another performance killer.
❌ Bad

```typescript
const orders = await prisma.order.findMany()
const page = orders.slice(skip, skip + take)
```

✅ Better

```typescript
const orders = await prisma.order.findMany({
  skip,
  take,
  orderBy: { createdAt: "desc" }
})
```
MongoDB now scans only the required range.
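A small helper makes the offset arithmetic explicit. This is a sketch under one assumption: pages are 1-based, with a default `pageSize` of 20 chosen for illustration:

```typescript
// Derives Prisma's skip/take from a 1-based page number.
// Clamps the page so a bad input can never produce a negative skip.
function pageToSkipTake(page: number, pageSize = 20) {
  const safePage = Math.max(1, Math.floor(page));
  return { skip: (safePage - 1) * pageSize, take: pageSize };
}

// Usage:
// const { skip, take } = pageToSkipTake(3);
// await prisma.order.findMany({ skip, take, orderBy: { createdAt: "desc" } });
```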
When findMany Is Still Necessary
Sometimes you do need real data, not just counts.
Example: revenue calculations.
The key is minimizing what you load.
```typescript
const orders = await prisma.order.findMany({
  where: {
    eventId,
    status: { in: ["CONFIRMED", "PENDING"] }
  },
  select: {
    status: true,
    organizerReceivedCents: true
  }
})

let revenue = 0
for (const order of orders) {
  if (order.status === "CONFIRMED") {
    revenue += order.organizerReceivedCents
  }
}
```
Key improvements:
- filter unwanted statuses in the database
- use `select` instead of `include`
- aggregate in a single pass
Beyond Query Optimization: Precomputed Stats
At larger scale, even efficient count() queries can become expensive when dashboards are loaded frequently.
Many high-scale systems solve this using precomputed counters.
Example document:
```json
{
  "eventId": "evt_123",
  "confirmedOrders": 182,
  "pendingOrders": 21,
  "revenue": 21840
}
```
Whenever an order changes status:
- order.confirmed → increment confirmedOrders
- order.cancelled → decrement confirmedOrders
This approach allows dashboards to load with a single read.
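The transition logic can be isolated in a pure function that computes the counter deltas, then applied in one atomic update. A sketch — the field names follow the example stats document above, while the `EventStats` model in the usage comment is hypothetical:

```typescript
type StatsDelta = { confirmedOrders: number; pendingOrders: number; revenue: number };

// Computes the counter adjustments implied by an order moving from one
// status to another. amountCents is what the order contributes to revenue.
function deltaFor(from: string, to: string, amountCents: number): StatsDelta {
  const d: StatsDelta = { confirmedOrders: 0, pendingOrders: 0, revenue: 0 };
  if (from === "PENDING") d.pendingOrders -= 1;
  if (to === "PENDING") d.pendingOrders += 1;
  if (from === "CONFIRMED") { d.confirmedOrders -= 1; d.revenue -= amountCents; }
  if (to === "CONFIRMED") { d.confirmedOrders += 1; d.revenue += amountCents; }
  return d;
}

// Applying it (hypothetical EventStats model; increment is atomic):
// const d = deltaFor(oldStatus, newStatus, order.organizerReceivedCents);
// await prisma.eventStats.update({
//   where: { eventId },
//   data: {
//     confirmedOrders: { increment: d.confirmedOrders },
//     pendingOrders: { increment: d.pendingOrders },
//     revenue: { increment: d.revenue },
//   },
// });
```

Keeping the delta calculation pure makes the edge cases (refund of a confirmed order, re-confirmation) easy to unit-test without a database.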
Quick Checklist
Before shipping any query, ask yourself:
- Am I loading records just to count them?
- Am I filtering in JavaScript?
- Am I paginating with `.slice()`?
- Am I using `include` when `select` would suffice?
- Do my indexes match my query patterns?
- Can multiple queries run in parallel with `Promise.all`?
Results
After applying these patterns:
| Metric | Improvement |
|---|---|
| Dashboard load time | ~2s → ~200ms |
| Memory usage | -60% |
| Full collection scans | eliminated |
Overfetching rarely appears as an obvious bug.
Your logs show a single query.
Your database looks healthy.
Meanwhile, Node.js is quietly doing the database's job —
one `.filter()` at a time.
The fix is simple:
Let the database do what it was built for.