Turbocharging MCP: Speed, Smarts, and Scale by Viraj Sharma

ScyllaDB · 16 slides · Oct 16, 2025

About This Presentation

Learn how to speed up Model Context Protocol (MCP) tools using async servers, caching, batching, and smart data handling, making your AI tool calls faster, smoother, and more efficient.


Slide Content

A ScyllaDB Community
Turbocharging MCP:
Speed, Smarts, and Scale
Viraj Sharma
Student

Viraj Sharma (he/him)

Student at Presidium School Indirapuram, Delhi, India
■ Spoke about MCP at Python conferences (Delhi, Lithuania, LinuxFest)
■ Awesome gathering of performance experts
■ Interested in Indian mythology, science, and poetry
■ Usually coding or brainstorming ideas with my father

Primer: Model Context Protocol

Primer: What is MCP (Anthropic)
A standard way for language models to interact with tools, APIs, and live data.
■ Core Mechanism:
● Based on JSON-RPC 2.0
■ How It Works:
● Model → MCP → Tool → Response
■ Why It Matters:
● Brings live, external capabilities directly into model conversations.

Why Performance Matters in MCP
Performance = a better, more accurate LLM user experience
■ Real-Time Needs:
● Users expect instant responses
■ Impact of Slowness:
● Slow tool calls increase server costs and infrastructure strain.
■ Goal:
● Make MCP interactions faster, lighter, and more reliable without major code changes

MCP Performance Strategies

Elicitation-Driven Filtering
Additional filtering before execution using elicitation

■ Additional filtering can be applied before execution
■ The amount of data transferred can be reduced
■ Response times can be improved

{
  "type": "elicitation",
  "parameters": [
    { "name": "dateRange", "value": "2025-01-01 to 2025-01-31" },
    { "name": "region", "value": "APAC" },
    { "name": "metrics", "value": ["sales", "conversionRate"] }
  ]
}
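As a rough sketch of the idea above, a server can apply the elicited parameters as filters before executing the tool, so only matching rows ever leave the data source. The record layout and helper name here are illustrative assumptions, not part of any MCP SDK.

```python
from datetime import date

# Hypothetical records a tool would otherwise return in full.
RECORDS = [
    {"date": date(2025, 1, 15), "region": "APAC", "sales": 120},
    {"date": date(2025, 1, 20), "region": "EMEA", "sales": 300},
    {"date": date(2025, 2, 3),  "region": "APAC", "sales": 80},
]

def apply_elicited_filters(records, params):
    """Filter records using elicited parameters before execution,
    so only matching rows are transferred back to the model."""
    filters = {p["name"]: p["value"] for p in params}
    start_s, end_s = filters["dateRange"].split(" to ")
    start, end = date.fromisoformat(start_s), date.fromisoformat(end_s)
    return [
        r for r in records
        if start <= r["date"] <= end and r["region"] == filters["region"]
    ]

# Parameters matching the elicitation payload above.
params = [
    {"name": "dateRange", "value": "2025-01-01 to 2025-01-31"},
    {"name": "region", "value": "APAC"},
]
filtered = apply_elicited_filters(RECORDS, params)
```

Only one of the three records survives the filter, so two-thirds of the payload never crosses the wire.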

Smart Tool Discovery & Updates
Data boundaries and parameters can be revealed during the tool discovery phase itself

■ Data boundaries can be revealed upfront, avoiding null or empty responses
■ Data ranges can be provided
■ Discovery can support partial updates




{
  "type": "tool/discovery",
  "tool": "DataAnalyzer",
  "metadata": {
    "data_boundaries": {
      "start": "2025-01-01T00:00:00Z",
      "end": "2025-08-12T00:00:00Z"
    },
    "data_ranges": [
      { "min": 1000, "max": 5000 }
    ]
  }
}
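A minimal sketch of how a client might use the advertised boundaries: check a requested window against `data_boundaries` before sending the call, rejecting queries that are guaranteed to come back empty. The function names are illustrative, not SDK APIs.

```python
from datetime import datetime

# data_boundaries as advertised in the tool/discovery metadata above.
BOUNDARIES = {
    "start": "2025-01-01T00:00:00Z",
    "end": "2025-08-12T00:00:00Z",
}

def parse_ts(ts):
    # Older fromisoformat versions reject a trailing 'Z'; normalize it.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

def within_boundaries(requested_start, requested_end, boundaries=BOUNDARIES):
    """Reject a query client-side if it falls outside the advertised
    data boundaries, avoiding a round trip for an empty response."""
    lo, hi = parse_ts(boundaries["start"]), parse_ts(boundaries["end"])
    return lo <= parse_ts(requested_start) and parse_ts(requested_end) <= hi

ok = within_boundaries("2025-03-01T00:00:00Z", "2025-04-01T00:00:00Z")
bad = within_boundaries("2024-06-01T00:00:00Z", "2024-07-01T00:00:00Z")
```

The second request is entirely before the advertised start, so it can be refused locally without touching the server.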

Predefined Prompt Optimization
Using the prompts feature, predefined optimized queries and URLs can be set up

■ Uses consistent query patterns
■ Triggers correct processing paths
■ Avoids redundant interpretation steps


{
  "type": "request",
  "method": "prompt",
  "params": {
    "message": "Summarize the latest performance metrics",
    "maxTokens": 200,
    "temperature": 0.7
  },
  "id": 42
}
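One way to realize this, sketched here with a hypothetical in-memory registry: store the optimized prompt templates once and expand them by name, so every call uses the same query pattern and skips re-interpreting free-form text.

```python
# Hypothetical registry of predefined, pre-optimized prompts. Serving a
# stored template keeps query patterns consistent across calls and
# avoids redundant interpretation of free-form prompt text.
PROMPTS = {
    "summarize_metrics": {
        "message": "Summarize the latest performance metrics",
        "maxTokens": 200,
        "temperature": 0.7,
    },
}

def build_prompt_request(name, request_id):
    """Expand a predefined prompt into a full request payload."""
    if name not in PROMPTS:
        raise KeyError(f"unknown prompt: {name}")
    return {
        "type": "request",
        "method": "prompt",
        "params": dict(PROMPTS[name]),  # copy so callers can't mutate the template
        "id": request_id,
    }

req = build_prompt_request("summarize_metrics", 42)
```

`req` reproduces the JSON request shown above, built from the stored template rather than parsed fresh each time.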

Resource-Aware Context Selection
Use resources and resource templates to expose only what is needed

■ Only the needed parts of a resource are fetched (partial retrieval)
■ Subscriptions reactively load resources instead of polling
■ Lazy loading reduces the memory footprint until data is actually required



"resources": [
  {
    "uri": "file:///project/src/main.rs",
    "name": "main.rs",
    "description": "Primary application entry point",
    "mimeType": "text/x-rust"
  }
],
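The lazy-loading and partial-retrieval points above can be sketched as a small wrapper class (an illustrative assumption, not an MCP SDK type): the resource body is only fetched on first access, and callers can ask for just a slice of it.

```python
class LazyResource:
    """Defer fetching a resource's contents until first access, and
    support partial retrieval so only the needed slice is returned."""

    def __init__(self, uri, loader):
        self.uri = uri
        self._loader = loader   # called only when data is first needed
        self._data = None

    def read(self, start=0, end=None):
        if self._data is None:  # lazy load on first access
            self._data = self._loader(self.uri)
        return self._data[start:end]

calls = []
def fake_loader(uri):
    calls.append(uri)           # record how many real fetches happen
    return 'fn main() { println!("hello"); }'

res = LazyResource("file:///project/src/main.rs", fake_loader)
first_chunk = res.read(0, 7)    # triggers the single real fetch
res.read()                      # served from cache, no second fetch
```

Two reads, one real fetch: memory and I/O are only spent once the data is actually required.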

Clean Lifecycle Shutdown
A graceful termination where the client sends a shutdown request

■ Connections can be closed gracefully
■ Memory and CPU can be freed
■ Performance degradation from leaked connection handles and HTTP resources can be avoided



{
  "type": "notification",
  "method": "shutdownInitiated",
  "params": {
    "reason": "maintenance_window",
    "timestamp": "2025-08-12T08:10:00Z"
  }
}
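A toy sketch of the shutdown path, assuming a hypothetical server that tracks its open connections: on a `shutdownInitiated` notification, every tracked connection is closed and released instead of being leaked.

```python
class Connection:
    """Stand-in for a client connection holding a socket handle."""
    def __init__(self, name):
        self.name, self.closed = name, False
    def close(self):
        self.closed = True

class McpServer:
    """Track open connections so a shutdown notification can release
    them all, rather than leaking handles until the process dies."""
    def __init__(self):
        self.connections = []

    def connect(self, name):
        conn = Connection(name)
        self.connections.append(conn)
        return conn

    def handle_notification(self, msg):
        if msg.get("method") == "shutdownInitiated":
            for conn in self.connections:
                conn.close()        # free sockets and buffers gracefully
            self.connections.clear()

server = McpServer()
a = server.connect("client-a")
b = server.connect("client-b")
server.handle_notification({
    "method": "shutdownInitiated",
    "params": {"reason": "maintenance_window"},
})
```

After the notification both connections are closed and the tracking list is empty, so nothing lingers to degrade later requests.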

Chunked Data via Streamable HTTP
Using chunked data in responses with the streamable HTTP feature

■ Data can be sent as segments
■ Memory usage spikes can be reduced
■ Perceived response times can be lowered


HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked

4
{"pa
7
rt": 1}
5
{"mor
9
e": "da"}
0
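The chunk framing above can be demonstrated with a minimal encoder/decoder pair: each chunk is a hex length, CRLF, the data, CRLF, and a zero-length chunk terminates the stream. This is a from-scratch illustration of the wire format, not a library API.

```python
def encode_chunked(payload, chunk_size=4):
    """Encode a payload with HTTP/1.1 chunked transfer encoding:
    '<hex length>\\r\\n<data>\\r\\n' per chunk, ended by a zero chunk."""
    out = []
    for i in range(0, len(payload), chunk_size):
        piece = payload[i:i + chunk_size]
        out.append(f"{len(piece):x}\r\n{piece}\r\n")
    out.append("0\r\n\r\n")            # terminating zero-length chunk
    return "".join(out)

def decode_chunked(stream):
    """Reassemble the original body from a chunked stream."""
    body, rest = [], stream
    while True:
        size_line, rest = rest.split("\r\n", 1)
        size = int(size_line, 16)      # chunk sizes are hexadecimal
        if size == 0:
            break
        body.append(rest[:size])
        rest = rest[size + 2:]         # skip the data and trailing CRLF
    return "".join(body)

encoded = encode_chunked('{"part": 1}')
round_trip = decode_chunked(encoded)
```

The receiver can start parsing each segment as it arrives instead of waiting for the whole body, which is what lowers the perceived response time.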

Cancellation-Driven Resource Reclamation
In-flight data access / user operation cancellation

■ Connections can be closed gracefully
■ Memory and CPU can be freed
■ Performance degradation from excessive wasteful workload on data sources can be avoided

{
  "jsonrpc": "2.0",
  "method": "notifications/cancelled",
  "params": {
    "requestId": "123",
    "reason": "User requested cancellation"
  }
}
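A sketch of how a server might honor that notification, using asyncio task cancellation to reclaim the worker (the request-tracking dict and task names are illustrative assumptions):

```python
import asyncio

async def expensive_query():
    """Stand-in for a long-running data-source operation."""
    await asyncio.sleep(10)
    return "result"

async def main():
    in_flight = {}                          # requestId -> running task
    in_flight["123"] = asyncio.create_task(expensive_query())
    await asyncio.sleep(0)                  # let the task start

    # On receiving notifications/cancelled for requestId "123":
    task = in_flight.pop("123")
    task.cancel()                           # stop the wasted work now
    try:
        await task
    except asyncio.CancelledError:
        pass                                # CPU and memory reclaimed
    return task.cancelled()

cancelled = asyncio.run(main())
```

The ten-second query is abandoned almost immediately, so the data source never pays for work whose result nobody will read.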

MCP Ecosystem at a Glance

Metric                         Statistic
SDK Downloads (weekly)         > 8 million
MCP Servers                    > 10,000
MCP Tools Available            527
New Servers Added (weekly)     ~800
Major Adopters                 OpenAI, Replit, DeepMind, etc.

Thank you! Let’s connect.
Viraj Sharma
[email protected]
@virajsharma2000
sharmaviraj.com