AI Agentic Systems and Model Context Protocol (MCP): Architecture, Workflows, and Examples
#ai#agents#MCP#LLM#tools#architecture#devops
AI agentic systems use language models plus tools and a control loop to reason and act in the world. The Model Context Protocol (MCP) is a standard way for AI applications to connect to external tools and data. This document covers both: agentic architecture and workflows (ReAct, plan-and-execute), then MCP’s architecture, data flow, and practical examples.
Part 1: AI Agentic Systems
What is agentic AI?
Agentic AI refers to systems where an LLM (or other model) drives a loop: it receives context (user input, history, tool results), reasons about what to do, chooses actions (often tool calls), and observes results before deciding the next step. The agent continues until it reaches a final answer or a stopping condition.
Key traits:
- Autonomy: The model decides which actions to take, not a fixed script.
- Tool use: Actions are often calls to external tools (APIs, code, search, databases).
- Loop: Reason → Act → Observe → repeat.
- State: Conversation history and tool results form the state the model conditions on.
High-level architecture
+------------------+
| User / Task |
| (goal, query) |
+--------+---------+
|
v
+----------------------------------------------------------------------+
| AGENTIC CONTROL LOOP |
| +-------------+ +-------------+ +-------------+ |
| | Memory | | LLM | | Tool | |
| | (history, |<-->| (reason + |--->| Registry | |
| | context) | | decide) | | (tools) | |
| +-------------+ +-------------+ +-------------+ |
| | | |
| | tool call | execute |
| v v |
| +-------------+ +-------------+ |
| | Observation |<---| Tool / Env | |
| | (result) | | (API, code)| |
| +-------------+ +-------------+ |
| | |
| +---------> back to LLM (next turn) |
+----------------------------------------------------------------------+
| Component | Role |
|---|---|
| LLM | Receives prompt (task + history + tool results), outputs reasoning and/or tool calls. |
| Memory | Stores conversation and tool outputs so the agent has context across turns. |
| Tool registry | List of callable tools with names, descriptions, and input schemas. |
| Tool execution | Runs the chosen tool (API, code, search) and returns the result. |
| Observation | Result of the tool call (or environment) fed back into the next LLM turn. |
Agentic workflows
ReAct (Reason + Act)
ReAct interleaves reasoning (thought) and action (tool use). Each step is either a Thought, an Action, or an Observation. The loop continues until the model outputs a final answer.
Workflow:
- Thought: The model reasons in natural language (e.g. “I need to get the current weather for Paris”).
- Action: The model outputs a structured tool call (e.g.
weather(location="Paris")). - Observation: The environment returns the tool result (e.g. “Paris: 18°C, cloudy”).
- Repeat from step 1 until the model decides it can answer the user.
ReAct loop (one turn)
+------------------------------------------------------------+
| Turn N |
| ---------------------------------------------------------- |
| Thought: "User asked for weather in Paris. I'll call |
| the weather tool." |
| Action: weather_current(location="Paris", units="metric") |
| Observation: "Paris: 18°C, cloudy, wind 12 km/h" |
+------------------------------------------------------------+
|
v
+------------------------------------------------------------+
| Turn N+1 |
| Thought: "I have the data. I can summarize for the user." |
| Answer: "In Paris it's currently 18°C and cloudy." |
| [FINISH] |
+------------------------------------------------------------+
When to use ReAct: Exploratory tasks, when the next step depends on previous results, or when you need to retry or branch (e.g. search then read, then maybe search again).
Plan-and-Execute
In plan-and-execute, the model first produces a plan (ordered list of steps), then a separate execution phase runs those steps (often with an LLM or code per step). Planning and execution are decoupled.
Workflow:
- Plan: Given the user goal, the planner LLM outputs a step-by-step plan (e.g. “1. Search for X, 2. Read top result, 3. Summarize”).
- Execute: For each step, the executor runs the corresponding action (tool or sub-task) and records the result.
- Finish: When all steps are done, combine results into a final answer (optionally with a final LLM pass).
+----------------+ +----------------------------------------+
| User goal | | Planner (LLM) |
| "Compare |---->| Plan: |
| product A | | 1. search("product A specs") |
| and B" | | 2. search("product B specs") |
| | | 3. compare(specs_A, specs_B) |
+----------------+ | 4. format comparison for user |
+------------------+---------------------+
|
v
+----------------+ +----------------------------------------+
| Final answer | | Executor (per step) |
| "Product A |<----| Step 1 -> tool -> result_1 |
| has X, B | | Step 2 -> tool -> result_2 |
| has Y..." | | Step 3 -> compare -> result_3 |
+----------------+ | Step 4 -> LLM(result_3) -> answer |
+----------------------------------------+
When to use plan-and-execute: More predictable, multi-step workflows; can use a smaller/cheaper model for planning and a stronger one for execution or summarization.
ReAct vs plan-and-execute
| Aspect | ReAct | Plan-and-Execute |
|---|---|---|
| Flow | Interleave thought and action every turn. | Plan once, then execute steps. |
| Adaptivity | Can change plan after each observation. | Plan is fixed unless you re-plan. |
| Cost | More LLM calls (one or more per step). | Fewer planner calls; executor can be cheaper. |
| Best for | Exploratory, dynamic, or ambiguous tasks. | Structured, predictable pipelines. |
Agentic example: ReAct-style loop
Simplified flow for a “search and summarize” agent:
User: "What's the latest version of Kubernetes?"
Step 1 (Thought): I need to search for the latest Kubernetes version.
Step 1 (Action): search(query="Kubernetes latest version 2024")
Step 1 (Observation): "Kubernetes 1.31 released 2024-08-..."
Step 2 (Thought): I found it. I'll give a short answer.
Step 2 (Answer): "The latest stable version is Kubernetes 1.31 (released August 2024)."
[HALT]
Pseudocode for the control loop:
function runAgent(userQuery):
messages = [systemPrompt, userMessage(userQuery)]
while true:
response = llm(messages)
if response has "Answer:" or "Final answer:":
return extractAnswer(response)
if response has tool call (name, args):
result = executeTool(name, args)
messages.append(assistantMessage(response))
messages.append(userMessage("Observation: " + result))
else:
messages.append(assistantMessage(response))
The system prompt typically includes: (1) description of the ReAct format (Thought / Action / Observation / Answer), (2) list of available tools with names and schemas, (3) rules (e.g. one tool call per turn, always reason first).
Part 2: Model Context Protocol (MCP)
What is MCP?
The Model Context Protocol (MCP) is a standard protocol for exchanging context between AI applications and external servers. It defines how tools, resources, and prompts are discovered, read, and invoked over a client–server connection, so that LLMs can use external capabilities in a uniform way.
Goals:
- Standard interface: Same protocol for filesystem, databases, APIs, custom tools.
- Composability: One host (e.g. an IDE or chatbot) connects to many MCP servers.
- Separation: Servers are independent; the host aggregates their tools and context.
MCP architecture: participants
MCP uses three roles:
| Participant | Role |
|---|---|
| MCP Host | The AI application (e.g. Claude Desktop, VS Code, your app). Coordinates clients and uses aggregated context/tools with the LLM. |
| MCP Client | Created by the host; maintains one connection to one MCP server. Discovers and invokes that server’s primitives. |
| MCP Server | Program that exposes tools, resources, and/or prompts. Can run locally (e.g. STDIO) or remotely (e.g. HTTP). |
One host has many clients; each client talks to one server. A remote server can serve many clients; a local STDIO server usually serves one client.
+------------------------------------------+
| MCP HOST (AI app) |
| (Claude Desktop, VS Code, your agent) |
| |
| +-----------+ +-----------+ +----------+|
| | MCP | | MCP | | MCP ||
| | Client 1 | | Client 2 | | Client 3 ||
| +-----+-----+ +-----+-----+ +----+-----+|
+--------|------------|------------|-------+
| | |
dedicated | | | dedicated
connection | | | connection
v v v
+-------------+ +------------+ +-------------+
| MCP | | MCP | | MCP |
| Server A | | Server B | | Server C |
| (filesystem)| |(database) | | (remote API)|
| local | | local | | remote |
+-------------+ +------------+ +-------------+
MCP layers
MCP has two layers:
-
Transport layer
How bytes are sent: STDIO (local process) or Streamable HTTP (remote). Handles connection, framing, and auth. -
Data layer
JSON-RPC 2.0 over that transport. Defines lifecycle (initialize, capabilities), primitives (tools, resources, prompts), and notifications (e.g. tools/list_changed).
+----------------------------------------------------------+
| DATA LAYER (JSON-RPC 2.0) |
| - initialize / capabilities |
| - tools/list, tools/call |
| - resources/list, resources/read |
| - prompts/list, prompts/get |
| - notifications (e.g. tools/list_changed) |
+----------------------------------------------------------+
+----------------------------------------------------------+
| TRANSPORT LAYER |
| - STDIO (local process) |
| - Streamable HTTP (remote, auth, SSE) |
+----------------------------------------------------------+
MCP primitives (what servers expose)
| Primitive | Purpose | Typical use |
|---|---|---|
| Tools | Callable functions the model can invoke. | Run code, call APIs, query DB, run CLI. |
| Resources | Read-only or dynamic data (files, DB views, API responses). | Inject file contents, schema, config into context. |
| Prompts | Reusable prompt templates (e.g. system prompts, few-shot). | Standardize how the host prompts the LLM. |
Clients discover via */list and use via */get or tools/call. Servers can send notifications (e.g. “my tool list changed”) so the host refreshes.
MCP workflow: lifecycle and tool use
1. Initialization (lifecycle)
Client connects and sends initialize with protocol version and capabilities; server responds with its capabilities (e.g. tools, resources). Then client sends notifications/initialized. After that, the session is ready.
Client Server
| |
| initialize(protocolVersion, |
| capabilities, clientInfo) |
|--------------------------------->|
| result(serverInfo, capabilities)|
|<---------------------------------|
| notifications/initialized |
|--------------------------------->|
| (session ready) |
2. Tool discovery and execution
Host asks each server what tools it has, then the LLM can call them.
Discover: Client sends tools/list; server returns a list of tools with name, description, and inputSchema (JSON Schema).
Execute: When the model chooses a tool, the host sends tools/call with name and arguments. Server runs the tool and returns content (e.g. text or structured data).
Host / LLM MCP Client MCP Server
| | |
| "User wants weather" | |
|------------------------>| |
| | tools/list |
| |------------------------>|
| | { tools: [ {...} ] } |
| |<------------------------|
| LLM picks tool + args | |
|<------------------------| |
| | tools/call(name, args) |
| |------------------------>|
| | result(content) |
| |<------------------------|
| observation for next | |
| LLM turn | |
|<------------------------| |
3. Notifications
Server can push notifications/tools/list_changed (and similar for other primitives). The client then re-fetches tools/list so the host’s tool registry stays up to date.
MCP data layer example (JSON-RPC)
Initialize (client → server)
{
"jsonrpc": "2.0",
"id": 1,
"method": "initialize",
"params": {
"protocolVersion": "2024-11-05",
"capabilities": { "elicitation": {} },
"clientInfo": { "name": "my-ai-app", "version": "1.0.0" }
}
}
Server response (capabilities)
{
"jsonrpc": "2.0",
"id": 1,
"result": {
"protocolVersion": "2024-11-05",
"capabilities": {
"tools": { "listChanged": true },
"resources": {}
},
"serverInfo": { "name": "weather-server", "version": "1.0.0" }
}
}
tools/list (discovery)
Request:
{ "jsonrpc": "2.0", "id": 2, "method": "tools/list" }
Response (simplified):
{
"jsonrpc": "2.0",
"id": 2,
"result": {
"tools": [
{
"name": "weather_current",
"description": "Get current weather for a location",
"inputSchema": {
"type": "object",
"properties": {
"location": { "type": "string", "description": "City or coordinates" },
"units": { "type": "string", "enum": ["metric", "imperial"], "default": "metric" }
},
"required": ["location"]
}
}
]
}
}
tools/call (execution)
Request:
{
"jsonrpc": "2.0",
"id": 3,
"method": "tools/call",
"params": {
"name": "weather_current",
"arguments": { "location": "Singapore", "units": "metric" }
}
}
Response:
{
"jsonrpc": "2.0",
"id": 3,
"result": {
"content": [
{ "type": "text", "text": "Singapore: 31°C, humidity 78%, partly cloudy." }
]
}
}
Putting it together: agentic + MCP
In a typical setup:
- MCP host = your agentic application (or IDE with an AI that can use tools).
- MCP servers = provide the tools (and optionally resources/prompts) the agent needs (search, weather, DB, filesystem).
- Agent loop = ReAct or plan-and-execute: LLM sees user message + history + tool results, outputs thought and/or tool call; host maps tool call to tools/call on the right MCP client, gets back observation, and feeds it into the next turn.
User query
|
v
+---------+ tools/list (at start) +-------------+
| Agent |<----from all MCP clients----->| MCP Server 1|
| (LLM + | | MCP Server 2|
| loop) | tools/call(name, args) | ... |
| |------------------------------>| |
| |<-- result as observation -----| |
+---------+
|
v
Final answer to user
So: agentic defines how the LLM reasons and acts (ReAct vs plan-and-execute); MCP defines how tools and context are provided to that LLM in a standard way.
Summary
| Topic | Takeaway |
|---|---|
| Agentic AI | LLM in a loop: reason → act (tool call) → observe → repeat until answer. |
| ReAct | Interleaved thought and action each turn; good for exploratory tasks. |
| Plan-and-Execute | Plan once (list of steps), then execute; good for predictable pipelines. |
| MCP | Protocol for connecting AI apps to tools/data: Host → Clients → Servers. |
| MCP primitives | Tools (callable), Resources (data), Prompts (templates); discover via list, use via get/call. |
| MCP flow | Initialize → tools/list → (LLM chooses) → tools/call → observation → next turn. |
Using a clear agentic workflow (ReAct or plan-and-execute) plus MCP for tool and context provisioning gives you a standard, composable way to build AI agents that reason and act with external systems.
Comments