quyennv.com

Senior DevOps Engineer · Healthcare, Fanance

Detecting…

AI Agentic Systems and Model Context Protocol (MCP): Architecture, Workflows, and Examples

#ai#agents#MCP#LLM#tools#architecture#devops

0

AI agentic systems use language models plus tools and a control loop to reason and act in the world. The Model Context Protocol (MCP) is a standard way for AI applications to connect to external tools and data. This document covers both: agentic architecture and workflows (ReAct, plan-and-execute), then MCP’s architecture, data flow, and practical examples.


Part 1: AI Agentic Systems

What is agentic AI?

Agentic AI refers to systems where an LLM (or other model) drives a loop: it receives context (user input, history, tool results), reasons about what to do, chooses actions (often tool calls), and observes results before deciding the next step. The agent continues until it reaches a final answer or a stopping condition.

Key traits:

  • Autonomy: The model decides which actions to take, not a fixed script.
  • Tool use: Actions are often calls to external tools (APIs, code, search, databases).
  • Loop: Reason → Act → Observe → repeat.
  • State: Conversation history and tool results form the state the model conditions on.

High-level architecture

                    +------------------+
                    |   User / Task    |
                    |   (goal, query)  |
                    +--------+---------+
                             |
                             v
    +----------------------------------------------------------------------+
    |                     AGENTIC CONTROL LOOP                             |
    |  +-------------+    +-------------+    +-------------+               |
    |  |   Memory    |    |    LLM      |    | Tool        |               |
    |  | (history,   |<-->| (reason +   |--->| Registry    |               |
    |  |  context)   |    |  decide)    |    | (tools)     |               |
    |  +-------------+    +-------------+    +-------------+               |
    |                            |                  |                      |
    |                            | tool call        | execute              |
    |                            v                  v                      |
    |                     +-------------+    +-------------+               |
    |                     | Observation |<---|  Tool / Env |               |
    |                     |  (result)   |    |  (API, code)|               |
    |                     +-------------+    +-------------+               |
    |                            |                                         |
    |                            +---------> back to LLM (next turn)       |
    +----------------------------------------------------------------------+
ComponentRole
LLMReceives prompt (task + history + tool results), outputs reasoning and/or tool calls.
MemoryStores conversation and tool outputs so the agent has context across turns.
Tool registryList of callable tools with names, descriptions, and input schemas.
Tool executionRuns the chosen tool (API, code, search) and returns the result.
ObservationResult of the tool call (or environment) fed back into the next LLM turn.

Agentic workflows

ReAct (Reason + Act)

ReAct interleaves reasoning (thought) and action (tool use). Each step is either a Thought, an Action, or an Observation. The loop continues until the model outputs a final answer.

Workflow:

  1. Thought: The model reasons in natural language (e.g. “I need to get the current weather for Paris”).
  2. Action: The model outputs a structured tool call (e.g. weather(location="Paris")).
  3. Observation: The environment returns the tool result (e.g. “Paris: 18°C, cloudy”).
  4. Repeat from step 1 until the model decides it can answer the user.
                    ReAct loop (one turn)
    +------------------------------------------------------------+
    | Turn N                                                     |
    | ---------------------------------------------------------- |
    | Thought: "User asked for weather in Paris. I'll call       |
    | the weather tool."                                         |
    | Action:  weather_current(location="Paris", units="metric") |
    | Observation: "Paris: 18°C, cloudy, wind 12 km/h"           |
    +------------------------------------------------------------+
                             |
                             v
    +------------------------------------------------------------+
    |  Turn N+1                                                  |
    |  Thought: "I have the data. I can summarize for the user." |
    |  Answer: "In Paris it's currently 18°C and cloudy."        |
    |  [FINISH]                                                  |
    +------------------------------------------------------------+

When to use ReAct: Exploratory tasks, when the next step depends on previous results, or when you need to retry or branch (e.g. search then read, then maybe search again).

Plan-and-Execute

In plan-and-execute, the model first produces a plan (ordered list of steps), then a separate execution phase runs those steps (often with an LLM or code per step). Planning and execution are decoupled.

Workflow:

  1. Plan: Given the user goal, the planner LLM outputs a step-by-step plan (e.g. “1. Search for X, 2. Read top result, 3. Summarize”).
  2. Execute: For each step, the executor runs the corresponding action (tool or sub-task) and records the result.
  3. Finish: When all steps are done, combine results into a final answer (optionally with a final LLM pass).
    +----------------+     +----------------------------------------+
    |  User goal     |     |  Planner (LLM)                         |
    |  "Compare      |---->|  Plan:                                 |
    |   product A    |     |  1. search("product A specs")          |
    |   and B"       |     |  2. search("product B specs")          |
    |                |     |  3. compare(specs_A, specs_B)          |
    +----------------+     |  4. format comparison for user         |
                           +------------------+---------------------+
                                              |
                                              v
    +----------------+     +----------------------------------------+
    |  Final answer  |     |  Executor (per step)                   |
    |  "Product A    |<----|  Step 1 -> tool -> result_1            |
    |   has X, B     |     |  Step 2 -> tool -> result_2            |
    |   has Y..."    |     |  Step 3 -> compare -> result_3         |
    +----------------+     |  Step 4 -> LLM(result_3) -> answer     |
                           +----------------------------------------+

When to use plan-and-execute: More predictable, multi-step workflows; can use a smaller/cheaper model for planning and a stronger one for execution or summarization.

ReAct vs plan-and-execute

AspectReActPlan-and-Execute
FlowInterleave thought and action every turn.Plan once, then execute steps.
AdaptivityCan change plan after each observation.Plan is fixed unless you re-plan.
CostMore LLM calls (one or more per step).Fewer planner calls; executor can be cheaper.
Best forExploratory, dynamic, or ambiguous tasks.Structured, predictable pipelines.

Agentic example: ReAct-style loop

Simplified flow for a “search and summarize” agent:

User: "What's the latest version of Kubernetes?"

Step 1 (Thought): I need to search for the latest Kubernetes version.
Step 1 (Action):   search(query="Kubernetes latest version 2024")
Step 1 (Observation): "Kubernetes 1.31 released 2024-08-..." 

Step 2 (Thought): I found it. I'll give a short answer.
Step 2 (Answer):   "The latest stable version is Kubernetes 1.31 (released August 2024)."
[HALT]

Pseudocode for the control loop:

function runAgent(userQuery):
  messages = [systemPrompt, userMessage(userQuery)]
  while true:
    response = llm(messages)
    if response has "Answer:" or "Final answer:":
      return extractAnswer(response)
    if response has tool call (name, args):
      result = executeTool(name, args)
      messages.append(assistantMessage(response))
      messages.append(userMessage("Observation: " + result))
    else:
      messages.append(assistantMessage(response))

The system prompt typically includes: (1) description of the ReAct format (Thought / Action / Observation / Answer), (2) list of available tools with names and schemas, (3) rules (e.g. one tool call per turn, always reason first).


Part 2: Model Context Protocol (MCP)

What is MCP?

The Model Context Protocol (MCP) is a standard protocol for exchanging context between AI applications and external servers. It defines how tools, resources, and prompts are discovered, read, and invoked over a client–server connection, so that LLMs can use external capabilities in a uniform way.

Goals:

  • Standard interface: Same protocol for filesystem, databases, APIs, custom tools.
  • Composability: One host (e.g. an IDE or chatbot) connects to many MCP servers.
  • Separation: Servers are independent; the host aggregates their tools and context.

MCP architecture: participants

MCP uses three roles:

ParticipantRole
MCP HostThe AI application (e.g. Claude Desktop, VS Code, your app). Coordinates clients and uses aggregated context/tools with the LLM.
MCP ClientCreated by the host; maintains one connection to one MCP server. Discovers and invokes that server’s primitives.
MCP ServerProgram that exposes tools, resources, and/or prompts. Can run locally (e.g. STDIO) or remotely (e.g. HTTP).

One host has many clients; each client talks to one server. A remote server can serve many clients; a local STDIO server usually serves one client.

                    +------------------------------------------+
                    |            MCP HOST (AI app)             |
                    |  (Claude Desktop, VS Code, your agent)   |
                    |                                          |
                    |  +-----------+ +-----------+ +----------+|
                    |  | MCP       | | MCP       | | MCP      ||
                    |  | Client 1  | | Client 2  | | Client 3 ||
                    |  +-----+-----+ +-----+-----+ +----+-----+|
                    +--------|------------|------------|-------+
                             |            |            |
              dedicated      |            |            |    dedicated
              connection     |            |            |    connection
                             v            v            v
                    +-------------+ +------------+ +-------------+
                    | MCP         | | MCP        | | MCP         |
                    | Server A    | | Server B   | | Server C    |
                    | (filesystem)| |(database)  | | (remote API)|
                    | local       | | local      | | remote      |
                    +-------------+ +------------+ +-------------+

MCP layers

MCP has two layers:

  1. Transport layer
    How bytes are sent: STDIO (local process) or Streamable HTTP (remote). Handles connection, framing, and auth.

  2. Data layer
    JSON-RPC 2.0 over that transport. Defines lifecycle (initialize, capabilities), primitives (tools, resources, prompts), and notifications (e.g. tools/list_changed).

    +----------------------------------------------------------+
    |  DATA LAYER (JSON-RPC 2.0)                               |
    |  - initialize / capabilities                             |
    |  - tools/list, tools/call                                |
    |  - resources/list, resources/read                        |
    |  - prompts/list, prompts/get                             |
    |  - notifications (e.g. tools/list_changed)               |
    +----------------------------------------------------------+
    +----------------------------------------------------------+
    |  TRANSPORT LAYER                                          |
    |  - STDIO (local process)                                  |
    |  - Streamable HTTP (remote, auth, SSE)                    |
    +----------------------------------------------------------+

MCP primitives (what servers expose)

PrimitivePurposeTypical use
ToolsCallable functions the model can invoke.Run code, call APIs, query DB, run CLI.
ResourcesRead-only or dynamic data (files, DB views, API responses).Inject file contents, schema, config into context.
PromptsReusable prompt templates (e.g. system prompts, few-shot).Standardize how the host prompts the LLM.

Clients discover via */list and use via */get or tools/call. Servers can send notifications (e.g. “my tool list changed”) so the host refreshes.


MCP workflow: lifecycle and tool use

1. Initialization (lifecycle)

Client connects and sends initialize with protocol version and capabilities; server responds with its capabilities (e.g. tools, resources). Then client sends notifications/initialized. After that, the session is ready.

    Client                              Server
       |                                  |
       |  initialize(protocolVersion,     |
       |    capabilities, clientInfo)     |
       |--------------------------------->|
       |  result(serverInfo, capabilities)|
       |<---------------------------------|
       |  notifications/initialized       |
       |--------------------------------->|
       |  (session ready)                 |

2. Tool discovery and execution

Host asks each server what tools it has, then the LLM can call them.

Discover: Client sends tools/list; server returns a list of tools with name, description, and inputSchema (JSON Schema).

Execute: When the model chooses a tool, the host sends tools/call with name and arguments. Server runs the tool and returns content (e.g. text or structured data).

    Host / LLM                MCP Client              MCP Server
         |                         |                         |
         |  "User wants weather"   |                         |
         |------------------------>|                         |
         |                         |  tools/list             |
         |                         |------------------------>|
         |                         |  { tools: [ {...} ] }   |
         |                         |<------------------------|
         |  LLM picks tool + args  |                         |
         |<------------------------|                         |
         |                         |  tools/call(name, args) |
         |                         |------------------------>|
         |                         |  result(content)        |
         |                         |<------------------------|
         |  observation for next   |                         |
         |  LLM turn               |                         |
         |<------------------------|                         |

3. Notifications

Server can push notifications/tools/list_changed (and similar for other primitives). The client then re-fetches tools/list so the host’s tool registry stays up to date.


MCP data layer example (JSON-RPC)

Initialize (client → server)

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2024-11-05",
    "capabilities": { "elicitation": {} },
    "clientInfo": { "name": "my-ai-app", "version": "1.0.0" }
  }
}

Server response (capabilities)

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "protocolVersion": "2024-11-05",
    "capabilities": {
      "tools": { "listChanged": true },
      "resources": {}
    },
    "serverInfo": { "name": "weather-server", "version": "1.0.0" }
  }
}

tools/list (discovery)

Request:

{ "jsonrpc": "2.0", "id": 2, "method": "tools/list" }

Response (simplified):

{
  "jsonrpc": "2.0",
  "id": 2,
  "result": {
    "tools": [
      {
        "name": "weather_current",
        "description": "Get current weather for a location",
        "inputSchema": {
          "type": "object",
          "properties": {
            "location": { "type": "string", "description": "City or coordinates" },
            "units": { "type": "string", "enum": ["metric", "imperial"], "default": "metric" }
          },
          "required": ["location"]
        }
      }
    ]
  }
}

tools/call (execution)

Request:

{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "weather_current",
    "arguments": { "location": "Singapore", "units": "metric" }
  }
}

Response:

{
  "jsonrpc": "2.0",
  "id": 3,
  "result": {
    "content": [
      { "type": "text", "text": "Singapore: 31°C, humidity 78%, partly cloudy." }
    ]
  }
}

Putting it together: agentic + MCP

In a typical setup:

  1. MCP host = your agentic application (or IDE with an AI that can use tools).
  2. MCP servers = provide the tools (and optionally resources/prompts) the agent needs (search, weather, DB, filesystem).
  3. Agent loop = ReAct or plan-and-execute: LLM sees user message + history + tool results, outputs thought and/or tool call; host maps tool call to tools/call on the right MCP client, gets back observation, and feeds it into the next turn.
    User query
         |
         v
    +---------+     tools/list (at start)     +-------------+
    |  Agent  |<----from all MCP clients----->| MCP Server 1|
    |  (LLM + |                               | MCP Server 2|
    |   loop) |     tools/call(name, args)    | ...         |
    |         |------------------------------>|             |
    |         |<-- result as observation -----|             |
    +---------+
         |
         v
    Final answer to user

So: agentic defines how the LLM reasons and acts (ReAct vs plan-and-execute); MCP defines how tools and context are provided to that LLM in a standard way.


Summary

TopicTakeaway
Agentic AILLM in a loop: reason → act (tool call) → observe → repeat until answer.
ReActInterleaved thought and action each turn; good for exploratory tasks.
Plan-and-ExecutePlan once (list of steps), then execute; good for predictable pipelines.
MCPProtocol for connecting AI apps to tools/data: Host → Clients → Servers.
MCP primitivesTools (callable), Resources (data), Prompts (templates); discover via list, use via get/call.
MCP flowInitialize → tools/list → (LLM chooses) → tools/call → observation → next turn.

Using a clear agentic workflow (ReAct or plan-and-execute) plus MCP for tool and context provisioning gives you a standard, composable way to build AI agents that reason and act with external systems.

← All posts

Comments