AI Agents

Hire a salesperson. Hire a solutions engineer. Hire an onboarding coach. Hire them all in an afternoon, and have them on duty 24/7 by sundown.

An AI Agent in Turing ES is the specialist you don't have to recruit, train, or replace. It's a small, named bundle of:

  • a brain — an LLM Instance you've already configured,
  • a set of hands — native tools and MCP servers that let the agent actually do things (search your content, fetch live data, run code),
  • a voice — a Persona that aligns the answers with your brand, your funnel stage, and your compliance posture.

Each agent appears as its own tab in the Chat interface. Visitors pick the specialist that matches their need; agents stay in their lane.

Configure agents in Administration → AI Agents.


What "Agent" Actually Means

The word "agent" is overloaded. In Turing ES, an Agent has two non-negotiable properties:

  1. It's a composition, not a model. You're not picking a model from a marketplace and renaming it. You're combining decisions — which LLM, which tools, which voice — into a single, deployable unit. Change any of the three, and you've changed the agent.
  2. It can act, not just answer. The defining feature of an agent over a plain chatbot is its ability to call tools — search a knowledge base, look up a stock quote, run a Python script, query an MCP server. The model doesn't just generate words; it requests an action, Turing ES executes it, and the model reasons over the result.

Together, these mean an agent is a business process expressed as a conversation. A user asks; the agent searches your documentation, fetches their account data, summarizes the answer, and books a follow-up — in one exchange.


Composition: The Four Layers

An AI Agent is built from four layers. Get all four right and the agent feels like a senior team member; get any one wrong and it feels like a generic chatbot.

AI Agent — Composition

1. Identity

  • Name — Display name shown as the tab label in the Chat interface. "Sales Concierge" beats "Agent #3".
  • Avatar — Profile image. Even a small one moves trust meters — visitors react to faces, even illustrated ones.
  • Description — One-line explanation of the agent's purpose. Shown in admin lists; gives visitors a hint of what to expect.
  • Enabled — Toggle to activate or hide the agent. Disabled agents do not appear in the Chat interface.

2. Brain

The LLM Instance is the agent's reasoning engine. You can attach one or more instances; at chat time the user (or front-end) picks which one to run.

Why multiple? Because the cost/quality tradeoff varies by question:

  • A trivial routing question doesn't need a frontier model. Attach a fast, cheap instance (e.g., Gemini Flash, GPT-4o-mini) and let users default to it.
  • A complex multi-step deal qualification benefits from a top-tier reasoner. Attach Claude Sonnet alongside, as an opt-in upgrade.

See LLM Instances for vendor support, default models, and provider-specific options.

3. Capabilities

This is the largest configuration surface. Two sets of capabilities can be attached:

Native tools — 27 first-party tools, grouped into 8 categories:

  • Semantic Navigation — list_sites, search_site, get_document_details, find_similar_documents, search_by_date_range, get_site_fields, get_valid_filter_values, … (15 total). Attach when the agent should answer using your indexed enterprise content (CMS pages, products, internal documentation).
  • RAG / Knowledge Base — search_knowledge_base, list_knowledge_base_files, get_file_content, knowledge_base_stats. Attach when the agent should answer using files in Assets (PDFs, Word docs, manuals).
  • Web Crawler — fetch_webpage, extract_links. Attach when the agent should be able to read a public URL on demand (e.g., reference a customer's website).
  • Finance — get_stock_quote, search_ticker. For investor-facing or financial-research agents.
  • Weather — get_weather. For travel, logistics, and field-service agents.
  • Image Search — search_images. For marketing and content-creation agents.
  • Date / Time — get_current_time. For any agent that books, schedules, or references "now".
  • Code Interpreter — execute_python. For data analysis, chart generation, and technical agents.

MCP Servers — external tool servers the agent can call:

  • HTTP (blue badge) — transport: SSE over HTTP. Typical use: a REST API your team owns; a brand-context server; a CRM lookup.
  • COMMAND (amber badge) — transport: stdio. Typical use: a local script or process, useful for filesystem or shell access on the host.

See Tool Calling for the full native tool reference and MCP Servers for connection setup.

Lean tool lists win

A common mistake: attaching every tool because "it might be useful". Every additional tool consumes prompt tokens to describe to the model and forces the model to choose among more options. Fewer, sharper tools produce more precise tool calls and shorter latencies. Start with 3–5 tools per agent and add only what actual user requests demand.

4. Voice

The Persona field attaches a Persona to the agent. The persona overlays:

  • a system instruction (brand voice directive),
  • tone, verbosity, language style,
  • mandatory + forbidden vocabulary (enforced both pre- and post-LLM),
  • optional few-shot examples retrieved from a vector store,
  • optional live brand context from an MCP.

A persona is optional. An agent without one uses the LLM's default voice. For internal tools that's fine; for any customer-facing agent it's almost always wrong.


How an Agent Actually Runs

When a user sends a message, this loop runs end-to-end. Understanding it is the difference between debugging "the AI gave a weird answer" in five minutes vs. five hours.

AI Agent — Execution Flow

Step 1 — Prompt assembly

Turing ES builds the prompt in this exact order:

  1. Persona system instruction (if a persona is attached)
  2. Style guidelines block — derived from the persona's tone + verbosity + language style
  3. Mandatory / forbidden vocabulary lists (if any)
  4. Brand context — fetched from the persona's MCP server, if configured
  5. Agent system prompt — the agent's own purpose-specific instruction
  6. Tool definitions — JSON schemas for every native tool and MCP tool the agent has access to
  7. Few-shot examples — top-K Q/A pairs retrieved by similarity from the persona's few-shot store, if configured (only on the first user turn)
  8. Conversation history — full message log so far
  9. Current user message

Layers 1–7 are assembled once: 1–6 never change between turns, and the few-shot examples (7) are injected only on the first one; afterwards only the history and the current message vary. Spring AI, the underlying integration, handles the assembly; you don't write any of this manually.
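
To make the ordering concrete, here is a runnable toy sketch. Every string below is a hypothetical stand-in for data Turing ES pulls from the persona and agent; the real assembly happens inside Spring AI:

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration of the nine-layer prompt order. All values are hypothetical
// stand-ins; Spring AI assembles the real prompt inside Turing ES.
public class PromptAssemblySketch {
    public static void main(String[] args) {
        boolean firstTurn = true;
        List<String> prompt = new ArrayList<>();
        prompt.add("PERSONA: You speak for the Acme brand...");              // 1. persona system instruction
        prompt.add("STYLE: EXECUTIVE tone, verbosity 2, en-US");             // 2. style guidelines block
        prompt.add("VOCABULARY: must say 'platform'; never say 'cheap'");    // 3. mandatory / forbidden terms
        prompt.add("BRAND CONTEXT: current Q3 pricing...");                  // 4. live context from the persona's MCP
        prompt.add("AGENT: You greet first-time visitors...");               // 5. agent system prompt
        prompt.add("TOOLS: search_site(query), find_similar_documents(id)"); // 6. tool definitions (JSON schemas)
        if (firstTurn)
            prompt.add("EXAMPLES: Q: ... A: ...");                           // 7. few-shot pairs, first turn only
        prompt.add("HISTORY: (empty)");                                      // 8. conversation so far
        prompt.add("USER: What does Turing ES do?");                         // 9. current message
        prompt.forEach(System.out::println);
    }
}
```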

Step 2 — LLM inference

The assembled prompt goes to the LLM Instance the user picked. The model receives it and decides one of two things:

  • Respond directly — when the answer doesn't need any tool. Common for chit-chat, restated questions, simple factual recall.
  • Request a tool call — when the answer requires action. The model emits a structured request: tool name + arguments.

This decision is the agent's "thinking step". Tool-capable models (Claude, GPT-4, Gemini) do this well; small or older models struggle and may hallucinate tool calls or skip them entirely.
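
The structured request is just a tool name plus JSON arguments. A minimal sketch of the shape, using a hypothetical record — the actual wire format depends on the LLM vendor:

```java
// Hypothetical model of a structured tool-call request; the field names are
// illustrative, and the real wire format is vendor-specific.
public record ToolCallRequest(String toolName, String argumentsJson) {
    public static void main(String[] args) {
        var call = new ToolCallRequest(
                "search_knowledge_base",
                "{\"query\": \"SSO configuration\", \"topK\": 5}");
        System.out.println(call.toolName() + " <- " + call.argumentsJson());
    }
}
```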

Step 3 — Tool execution

If the model requested a tool, Turing ES:

  1. Validates the call (does the agent have access to that tool? are the arguments valid JSON?).
  2. Routes it to the correct executor — native tool implementation or MCP server.
  3. Captures the result (or the error).
  4. Logs the invocation: tool name, arguments, latency, response size.
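
A runnable toy sketch of those four steps — every name and type here is a hypothetical stand-in, not the actual Turing ES executor:

```java
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Toy sketch of the four execution steps above. Every name and type is a
// hypothetical stand-in, not the actual Turing ES executor.
public class ToolExecutionSketch {
    static final Set<String> ALLOWED = Set.of("get_current_time");
    static final Map<String, Function<String, String>> NATIVE_TOOLS = Map.of(
            "get_current_time", argsJson -> "{\"now\": \"" + java.time.Instant.now() + "\"}");

    static String execute(String toolName, String argsJson) {
        // 1. Validate: is the tool attached to this agent? Are the arguments JSON?
        if (!ALLOWED.contains(toolName))
            throw new IllegalArgumentException("Tool not attached: " + toolName);
        if (!argsJson.trim().startsWith("{"))
            throw new IllegalArgumentException("Arguments are not valid JSON");

        // 2. Route to the correct executor (a native tool here; an MCP server otherwise).
        Function<String, String> executor = NATIVE_TOOLS.get(toolName);

        // 3. Capture the result — or the error, so the model can reason over it.
        long start = System.nanoTime();
        String result;
        try {
            result = executor.apply(argsJson);
        } catch (Exception e) {
            result = "{\"error\": \"" + e.getMessage() + "\"}";
        }

        // 4. Log the invocation: tool name, latency, response size.
        long latencyMs = (System.nanoTime() - start) / 1_000_000;
        System.out.printf("tool=%s latencyMs=%d responseBytes=%d%n",
                toolName, latencyMs, result.length());
        return result;
    }

    public static void main(String[] args) {
        System.out.println(execute("get_current_time", "{}"));
    }
}
```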

Step 4 — Multi-step reasoning

The tool result is fed back to the model, which can:

  • Call another tool — chain steps. For example: search_knowledge_base → get_file_content → execute_python to chart the data.
  • Stop and answer — synthesize the results into a final response.

This loop runs until the model decides it has enough. Spring AI's internalToolExecutionEnabled(true) setting handles the loop transparently — the orchestration is built in.
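
In Spring AI terms, that looks roughly like the sketch below; the model and tool wiring are placeholders, not Turing ES's actual code:

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.ai.model.tool.ToolCallingChatOptions;

// Minimal Spring AI sketch; `chatModel` and `agentTools` are placeholders for
// whatever Turing ES wires in, not its actual implementation.
class AgentLoopSketch {
    String ask(ChatModel chatModel, Object agentTools, String question) {
        return ChatClient.create(chatModel)
                .prompt()
                .options(ToolCallingChatOptions.builder()
                        .internalToolExecutionEnabled(true)  // Spring AI runs the tool loop
                        .build())
                .tools(agentTools)   // native + MCP tools exposed as tool callbacks
                .user(question)
                .call()
                .content();          // final answer, after the model stops calling tools
    }
}
```

With that flag off, Spring AI would instead hand each tool call back to the caller to execute manually; keeping it on is what makes the loop invisible.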

Step 5 — Streaming the response

The final response streams back to the chat UI token by token via Server-Sent Events. The user sees the answer as it's being generated, the same way they'd watch someone typing. This isn't cosmetic — perceived latency drops by half versus waiting for a complete response.
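
Server-side, that pattern is a streaming model call exposed as an SSE endpoint. A hedged Spring sketch — the real Turing ES controller sits behind POST /api/v2/ai-agent/{agentId}/chat and its internals may differ:

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.http.MediaType;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;

// Hypothetical controller sketch: each emitted String becomes one SSE event,
// so the browser renders tokens as they arrive.
@RestController
class ChatStreamSketch {
    private final ChatClient chatClient;

    ChatStreamSketch(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    @PostMapping(value = "/sketch/chat", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
    Flux<String> chat(@RequestBody String userMessage) {
        return chatClient.prompt()
                .user(userMessage)
                .stream()     // Flux<String> of tokens instead of a blocking call()
                .content();
    }
}
```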

Step 6 — Post-LLM persona enforcement

Before the response leaves the chat executor, TurPersonaToneValidator scans it for forbidden vocabulary from the persona. Matches are masked; the user never sees them. Logs record the event, so you know in which scenarios your prompt isn't holding.
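
The masking step is conceptually simple. A minimal sketch of the idea — TurPersonaToneValidator's actual implementation may differ:

```java
import java.util.List;
import java.util.regex.Pattern;

// Conceptual sketch of post-LLM vocabulary masking. The real
// TurPersonaToneValidator implementation in Turing ES may differ.
public class ForbiddenVocabularySketch {
    static String mask(String response, List<String> forbidden) {
        for (String term : forbidden) {
            Pattern p = Pattern.compile("\\b" + Pattern.quote(term) + "\\b",
                    Pattern.CASE_INSENSITIVE);
            if (p.matcher(response).find()) {
                // Logged so you can tighten the prompt later.
                System.out.println("forbidden term masked: " + term);
                response = p.matcher(response).replaceAll("***");
            }
        }
        return response;
    }

    public static void main(String[] args) {
        System.out.println(mask("We are cheaper than CompetitorX.", List.of("CompetitorX")));
    }
}
```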

Step 7 — Analytics emission

A chat session event is recorded by TurChatAnalyticsService (turn count, token usage, agent ID, persona ID, locale, started-at, completed-at). This is the foundation for Chat Analytics — the dashboard that tells you, days later, which agents are converting and which aren't.
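
The recorded fields map naturally onto a small event object. A hypothetical model of what gets stored — field names mirror the list above, not the actual TurChatAnalyticsService schema:

```java
import java.time.Instant;

// Hypothetical shape of the recorded event; field names mirror the list above,
// not the actual TurChatAnalyticsService schema.
public record ChatSessionEvent(
        String agentId,
        String personaId,
        String locale,
        int turnCount,
        long tokenUsage,
        Instant startedAt,
        Instant completedAt) {}
```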


Composing Agents Around the Funnel

The biggest leverage in agent design is funnel mapping: build one agent per stage, not one agent for everything. Each agent gets a tighter system prompt, a more focused tool list, and a persona that fits the moment.

The Discovery Agent

For visitors who arrived from a marketing campaign and don't yet know what they want.

  • Name — Discovery Concierge
  • LLM — a fast, cheap instance (Gemini Flash, GPT-4o-mini); discovery is high-volume
  • Tools — search_site (your marketing pages), find_similar_documents
  • MCP Servers — optional: a CRM-lookup MCP that personalizes if the user is logged in
  • Persona — top-of-funnel-sales — EXECUTIVE tone, PERSUASIVE, verbosity 2
  • System Prompt — "You greet first-time visitors. Ask one qualifying question per turn. Don't list features; ask about outcomes. After three turns, offer a 15-min call."

Conversion target: discovery → demo booking.

The Solutions Engineer Agent

For technical evaluators in active comparison.

  • Name — Solutions Engineer
  • LLM — a top-tier reasoner (Claude Sonnet, GPT-4); evaluators ask hard questions
  • Tools — search_knowledge_base, get_file_content, fetch_webpage, execute_python
  • MCP Servers — internal API documentation MCP
  • Persona — solutions-engineer — TECHNICAL tone, INSTRUCTIONAL, verbosity 4
  • System Prompt — "You answer integration and architecture questions. Lead with the architectural answer. Confirm assumptions before recommending. Cite docs when possible."

Conversion target: evaluation → technical buy-in.

The Onboarding Coach Agent

For customers who just signed up.

  • Name — Onboarding Coach
  • LLM — mid-tier (Sonnet, GPT-4o); needs to follow conversational state precisely
  • Tools — search_knowledge_base (product docs), get_file_content
  • MCP Servers — optional: an account-state MCP that knows where the user is in the wizard
  • Persona — onboarding-coach — CASUAL, INSTRUCTIONAL, verbosity 3
  • System Prompt — "You guide brand-new users through their first day. Show one next step. Wait for confirmation. Celebrate small wins."

Conversion target: signup → activation milestone.

The Internal IT Agent

For employees, on-premises, behind your firewall.

  • Name — IT Helpdesk
  • LLM — a local LLM via Ollama; keeps queries off third-party clouds
  • Tools — search_knowledge_base (IT runbook), execute_python, get_current_time
  • MCP Servers — internal ticketing system (stdio MCP)
  • Persona — none; this one optimizes for precision, not voice
  • System Prompt — "You answer IT questions for employees. Cite the runbook section. If the issue isn't covered, open a ticket via the MCP and reply with the ticket ID."

Outcome target: first-call resolution rate.


Configuration Form

The agent form is split into four tabs.

Settings

  • Name (required) — Display name shown as the tab label in the Chat interface
  • Avatar — Profile image; supports upload and removal
  • Description — Brief explanation of the agent's purpose
  • System Prompt — Instructions sent as a system message before every conversation; defines purpose, scope, and behavior
  • Persona — Optional Persona that overlays voice, vocabulary, brand context, and few-shot examples
  • Enabled — Toggle to activate or deactivate the agent

Default system prompt

If the system prompt is left blank, the agent uses the built-in default:

"You are an AI assistant. Answer the user's questions using the tools available to you. If you have access to MCP server tools, use them when relevant to fulfill the user's request. If the user asks in a specific language, respond in that same language."

This is fine for prototypes. For any customer-facing agent, write a real system prompt.

LLM

Select one or more LLM Instances. The list shows each instance's title, description, vendor, and model name. The user (or the front-end) picks which one to use at chat time, from the agent's allowed set.

Tools

Select which of the 27 native tools are available. Tools are grouped by category — each group has a select-all checkbox for quick configuration.

MCP Servers

Select which external MCP servers this agent can call. The list shows each server's title, description, and connection type.


REST API

Agent Management

  • GET /api/ai-agent — List all agents (ordered by title)
  • GET /api/ai-agent/structure — Empty structure template for a new agent
  • GET /api/ai-agent/{id} — Get a specific agent
  • POST /api/ai-agent — Create a new agent
  • PUT /api/ai-agent/{id} — Update an existing agent
  • DELETE /api/ai-agent/{id} — Delete an agent
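
A minimal client sketch for creating an agent — the endpoint comes from the table above, but the JSON field names are assumptions; fetch GET /api/ai-agent/structure first to see the real template:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch only: the endpoint is documented above, but the body's field names
// are assumptions — GET /api/ai-agent/structure returns the real template.
public class CreateAgentSketch {
    public static void main(String[] args) throws Exception {
        String body = """
            {
              "name": "Discovery Concierge",
              "description": "Greets first-time visitors",
              "enabled": true
            }
            """;
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://your-turing-host/api/ai-agent"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```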

Agent Chat

  • POST /api/v2/ai-agent/{agentId}/chat — Stream chat response (SSE). Body: { llmInstanceId, messages[] }
  • GET /api/v2/ai-agent/{agentId}/chat/context-info — Context window size for the chosen LLM. Query: llmInstanceId
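
A minimal client sketch for streaming a chat answer — the endpoint and top-level body keys come from the table above; the shape of each message object is an assumption:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch only: endpoint and body keys are from the table above; the message
// object shape is an assumption.
public class AgentChatSketch {
    public static void main(String[] args) throws Exception {
        String body = """
            {
              "llmInstanceId": "your-llm-instance-id",
              "messages": [ { "role": "user", "content": "What does Turing ES do?" } ]
            }
            """;
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://your-turing-host/api/v2/ai-agent/your-agent-id/chat"))
                .header("Content-Type", "application/json")
                .header("Accept", "text/event-stream")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        // Print SSE frames as they arrive — the token-by-token stream from Step 5.
        HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofLines())
                .body()
                .forEach(System.out::println);
    }
}
```

Each printed line is a raw SSE frame; concatenate the data-prefixed chunks to reconstruct the full answer.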

Native Tools

  • GET /api/native-tool — List all available tool groups, names, and descriptions

Caching

Agent definitions are cached at the repository layer:

  • turAIAgentfindAll — the full list
  • turAIAgentfindById — individual lookups

Entries are invalidated automatically on create / update / delete. This means agent changes propagate immediately; no app restart needed.
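
That behavior is consistent with standard Spring caching. A hedged sketch of what the repository layer likely resembles — the cache names are taken from the list above, everything else is assumption:

```java
import java.util.List;
import java.util.Optional;
import org.springframework.cache.annotation.CacheEvict;
import org.springframework.cache.annotation.Cacheable;
import org.springframework.data.jpa.repository.JpaRepository;

// Hedged sketch only: TurAIAgent stands in for the agent entity, and the
// actual Turing ES annotations may differ. Cache names match the list above.
interface TurAIAgentRepositorySketch extends JpaRepository<TurAIAgent, String> {

    @Override
    @Cacheable("turAIAgentfindAll")
    List<TurAIAgent> findAll();

    @Override
    @Cacheable("turAIAgentfindById")
    Optional<TurAIAgent> findById(String id);

    // Any write evicts both caches, which is why changes propagate immediately.
    @Override
    @CacheEvict(cacheNames = {"turAIAgentfindAll", "turAIAgentfindById"}, allEntries = true)
    <S extends TurAIAgent> S save(S agent);
}
```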


Diagnosing a Misbehaving Agent

  • Agent gives generic answers and doesn't cite your content — likely no tool was called; the model didn't know to use one. Fix: add a sharper system prompt directive, e.g., "Always start by calling search_knowledge_base before answering."
  • Agent uses brand names from a competitor or the wrong product — forbidden vocabulary not configured. Fix: add the wrong terms to the persona's forbidden list.
  • Agent's tone changes mid-conversation — no persona attached, or a weak system prompt. Fix: attach a persona; the post-LLM validator and prompt-side directives stabilize voice.
  • Agent calls tools but ignores the results — the model is too small, or the tool result is too large. Fix: try a stronger LLM Instance, or trim the tool's output (e.g., reduce find_similar_documents top-K).
  • Conversion rate dropping over weeks — the few-shot store is stale, or the brand-context MCP is returning old prices. Fix: Chat Analytics drill-down → check goal-achievement rate per persona.
  • Latency spikes occasionally — a tool is slow, or the LLM provider is rate-limiting. Fix: check the Observability dashboards (turing.llm.calls timer).

Related Pages

  • LLM Instances — Configure the model providers used as agent brains
  • Tool Calling — The 27 native tools, grouped and explained
  • MCP Servers — Connect agents to external tool servers
  • Personas — Give agents a brand voice
  • Chat — The interface where agents come to life
  • Chat Analytics — Measure which agents are converting
  • Observability — Watch latency, token usage, and tool reliability in real time