Architecture of a Business AI Agent: A Look Under the Hood of RAG, LLM, and the Orchestrator

Inside an AI agent: the 4 key components (LLM, RAG, tools, orchestrator) explained, with definitions, real-world examples, and 5 questions to ask a service provider.

When a vendor offers you an “AI solution” for your business, the first question you should ask isn’t about the price. It is: what’s in the box? Because behind the same term lie very different technical architectures that produce very different results. This article breaks down the four building blocks that make up a true business AI agent, with precise definitions of the terms you’ll encounter in meetings: LLM, RAG, function calling, MCP, and orchestrator.

By the end of this article, you’ll know exactly what’s under the hood, you’ll be able to ask the right questions of any service provider, and you’ll be able to spot solutions that are marketed as “AI agents” but aren’t actually agents.

If you're looking to compare the uses of a chatbot and an AI agent to help you decide which one to choose, we have an article dedicated to this business question: AI Agent vs. Chatbot: What Are the Differences for Businesses? Here, we dive into the architecture.

The 4 building blocks of a business AI agent

An AI agent is built on four key components. Not one less. It is the combination of all four that transforms a language model into something useful for businesses.

Building Block 1 — The LLM (Language Model)

Definition. An LLM (Large Language Model) is a program trained on massive text corpora to understand and generate natural language. Claude, GPT-4, Gemini, Mistral, and LLaMA are examples of LLMs. When you type "My VPN won't connect," the LLM interprets the intent behind the sentence and formulates a coherent response in the user's language.

What it offers. Two capabilities that traditional chatbots lacked: the ability to understand natural language without having to guess the exact keywords expected, and the ability to generate a response tailored to the tone, context, and request.

Its limitation. An LLM on its own knows nothing about your company. It was trained on general data, not on your product catalog, your contracts, or your user directory. If you ask it, “Who is in charge of the Dupont case here?”, it will either say it doesn’t know or, worse, come up with a plausible but false answer. This is what’s known as a hallucination.

That’s why a raw LLM isn’t an AI agent. It’s a general-purpose assistant, useful for drafting or summarizing, but oblivious to your context. At the very least, you need to give it access to your data: that’s the role of the second component.

Building Block 2 — RAG (Retrieval-Augmented Generation)

Definition. RAG, which stands for Retrieval-Augmented Generation, is a technique that connects an LLM to your own knowledge base: your internal documents, procedures, ticket history, and contracts, in short everything an agent needs to “know” to do its job effectively.

How it works. Your documents are indexed in a vector database, a special type of database that searches not by keywords but by meaning. Specifically, each document is converted into a numerical vector that represents its semantic content. When a question is submitted, it is itself converted into a vector, and the system finds the closest matches in your database. These relevant passages are sent to the LLM along with the question, which uses them to formulate a precise and well-grounded response.
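The retrieval step described above can be sketched in a few lines. This is a minimal illustration, not a production design: the documents are hypothetical, and the `embed` function is a toy word-count vector that stands in for a real embedding model. The toy version only matches shared words; it is the real embedding model that makes the search work by meaning. The mechanics (vectorize, compare, keep the closest passages, pass them to the LLM with the question) are the same.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding (assumption): word counts instead of a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical knowledge-base passages, indexed once as (text, vector) pairs.
DOCUMENTS = [
    "VPN connection fails when the password has expired; reset it in the portal.",
    "Expense reports must be submitted before the 5th of each month.",
    "To request a new laptop, open a ticket in the hardware catalog.",
]
INDEX = [(doc, embed(doc)) for doc in DOCUMENTS]

def retrieve(question: str, k: int = 1) -> list[str]:
    """Return the k passages whose vectors are closest to the question's vector."""
    q = embed(question)
    ranked = sorted(INDEX, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_prompt(question: str) -> str:
    """Assemble what is sent to the LLM: retrieved passages plus the question."""
    context = "\n".join(retrieve(question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("My VPN connection fails"))
```

In a real deployment, the list and the sort would be replaced by a vector database, and documents would usually be split into passages before indexing.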

What it offers. Two key benefits. First, it grounds the agent’s responses in your data, which drastically reduces hallucinations. Second, it allows the agent to adapt without the need for retraining: simply add or modify a document in the database, and the agent takes it into account as soon as the document is re-indexed.

Without RAG, you have an LLM that speaks well but knows nothing about your industry. With a good RAG, you have an agent that responds with the precision of an experienced employee who has your documentation right in front of them.

Building Block 3 — Tools (function calling)

Definition. In the language of AI agents, tools are functions that the LLM can call upon to perform actions within your information system. The technique that enables this is called function calling. Examples of tools include “check a user’s permissions in Active Directory,” “open a ticket in ServiceNow,” “search for a customer in the CRM,” “reset a password,” “send an email,” and “check the status of an order in the ERP.”

How it works. When the agent determines that a request requires an action rather than just a text response, it selects the appropriate tool from its catalog, provides the necessary parameters, and triggers the call. The tool’s response is sent back to the agent, which can then formulate a final response for the user or proceed to the next step.
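Here is a minimal sketch of that mechanism, under stated assumptions: the two tools are hypothetical stubs, and the JSON shape of the model's request (`name` plus `arguments`) is a common convention rather than the format of any specific vendor. The point is the division of labor: the model only names a tool and its parameters, and your own code checks the request and runs it.

```python
import json

def check_password_status(username: str) -> dict:
    """Hypothetical tool: a real one would query the directory service."""
    expired_accounts = {"jdoe"}
    return {"username": username, "expired": username in expired_accounts}

def open_ticket(summary: str) -> dict:
    """Hypothetical tool: a real one would call the ITSM API."""
    return {"ticket_id": "T-0001", "summary": summary}

# The catalog of tools the agent is allowed to use.
TOOLS = {
    "check_password_status": check_password_status,
    "open_ticket": open_ticket,
}

def execute_tool_call(raw_call: str) -> dict:
    """Validate a tool call emitted by the model as JSON, then run it."""
    call = json.loads(raw_call)
    name = call.get("name")
    if name not in TOOLS:
        return {"error": f"unknown tool: {name}"}
    try:
        return {"result": TOOLS[name](**call.get("arguments", {}))}
    except TypeError as exc:
        # Wrong or missing parameters: report back instead of crashing.
        return {"error": f"bad arguments for {name}: {exc}"}

# Example of what a model might emit (assumed format).
model_output = '{"name": "check_password_status", "arguments": {"username": "jdoe"}}'
print(execute_tool_call(model_output))
```

Keeping the catalog explicit means the agent cannot run anything that is not listed in `TOOLS`, which is also where permission checks would naturally go.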

What they offer. It’s the building block that changes everything. Without tools, the agent is limited to text: it can describe the procedure for resetting a password, but it can’t actually do it. With well-integrated tools, it takes direct action, and what it delivers to the user is no longer just information: it’s a result. That’s the exact line between informing and resolving.

Building Block 4 — The Orchestrator

Definition. The orchestrator is the logic that decides what to do at each step: query the LLM, search the RAG, call a tool, request human validation, or respond directly to the user. Without the orchestrator, the three previous components cannot communicate with one another.

What it does. Three key functions. First, it manages the conversation history: when a user asks, “And what about Dupont—is it the same?”, the orchestrator knows that “Dupont” refers to a previous exchange and provides that context to the LLM. Next, it coordinates between the components: RAG first, then tools if needed, then the response. Finally, it escalates to a human when the request is too complex, too sensitive, or outside its scope, and hands over all the context already gathered. That’s what sets apart an agent that provides a service from one that makes risky decisions on your behalf.
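These three functions can be sketched as a small decision routine. Everything below is an illustrative assumption: the keyword rules stand in for a decision that a real orchestrator would usually delegate in part to the LLM, and the list of sensitive topics is hypothetical. What the sketch shows is the shape of the logic: keep the history, decide the next step, and hand over the full context when escalating.

```python
from dataclasses import dataclass, field

# Hypothetical escalation triggers; a real list would come from your own policy.
SENSITIVE_TOPICS = ("legal", "payroll", "dismissal")

@dataclass
class Orchestrator:
    history: list[str] = field(default_factory=list)

    def next_step(self, message: str, retrieved_passages: list[str]) -> str:
        """Return the next action: 'escalate', 'call_tool', or 'answer'."""
        self.history.append(message)  # function 1: keep the conversation history
        text = message.lower()
        if any(topic in text for topic in SENSITIVE_TOPICS):
            return "escalate"  # function 3: too sensitive for the agent
        if not retrieved_passages:
            return "escalate"  # nothing in the knowledge base: hand over rather than guess
        if "reset" in text or "unlock" in text:
            return "call_tool"  # function 2: RAG first, then a tool if needed
        return "answer"

    def handoff_context(self) -> dict:
        """What the human receives on escalation: the whole exchange so far."""
        return {"conversation": list(self.history)}

orchestrator = Orchestrator()
print(orchestrator.next_step("Please reset my password", ["Password reset procedure"]))
print(orchestrator.next_step("I have a question about payroll", ["HR handbook"]))
print(orchestrator.handoff_context())
```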

MCP: The Standard That Will Transform the Industry in 2026

You’ll be seeing the acronym MCP more and more in discussions about AI agents. Here’s what it is and why it matters.

Definition. MCP, short for Model Context Protocol, is an open standard that simplifies the connection between a language model and external tools. Introduced in late 2024 and widely adopted in 2026, it standardizes the way an LLM "communicates" with a CRM, an ITSM, a database, or any other information system.

What this changes. Before MCP, every integration between an LLM and a system required custom, proprietary, non-reusable development. With MCP, these connections become standardized: adding a new tool to an agent takes a few hours rather than a few days. And a connector written for one LLM works with other compatible LLMs. This is one of the factors that made AI agents much faster to deploy in 2026, and that makes them less dependent on a single vendor.

For a business leader, MCP is the equivalent of what USB was for peripherals in its day: a standard that opens up the ecosystem and lowers integration costs.
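To make the idea of a standard concrete, here is a hedged sketch of what a tool call looks like on the wire. As far as the published specification goes, MCP messages follow JSON-RPC 2.0 and a tool is invoked with the `tools/call` method; check the current specification before relying on the details. The tool name and arguments are hypothetical, and the transport (how the message reaches the server) is left out.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry an id so responses can be matched

def mcp_tool_call(name: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 request asking an MCP server to run a tool."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)

# Hypothetical tool exposed by a hypothetical ITSM connector.
print(mcp_tool_call("open_ticket", {"summary": "VPN access issue"}))
```

Because every connector accepts the same message shape, the agent side does not need to be rewritten when a new system is plugged in.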

The 4 building blocks in action: a real-world example

Let's see how this works in a real-world scenario. Imagine a user writing to an AI helpdesk agent: "Hello, I haven't been able to connect to the VPN since this morning."

Step 1. The orchestrator receives the message and forwards it to the LLM, which identifies the intent: a VPN access issue, likely resolvable at the Level 1 support level.

Step 2. The orchestrator triggers a RAG search in the internal knowledge base. The system retrieves procedures related to VPN issues, the recent history of similar tickets, and any ongoing incidents.

Step 3. Armed with this context, the LLM identifies the most likely cause: an expired password (90% of cases, based on historical data).

Step 4. The orchestrator calls a tool: "Check the status of this user's password in Active Directory." The tool confirms: password expired.

Step 5. The orchestrator prompts the user: "Your password has expired. Would you like me to start the renewal process now?" A sensitive action like this one always requires human approval before it runs. Here, the user confirms.

Step 6. The orchestrator triggers the appropriate tool to initiate the reset and notifies the user when it is complete.

The whole process takes just a few seconds. At no point was a human technician involved, yet the user received a result that was executed in the actual system and logged as a regular ticket.
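The six steps above can be replayed as a short simulation. This is a sketch under loud assumptions: every function is a stub with hypothetical names and data, standing in for the LLM, the RAG search, and the Active Directory tools. It does not describe how any particular product is built. It only shows the order of operations and the approval gate before the sensitive action.

```python
def classify_intent(message: str) -> str:
    """Stub for step 1: a real agent would ask the LLM."""
    return "vpn_access" if "vpn" in message.lower() else "other"

def search_knowledge_base(intent: str) -> list[str]:
    """Stub for step 2: a real agent would run a RAG search."""
    knowledge = {"vpn_access": ["Most VPN failures are caused by an expired password."]}
    return knowledge.get(intent, [])

def password_expired(username: str) -> bool:
    """Stub for step 4: a real tool would query Active Directory."""
    return username == "jdoe"

def reset_password(username: str) -> str:
    """Stub for step 6: a real tool would trigger the reset."""
    return f"reset started for {username}"

def handle(message: str, username: str, user_approves: bool) -> list[str]:
    """Run the scenario and return a log of what happened."""
    log = []
    intent = classify_intent(message)                  # step 1
    log.append(f"intent={intent}")
    passages = search_knowledge_base(intent)           # step 2
    if not passages:
        log.append("escalated to a human")
        return log
    log.append(f"context={len(passages)} passage(s)")  # step 3 would pass these to the LLM
    if not password_expired(username):                 # step 4
        log.append("escalated to a human")
        return log
    log.append("password expired")
    if not user_approves:                              # step 5: approval gate
        log.append("user declined; no action taken")
        return log
    log.append(reset_password(username))               # step 6
    return log

for line in handle("Hello, I haven't been able to connect to the VPN since this morning.", "jdoe", True):
    print(line)
```

Note that the two paths where the stubs find nothing conclusive end in an escalation rather than a guess, and that the reset never runs without the user's confirmation.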

That’s exactly what’s happening on our internal helpdesk with Helpy. It has enabled us to reduce our monthly Level 1 support hours from 270 to 141, with 65% of tickets resolved by users on their own. You can find the full details of our experience in the Helpy case study, and the product overview on our AI helpdesk solutions page.

5 Questions to Ask an AI Service Provider

Now that you understand the structure, here are five specific questions that will immediately reveal whether you’re being offered a genuine agent or just a flimsy setup.

1. Which LLM do you use, and can I switch to a different one? A good provider will explain their choice of LLM (Claude, GPT, Mistral, or Gemini, depending on the situation) and allow you to switch, rather than locking you into a single technology.

2. How does the agent access my specific data? If the answer is "it relies on its general knowledge" or "you will provide it with the information," there is no RAG, and it is not a domain-specific agent.

3. What specific tools can the agent actually run in my IT system? Ask for the exact list: Active Directory, ITSM, CRM, ERP, email. If the answer is "it directs the user to the correct form," there is no function calling.

4. How does the agent handle sensitive actions? A well-designed AI agent proposes risky actions and allows a human to approve them. If it acts without safeguards, you are taking on operational and legal risks.

5. Who owns my knowledge base, my prompts, and my data? The answer should be "you." If the service provider retains ownership, you are in a position of dependency.

Asking these five questions at the start of your discussion will help you avoid most of the common pitfalls when making purchases in the AI market in 2026.

The Essentials

A domain-specific AI agent is an architecture composed of four building blocks: an LLM that understands and formulates, a RAG that anchors it in your data, tools that enable it to interact with your systems, and an orchestrator that coordinates the whole system. With MCP emerging as the de facto integration standard, these architectures are being deployed faster and more seamlessly than ever before.

Remove one building block, and you’re left with a tool that describes rather than acts, that talks rather than solves. The distinction isn’t theoretical: it shows up in the results, and it can be verified with a few targeted questions at the start of the project.

To understand which type of AI project best suits your needs (off-the-shelf tool, business agent, or strategic project), our in-depth article details the three levels of AI projects in the enterprise. For specific costs by level, see "How Much Does an AI Agent Cost in the Enterprise?"

Frequently asked questions

What is an LLM?

An LLM (Large Language Model) is a program trained to understand and generate text in natural language. Claude, GPT-4, Gemini, and Mistral are examples of LLMs. In an AI agent, the LLM interprets requests and formulates responses. But on its own, it only has access to its general knowledge: it must be connected to a RAG and other tools to become a true business agent.

What is RAG?

RAG stands for Retrieval-Augmented Generation. It is a technique that enriches a language model with your own data, stored in a vector database. When a question is asked, the system searches for relevant passages in this database and sends them to the model, which responds based on this specific information rather than on its general knowledge. This is what anchors the agent in your context and drastically reduces hallucinations.

What is an AI hallucination?

A hallucination is a plausible but incorrect response generated by a language model. It occurs when the LLM lacks the requested information and invents it rather than acknowledging its lack of knowledge. Hallucinations are one of the main problems with LLMs used on their own. A well-designed AI agent limits them through three mechanisms: RAG, which anchors responses in your real data; tools that verify information in real time; and an orchestrator that knows when the agent should escalate to a human rather than make something up.

What is function calling?

Function calling is the technique that allows an LLM to trigger external functions, that is, to perform actions within an information system rather than simply responding with text. In practice, the LLM receives a description of the available functions (checking permissions, opening a ticket, sending an email) and decides on its own when to call them and with which parameters; the surrounding application then executes the call and returns the result to the model. This is what transforms an LLM into an agent capable of taking action.

What is MCP in AI?

MCP, short for Model Context Protocol, is an open standard introduced in late 2024 that simplifies the connection between a language model and external tools. It standardizes the way an LLM "communicates" with a CRM, an ITSM, a database, or any other system. As a result, adding a new tool to an agent takes a few hours rather than a few days, and connectors become reusable across different LLMs. This is one of the factors that accelerated the deployment of AI agents in 2026.

What is a vector database?

A vector database is a specialized database that stores information in the form of numerical vectors representing their semantic content. It does not search by keywords but by semantic proximity: two sentences that mean the same thing but use different words will be identified as similar. This is the component that enables RAG to find relevant passages in your documentation, even when the user formulates their question using different words than those used in your documents.

What is the difference between an LLM and an AI agent?

An LLM is a component: the language model that understands and generates text. An AI agent is a complete architecture that combines an LLM with three other essential components: a RAG to access your specific data, tools to execute actions within your systems, and an orchestrator to coordinate the whole system. The LLM alone is a general-purpose assistant; the AI agent is a system that handles an end-to-end business process.

How long does it take to deploy an AI agent?

An AI agent designed for a specific use case typically takes four to twelve weeks to deploy, depending on its complexity. The determining factor is not the technology itself, but the state of your existing knowledge base and access to your systems. For details on costs and timelines, see our article “How Much Does an AI Agent Cost in the Enterprise?”

The next step

Do you want to assess how an AI project fits into your context, or evaluate a solution proposed by a vendor? Our 30-minute express audit provides a clear overview of your needs across the three levels of AI projects, along with a cost estimate. It’s free and there’s no obligation.

