Table of contents
- Introduction: The Rise of Agentic RAG and Intelligent Routing
- Agentic RAG Router: Directing Queries for Optimal Answers
- Hands-on with Agentic RAG Router (Single-Step) using Azure OpenAI and Azure AI Search
- Benefits of the Agentic RAG Router (Single-Step)
- Conclusion: Start Building Smarter Agents Today!
Introduction: The Rise of Agentic RAG and Intelligent Routing
Welcome to the first installment of my new blog series, "Agentic RAG Architectures to Build AI Agents using Azure AI Services"! Inspired by the insightful work from analyticsvidhya on the evolution of Agentic RAG systems, we're diving deep into practical architectures you can build today using the power of Azure AI.
If 2023 was the year of Retrieval-Augmented Generation (RAG) and 2024 saw the emergence of sophisticated Agentic RAG workflows, then 2025 is shaping up to be the Year of the AI Agent. These autonomous systems, capable of making decisions and executing tasks with minimal human intervention, are poised to revolutionize how we interact with information and build intelligent applications.
But as AI agents become more complex, the need for intelligent information routing becomes critical. How do we ensure our agents access the right knowledge at the right time? This is where Agentic RAG Routers come into play.
In this blog post, we'll explore the Agentic RAG Router (Single-Step) architecture, a foundational yet powerful approach for building AI agents that can dynamically choose between different knowledge sources to answer user queries effectively. We'll specifically focus on how you, as AI Engineers, can leverage Azure OpenAI's function calling lifecycle and Azure AI Search to implement this architecture and empower your customers.
Agentic RAG Router: Directing Queries for Optimal Answers
Imagine an AI agent as a helpful assistant. When you ask a question, this assistant needs to decide where to look for the answer. Should it consult internal documents, search the web, or perhaps query a database? An Agentic RAG Router equips your AI agent with this crucial decision-making ability.
At its core, an Agentic RAG Router is an intelligent system that dynamically routes user queries to the most appropriate tool or data source. This "routing" is driven by the agent's understanding of the query and the capabilities of the available tools. By combining retrieval mechanisms with the generative power of Large Language Models (LLMs), Agentic RAG Routers ensure users receive accurate and contextually rich responses.
The Single-Step Agentic RAG Router is the simplest form of this architecture, featuring a centralized agent responsible for all routing decisions. It's perfect for scenarios where you have a defined set of knowledge sources and want a streamlined approach to query handling.
Think of it like this:
Imagine a receptionist in a company with different departments (Sales, Support, Legal). A Single-Step Agentic RAG Router acts like a skilled receptionist who, upon receiving an incoming call (user query), instantly understands the caller's need and directly connects them to the appropriate department (tool/data source) in a single step.
In short, there is always the right tool for the job!
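To make the receptionist analogy concrete, here is a deliberately tiny, hypothetical Python sketch of single-step routing. In the real architecture below an LLM makes this decision; the keyword matching here is purely illustrative:

```python
def route(query: str) -> str:
    """Toy single-step router: one query in, one destination out, in a single decision."""
    q = query.lower()
    if "price" in q or "quote" in q:
        return "sales"
    if "contract" in q or "lawsuit" in q:
        return "legal"
    return "support"  # default department

print(route("Can I get a quote for 50 seats?"))  # sales
print(route("My login is broken"))               # support
```

The essential property is that routing happens exactly once per query; the rest of this post replaces the keyword check with an LLM and the "departments" with real search tools.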
Let's break down the architecture of a Single-Step Agentic RAG Router, as illustrated below for the use case we're going to tackle.
Hands-on with Agentic RAG Router (Single-Step) using Azure OpenAI and Azure AI Search
Let's walk through a practical example of building a Single-Step Agentic RAG Router using Azure services. We'll demonstrate how to create an agent that can intelligently choose between searching a private knowledge base in Azure AI Search and the public web via Bing Search, all orchestrated by Azure OpenAI function calling.
Use Case: FIFA Referee - Instant Access to Rules and Real-time Events

> Ever wonder how FIFA referees make those split-second decisions under immense pressure? As a passionate fan (GO Chelsea FC!), I often ponder how much easier their job would be with instant access to both real-time game data and the official FIFA rulebook. This need to quickly access and synthesize information from diverse sources is exactly the problem Agentic RAG Routers are designed to solve, enabling smarter decisions in complex scenarios, just like a top ref (hopefully!). For this blog post, we'll walk through a practical example focused on this scenario.
Prerequisites:
Before you begin, ensure you have the following Azure and environment configurations set up:
- **Azure OpenAI Service instance**: You'll need access to Azure OpenAI Service with a deployed chat model like `gpt-4o`, or a similar model that supports function calling. You can create one through the Azure portal. Note your `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_ENDPOINT`, and the deployment name for your chat model (`AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME`).
- **Azure AI Search service**: You need an Azure AI Search service instance with a pre-existing search index populated with your private data. For this example, we'll assume you have an index named `fifa-legal-handbook`. Note your `AZURE_SEARCH_SERVICE_ENDPOINT`, `AZURE_SEARCH_ADMIN_KEY`, and `SEARCH_INDEX_NAME`. You can follow the Azure AI Search documentation to create a service and index; I've simply created an index of the FIFA Legal Handbook.
- **Bing Search API key**: To enable web search, you'll need a Bing Search API subscription and API key. Obtain your `BING_SEARCH_API_KEY`.
- **Python environment**: Ensure you have Python installed along with the required packages. We provide a `requirements.txt` file to easily install the necessary libraries.
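For reference, a plausible `requirements.txt` for this walkthrough (package names inferred from the imports used below; pin versions as appropriate for your environment) would contain:

```text
openai
azure-search-documents
azure-core
requests
rich
```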
Let's get coding! We'll walk through the steps using a Jupyter Notebook format for clarity (the full notebook is available at azure-ai-agents-playground/samples/04-AGENTIC-RAG-ROUTERS/basic-agentic-rag.ipynb in the farzad528/azure-ai-agents-playground repository).
Step 1: Setup and Imports
First, we set up our Python environment by importing necessary libraries and configuring access to Azure services and Bing Search.
Install the dependencies first:

```
pip install -r requirements.txt
```

Then set up the imports, configuration, and the Azure OpenAI client:

```python
import json
import os

import requests
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.models import VectorizableTextQuery
from openai import AzureOpenAI
from rich.console import Console
from rich.panel import Panel

# Azure OpenAI configuration - IMPORTANT: set these as environment variables for security!
AZURE_OPENAI_API_KEY = os.getenv("AZURE_OPENAI_API_KEY", "your-azure-openai-api-key")
AZURE_OPENAI_API_VERSION = os.getenv("AZURE_OPENAI_API_VERSION", "2024-10-21")
AZURE_OPENAI_ENDPOINT = os.getenv("AZURE_OPENAI_ENDPOINT", "https://your-azure-openai-endpoint.openai.azure.com/")
AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME = os.getenv(
    "AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME", "gpt-4o"
)  # Update with your deployment name

# Azure AI Search configuration - IMPORTANT: set these as environment variables for security!
AZURE_SEARCH_ENDPOINT = os.getenv("AZURE_SEARCH_SERVICE_ENDPOINT", "https://your-search-service.search.windows.net")
AZURE_SEARCH_KEY = os.getenv("AZURE_SEARCH_ADMIN_KEY", "your-azure-search-key")
SEARCH_INDEX_NAME = "fifa-legal-handbook"  # Update with your index name

# Bing Search configuration - IMPORTANT: set these as environment variables for security!
BING_SEARCH_API_KEY = os.getenv("BING_SEARCH_API_KEY", "your-bing-search-api-key")
BING_SEARCH_ENDPOINT = "https://api.bing.microsoft.com/v7.0/search"

console = Console()

# Initialize the Azure OpenAI client
openai_client = AzureOpenAI(
    api_key=AZURE_OPENAI_API_KEY,
    api_version=AZURE_OPENAI_API_VERSION,
    azure_endpoint=AZURE_OPENAI_ENDPOINT,
)
```
Step 2: Define Search Functions (Tools for our Agent)
Next, we define the core search functions that our Agentic RAG Router will utilize as its "tools":
```python
def search_azure_ai_search(query: str) -> str:
    """
    Searches the Azure AI Search index 'fifa-legal-handbook' using a hybrid semantic approach.
    Retrieves legal rules, regulations, and disciplinary information from the FIFA Legal Handbook.
    Intended for assisting FIFA referees with queries about legal guidelines.
    """
    credential = AzureKeyCredential(AZURE_SEARCH_KEY)
    client = SearchClient(
        endpoint=AZURE_SEARCH_ENDPOINT,
        index_name=SEARCH_INDEX_NAME,
        credential=credential,
    )
    results = client.search(
        search_text=query,
        vector_queries=[
            VectorizableTextQuery(
                text=query, k_nearest_neighbors=50, fields="text_vector"
            )
        ],
        query_type="semantic",
        semantic_configuration_name="default",
        search_fields=["chunk"],
        top=50,
        include_total_count=True,
    )
    retrieved_texts = []
    for result in results:
        content = result.get("chunk", "")
        retrieved_texts.append(content)
    context_str = (
        "\n".join(retrieved_texts) if retrieved_texts else "No documents found."
    )
    console.print(
        Panel(f"Tool Invoked: Azure AI Search\nQuery: {query}", style="bold yellow")
    )
    return context_str


def search_bing(query: str) -> str:
    """
    Searches Bing using the Bing Search API.
    Returns a concatenated string of result snippets.
    """
    headers = {"Ocp-Apim-Subscription-Key": BING_SEARCH_API_KEY}
    params = {"q": query, "textDecorations": True, "textFormat": "Raw"}
    response = requests.get(BING_SEARCH_ENDPOINT, headers=headers, params=params)
    if response.status_code == 200:
        data = response.json()
        if "webPages" in data and "value" in data["webPages"]:
            snippets = [item.get("snippet", "") for item in data["webPages"]["value"]]
            result_text = "\n".join(snippets)
        else:
            result_text = "No Bing results found."
    else:
        result_text = f"Bing search failed with status code {response.status_code}."
    console.print(
        Panel(f"Tool Invoked: Bing Search\nQuery: {query}", style="bold magenta")
    )
    return result_text
```
- `search_azure_ai_search(query: str)`: Searches your private knowledge base indexed in Azure AI Search. It performs a hybrid search, leveraging both vector and semantic search capabilities for optimal retrieval of relevant chunks from your documents.
- `search_bing(query: str)`: Utilizes the Bing Search API to search the public web. It fetches snippets from web results, providing access to up-to-date, publicly available information.

Both functions are self-contained and use the `rich` library to print a panel whenever a tool is invoked, showing the query being processed. This is helpful for debugging and understanding the agent's behavior.
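To see what the Bing snippet-joining logic does without calling the API, here is the same parsing applied to a mocked response payload (the payload shape follows the Bing Web Search v7 `webPages.value[].snippet` structure; the snippet strings themselves are made up):

```python
# A mocked Bing Web Search v7 response body, for offline illustration only.
mock_data = {
    "webPages": {
        "value": [
            {"snippet": "FIFA confirms new agent regulations."},
            {"snippet": "CAS ruling backs FIFA."},
        ]
    }
}

# Identical parsing logic to search_bing(): join the snippets, or fall back.
if "webPages" in mock_data and "value" in mock_data["webPages"]:
    snippets = [item.get("snippet", "") for item in mock_data["webPages"]["value"]]
    result_text = "\n".join(snippets)
else:
    result_text = "No Bing results found."

print(result_text)
```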
Step 3: Define Function Schemas for OpenAI Function Calling
Now, we need to tell Azure OpenAI about the functions (tools) our agent can use. We do this by defining function schemas in JSON format. These schemas describe the function names, parameters, and descriptions, enabling the LLM to understand when and how to use these tools.
```python
# Define the function schema for Azure AI Search
azure_search_function_schema = {
    "name": "search_azure_ai_search",
    "description": "Searches the Azure AI Search index 'fifa-legal-handbook' for legal rules and regulations.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "The search query to use to retrieve legal documents from the FIFA handbook.",
            }
        },
        "required": ["query"],
    },
}

# Define the function schema for Bing Search
bing_search_function_schema = {
    "name": "search_bing",
    "description": "Searches the web using Bing to retrieve up-to-date information.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "The search query to use for Bing web search.",
            }
        },
        "required": ["query"],
    },
}

# Combine the function schemas into a list
functions = [
    azure_search_function_schema,
    bing_search_function_schema,
]
```
We define two schemas, one for each search function (`search_azure_ai_search` and `search_bing`). Each schema clearly describes:

- `name`: The name of the function (must match the Python function name).
- `description`: A concise description of what the function does. This is crucial for the LLM to understand the tool's purpose.
- `parameters`: The input parameters the function expects, including their `type` and `description`.

Finally, we combine these schemas into a `functions` list, which we'll pass to the Azure OpenAI client.
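As a quick sanity check (not part of the original notebook), you can verify that an arguments payload returned by the model satisfies a schema's `required` fields before invoking the tool. This is a minimal sketch using only the standard library; the `validate_args` helper is a hypothetical name:

```python
import json

# Abbreviated copy of the schema above, for a self-contained example.
azure_search_function_schema = {
    "name": "search_azure_ai_search",
    "parameters": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}

def validate_args(schema: dict, raw_arguments: str) -> dict:
    """Parse the model's JSON arguments string and check required fields are present."""
    args = json.loads(raw_arguments)
    missing = [f for f in schema["parameters"]["required"] if f not in args]
    if missing:
        raise ValueError(f"Missing required argument(s): {missing}")
    return args

args = validate_args(azure_search_function_schema, '{"query": "player conduct rules"}')
print(args["query"])  # player conduct rules
```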
Step 4: Agent Logic and Orchestration with OpenAI Function Calling
This is where the magic happens! We'll use Azure OpenAI's function calling capability to create the agentic logic. The LLM will act as the "router," deciding which search tool to use based on the user query and the function descriptions we provided.
```python
def agentic_rag_router_step(user_query):
    """
    Orchestrates the Agentic RAG Router in a single step using OpenAI function calling.
    The agent decides which search tool (Azure AI Search or Bing Search) to use
    based on the user query.
    """
    response = openai_client.chat.completions.create(
        model=AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME,
        messages=[{"role": "user", "content": user_query}],
        functions=functions,  # Pass the list of function schemas
        function_call="auto",  # Let OpenAI decide when to call a function
    )
    # Check whether the model decided to call a function
    if response.choices[0].message.function_call:
        function_call = response.choices[0].message.function_call
        function_name = function_call.name
        # Extract the function arguments
        function_args = json.loads(function_call.arguments)
        query_to_tool = function_args.get("query")
        console.print(
            Panel(f"[bold blue]Function Call:[/bold blue] {function_name}", style="blue")
        )
        # Call the appropriate function based on the model's choice
        if function_name == "search_azure_ai_search":
            search_results = search_azure_ai_search(query=query_to_tool)
        elif function_name == "search_bing":
            search_results = search_bing(query=query_to_tool)
        else:
            search_results = "Tool selection error."
        # Ask the LLM to generate a final answer using the search results
        second_response = openai_client.chat.completions.create(
            model=AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME,
            messages=[
                {"role": "user", "content": user_query},
                response.choices[0].message,  # The assistant's function call message
                {"role": "function", "name": function_name, "content": search_results},  # Tool result
            ],
        )
        return second_response.choices[0].message.content
    else:
        # If no function call was made, return the LLM's direct response
        return response.choices[0].message.content
```
In the `agentic_rag_router_step` function:

1. **Initial LLM call with function schemas**: We send the `user_query` to Azure OpenAI along with the `functions` list and `function_call="auto"`. This instructs the LLM to analyze the query and decide whether it needs to call any of the provided functions.
2. **Function call check**: We check whether the `response` from Azure OpenAI includes a `function_call`. If it does, the LLM has decided to use one of our tools.
3. **Extract function details**: We extract the `function_name` and `function_args` from the `function_call` object.
4. **Tool invocation**: Based on the `function_name` chosen by the LLM (either `search_azure_ai_search` or `search_bing`), we call the corresponding Python function with the extracted `query_to_tool` as input.
5. **Second LLM call for the final answer**: Crucially, we make a second call to Azure OpenAI. This time, we provide the original `user_query`, the LLM's initial `response` (which includes the function call decision), and the `search_results` from the tool execution as the `content` of a "function" message. This allows the LLM to synthesize the retrieved information and generate a final, coherent answer.
6. **Direct LLM response (no function call)**: If the initial LLM response doesn't include a `function_call`, the LLM believes it can answer the query directly without using any tools, so we return its text response as-is.
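One note on API surface: `functions`/`function_call` is the original function-calling interface, and newer OpenAI SDK releases expose the same capability through `tools` and `tool_choice`, where each schema is wrapped in a `{"type": "function", ...}` envelope and the model's choice arrives as `message.tool_calls`. A hedged sketch of the conversion (the commented-out call is illustrative, not from the original notebook):

```python
# The schemas defined earlier, abbreviated here for a self-contained example.
functions = [
    {
        "name": "search_azure_ai_search",
        "description": "Searches the FIFA legal handbook index.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
    {
        "name": "search_bing",
        "description": "Searches the web with Bing.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
]

# Wrap each legacy schema in the newer "tools" envelope.
tools = [{"type": "function", "function": schema} for schema in functions]

# The chat call then uses tools/tool_choice instead of functions/function_call:
# response = openai_client.chat.completions.create(
#     model=AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME,
#     messages=[{"role": "user", "content": user_query}],
#     tools=tools,
#     tool_choice="auto",
# )
# tool_call = response.choices[0].message.tool_calls[0]
# tool_call.function.name / tool_call.function.arguments replace function_call.name / .arguments
```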
Step 5: Testing the Agentic RAG Router
Let's put our Agentic RAG Router to the test with some example queries! These examples will showcase the agent's ability to route questions to the appropriate tool: Azure AI Search for private knowledge, or Bing Search for public web information.
```python
# Example queries to test the Agentic RAG Router
queries = [
    "What are the rules about player conduct in FIFA?",  # Should use Azure AI Search (FIFA handbook)
    "Who won the most recent World Cup?",  # Should use Bing Search (current events)
    "What are the most recent controversies in 2025 involving FIFA's Football Agent Regulations?",  # Likely Bing Search
    "What are the titles of the FIFA regulations that are contained in the FIFA legal handbook, and what are the edition years of those regulations?",  # Azure AI Search (specific doc context)
    "The UEFA Champions League Final is in Munich, what's the weather forecast that day?",  # Bing Search (real-time info)
]

for query in queries:
    console.rule(f"[bold green]User Query:[/bold green] {query}")
    agent_response = agentic_rag_router_step(query)
    console.print(Panel(agent_response, title="[bold blue]Agent Response[/bold blue]", style="blue"))
    print("\n")  # Extra newline for readability
```
Example 1: FIFA Regulations Query (Expected: Azure AI Search)
```python
# Example usage: I expect this query to trigger the Azure AI Search function, let's see!
agentic_rag_router_step("What are the titles of the FIFA regulations that are contained in the FIFA legal handbook, and what are the edition years of those regulations?")
```
```
Tool Invoked: Azure AI Search
Query: FIFA regulations titles and edition years

Function Called: search_azure_ai_search

Final Answer: The FIFA Legal Handbook contains several regulations, each with specific edition years. Here are some of the notable titles and their respective edition years:

1. **FIFA Anti-Doping Regulations** - 2019 Edition
2. **FIFA Match Agents Regulations** - 2003 Edition
3. **FIFA Disciplinary Code** - (Other editions not specified, but typically updated around every few years)
4. **Regulations Governing the Transfer of Players** - (Edition varies, often updated)
5. **FIFA Club Licensing Regulations** - (Edition varies, often updated)

The specific regulations can have multiple amendments and updates over different editions, and the exact edition year may vary for the regulations not explicitly listed. If you need more detailed information or specific articles from these regulations, please let me know!
```
As you can see, for this query focusing on FIFA regulations within the Legal Handbook, the agent correctly invoked the Azure AI Search tool and provided a response based on the retrieved information from your private knowledge base.
Example 2: Recent FIFA Controversies (Expected: Bing Search)
```python
# I expect the agent to use Bing Search for this query, let's see!
agentic_rag_router_step("What are the most recent controversies involving FIFA's Football Agent Regulations, from the last 3 months?")
```
```
Tool Invoked: Bing Search
Query: recent controversies FIFA Football Agent Regulations last 3 months

Function Called: search_bing

Final Answer: In the last three months, notable controversies have emerged regarding FIFA's Football Agent Regulations (FFAR), which officially came into effect on October 1, 2023. Here are the key points of these controversies:

1. **Court of Arbitration for Sport (CAS) Ruling**: In late July 2023, CAS ruled in favor of FIFA amid appeals from football agents challenging the legality of the new FFAR. This ruling reinforced FIFA's authority over the new regulations but did not quell the dissent among agents.

2. **Legal Challenges and Suspension**: Following a series of legal challenges by agents and agents' associations in various jurisdictions, FIFA announced the partial suspension of these regulations on December 30, 2023. The suspension was implemented pending a decision from the European Court of Justice after multiple injunctions were filed against the enforcement of certain rules.

3. **Key Disputed Elements**: The FFAR introduced controversial provisions, including caps on commissions and limits on agents representing multiple players in the same transaction. These aspects have drawn significant criticism from agents, leading to widespread legal opposition.

4. **Implementation Uncertainty**: Despite the regulations theoretically aiming to standardize the profession of football agents globally, their practical implementation has faced halts and skepticism, with national federations, such as the English FA, grappling with compliance and adaptation to the changing rules.

The landscape remains uncertain as FIFA must navigate ongoing legal disputes while seeking to re-establish regulations that address both regulatory objectives and the interests of football agents.
```
For this query about recent controversies, the agent correctly chose the Bing Search tool, recognizing that this type of information is best sourced from the live web.
Benefits of the Agentic RAG Router (Single-Step)
The Single-Step Agentic RAG Router, as demonstrated, offers several key advantages:
- **Simplicity and Centralized Control**: Its straightforward architecture with a single routing agent is easier to understand and manage, making it ideal for simpler applications with well-defined knowledge domains.
- **Efficiency for Targeted Queries**: For use cases where queries clearly fall into distinct categories (e.g., private vs. public knowledge), the single-step router efficiently directs queries to the appropriate tool, minimizing unnecessary searches.
- **Foundation for More Complex Architectures**: Understanding the Single-Step Router is crucial for grasping more advanced, multi-agent routing architectures that we will explore in future blog posts.
Conclusion: Start Building Smarter Agents Today!
The Agentic RAG Router (Single-Step) architecture provides a powerful yet accessible entry point into building intelligent AI agents. By leveraging Azure OpenAI's function calling and Azure AI Search, you can create agents that intelligently route queries, access the most relevant information, and deliver accurate and contextually rich responses to your users.
This is just the beginning of our "Agentic RAG Architectures" series. In upcoming posts, we'll dive into more sophisticated architectures like Query Planning Agentic RAG, Adaptive RAG, and more, showcasing how you can further enhance the intelligence and capabilities of your AI agents using Azure AI services.
Stay tuned for our next post, where we'll explore how to build an Agentic RAG Router (Multi-Step) system to tackle more complex, multi-step queries!