# Implementing Retrieval-Augmented Generation (RAG) in Azure AI Foundry w/ Microsoft Fabric as a Data Source

The RAG tooling ecosystem has exploded: LangChain, LlamaIndex, and dozens of specialized vector databases each solve pieces of the puzzle well. But enterprise implementations aren't just about technical capabilities; they're about reducing complexity, maintaining security boundaries, and scaling reliably. When your organization has already invested in the Microsoft stack, Azure AI Foundry with Fabric offers something beyond technical merit: a depth of integration that external tools struggle to match.

*Here’s what we’re setting out to achieve in the final phase…*

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1755595571288/a08daecc-029e-4d4c-924d-4e885914272a.png align="center")

## Understanding RAG

If you've been keeping an eye on AI trends, you've probably heard of **RAG**, short for **Retrieval-Augmented Generation**. It's a method for giving large language models (LLMs) access to your own data, so they can answer questions with **accurate, up-to-date, and context-rich information** instead of relying only on what they were trained on.

## How RAG Works

RAG creates a powerful synergy between your data and LLM capabilities. When a user submits a question, the system searches your data repository to find relevant information. The user question is then combined with matching results and sent to the LLM through a carefully crafted prompt. The LLM uses both the question and retrieved context to generate comprehensive, data-driven answers.
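
To make the "carefully crafted prompt" step concrete, here is a minimal sketch of how a question and retrieved passages might be combined. The function name `build_rag_prompt`, the prompt wording, and the sample passages are all illustrative assumptions, not what Azure AI Foundry uses internally.

```python
def build_rag_prompt(question: str, passages: list[str]) -> str:
    """Combine the user's question with retrieved passages into one prompt."""
    # Number each passage so the model can refer back to its sources
    context = "\n".join(f"[{i}] {text}" for i, text in enumerate(passages, start=1))
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Hypothetical retrieved passages, standing in for real search results
passages = [
    "Makers build agents in managed Power Platform environments.",
    "Data Loss Prevention policies restrict which connectors agents may use.",
]
prompt = build_rag_prompt("How are agent connectors restricted?", passages)
print(prompt)
```

Instructing the model to rely only on the supplied context is what keeps answers grounded in your data rather than in the model's training set.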

[![Image Courtesy : Microsoft](https://learn.microsoft.com/en-us/azure/ai-foundry/media/index-retrieve/rag-pattern.png align="left")](https://learn.microsoft.com/en-us/azure/search/retrieval-augmented-generation-overview?tabs=docs)

## The Core Components: Embeddings, Indexes, and Vector Databases

**Embeddings** are numerical representations of text created by specialized AI models. These models convert words, sentences, or documents into high-dimensional vectors (arrays of numbers) that capture semantic meaning. Similar concepts produce similar vectors: "car" and "automobile" map to vectors that lie close together, despite being different words.
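
Closeness between embeddings is usually measured with cosine similarity. The sketch below uses made-up three-dimensional vectors purely for illustration; a real model such as text-embedding-ada-002 returns 1,536 dimensions, and the numbers here are not actual model output.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up toy "embeddings" (assumed values, not produced by any model)
car = [0.9, 0.1, 0.3]
automobile = [0.85, 0.15, 0.35]
banana = [0.1, 0.9, 0.2]

print(cosine_similarity(car, automobile))  # close to 1.0
print(cosine_similarity(car, banana))      # noticeably lower
```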

**Indexes** serve as the foundation for efficient data retrieval. They're specialized data structures that enable fast, accurate searches across your information repository. RAG indexes combine multiple search methods:

* **Keyword searches** for exact term matching
    
* **Semantic searches** for conceptual similarity
    
* **Vector searches** for nuanced content relationships
    
* **Hybrid approaches** that merge these capabilities
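
One common way to merge keyword and vector results is Reciprocal Rank Fusion (RRF), which Azure AI Search documents as its scoring method for hybrid queries. The following is a minimal sketch of the idea only; the document IDs are hypothetical and `k = 60` is a commonly used constant assumed here, not a value read from the service.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into a single ranking."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Documents near the top of any list contribute a larger share
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical results for one query from two different retrievers
keyword_hits = ["doc-a", "doc-b", "doc-c"]
vector_hits = ["doc-c", "doc-a", "doc-d"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)  # ['doc-a', 'doc-c', 'doc-b', 'doc-d']
```

Documents that appear in both lists rise to the top, which is why hybrid search tends to be more robust than either method alone.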
    

**Vector Databases** are storage systems optimized specifically for managing and searching embeddings. Unlike traditional databases that rely on exact matches, vector databases excel at similarity searches across millions of vectors in milliseconds.

## Typical RAG Workflow

1. **Document Processing**: Your documents are broken into chunks and converted into embeddings
    
2. **Storage**: These embeddings are stored in a vector database (your searchable index)
    
3. **Query Processing**: User questions are converted into embeddings using the same model
    
4. **Retrieval**: The vector database finds the most <mark>semantically</mark> similar content
    
5. **Generation**: The retrieved context and the original question are sent to the LLM, which generates an accurate, contextual answer (most often using a different model from the one that produced the embeddings)
    

This approach ensures responses are grounded in your actual data while leveraging the LLM's natural language capabilities.
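
The first four steps can be sketched end to end in plain Python. This is a toy illustration under loud assumptions: `embed` is a bag-of-words stand-in for a real embedding model, the "index" is an in-memory list rather than a vector database, and the sample text is invented.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, size: int = 12, overlap: int = 4) -> list[str]:
    """Split text into windows of `size` words that overlap by `overlap` words."""
    words = text.split()
    step = size - overlap
    starts = range(0, max(len(words) - overlap, 1), step)
    return [" ".join(words[i:i + size]) for i in starts]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


document = (
    "Data Loss Prevention policies limit which connectors an agent can use. "
    "Environments separate development, test, and production workloads. "
    "Pipelines promote agents between environments after review."
)

# Steps 1-2: chunk the document and store (chunk, vector) pairs as the "index"
index = [(chunk, embed(chunk)) for chunk in chunk_text(document)]

# Steps 3-4: embed the question the same way and rank chunks by similarity
question = "Which policies limit agent connectors?"
query_vector = embed(question)
ranked = sorted(index, key=lambda item: cosine(query_vector, item[1]), reverse=True)

# Step 5 would send the top chunk(s) plus the question to the LLM
top_chunk = ranked[0][0]
print(top_chunk)
```

Overlapping chunks reduce the chance that a sentence relevant to the question is split across a chunk boundary.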

**RAG patterns that include Azure AI Search**

Azure AI Search provides scalable search infrastructure that indexes diverse content types and enables retrieval through APIs, applications, and AI agents. The platform offers native integrations with Azure's AI ecosystem, including Azure OpenAI, Azure AI Foundry, and Azure Machine Learning, while supporting extensible architectures for third-party and open-source model integration.

[![Architecture diagram of information retrieval with search and ChatGPT.](https://learn.microsoft.com/en-us/azure/search/media/retrieval-augmented-generation-overview/architecture-diagram.png align="left")](https://learn.microsoft.com/en-us/azure/search/retrieval-augmented-generation-overview?tabs=docs)

![RAG](https://microsoft.github.io/aitour-build-a-copilot-on-azure-ai/img/rag-design-pattern.png align="left")

Let’s dive into the practical implementation, which covers both a UI-based approach and a code-first approach.

> Note: I have omitted minor or commonly known details, focusing instead on explaining the RAG pattern and its core setup and flow.

## Approach 1: Low/No-Code Solution

1. Create an Azure OpenAI resource  
    The initial step is to create an OpenAI resource in Azure, as illustrated below.
    
    ![](https://cdn.hashnode.com/res/hashnode/image/upload/v1755518102864/58061e13-29c0-46bd-b5ac-9772dd27b4c6.png align="center")
    
2. Navigate to the Azure AI Foundry portal and create an **embedding** deployment. You may select any base embedding model; in this example, *text-embedding-ada-002* has been selected. (text-embedding-ada-002 is part of the GPT-3 model family.)
    
    ![](https://cdn.hashnode.com/res/hashnode/image/upload/v1755497554056/fdb59f9e-d9d1-43ea-b4a6-b8d249b772d0.png align="center")
    
3. Jump back to the Azure portal and create an **Azure AI Search** resource. (When provisioning AI Search, be mindful of the cost of the resource.)
    
    ![](https://cdn.hashnode.com/res/hashnode/image/upload/v1755532489587/35baba0e-2efd-4c17-b50a-28190622e8ee.png align="center")
    
4. Connect the document folder within the Fabric Lakehouse Files area and vectorize the data. Once you navigate to the AI Search resource you created, click the option shown below.
    
    ![](https://cdn.hashnode.com/res/hashnode/image/upload/v1755498288243/f1faab74-185b-487d-beea-652a55c8f4c2.png align="center")
    
5. Select Fabric OneLake files as the source for vectorization. The document I am using is purely text-based: the [**Microsoft whitepaper on AI governance**](https://adoption.microsoft.com/files/copilot-studio/Agent-governance-whitepaper.pdf), which is 32 pages long. I uploaded it to a new folder in the Fabric Lakehouse Files section.
    
    ![](https://cdn.hashnode.com/res/hashnode/image/upload/v1755597751964/f6ad83de-d645-4078-ac89-f8330b99be94.png align="center")
    
    ![](https://cdn.hashnode.com/res/hashnode/image/upload/v1755498327964/34ed8285-3959-4921-a222-079228c23539.png align="center")
    
6. Select **‘RAG’**
    
    ![](https://cdn.hashnode.com/res/hashnode/image/upload/v1755498405914/1130125f-3b5b-4370-9046-ab0e39e3163a.png align="center")
    
7. Next, specify the Fabric ‘Lakehouse’ URL along with the folder path where the files are located.
    
    ![](https://cdn.hashnode.com/res/hashnode/image/upload/v1755518501502/18b4ca94-8347-4aa8-a5de-98f4765abaa1.png align="center")
    
8. After completing these steps, navigate to the **‘*Indexes’*** section in the **Azure AI Search** portal. You will notice that an index is created within a short period of time.
    
    ![](https://cdn.hashnode.com/res/hashnode/image/upload/v1755498707517/1a0dd6b1-77d0-4440-8f33-39e37d762ffe.png align="center")
    
9. Now we can create a new **agent** in Azure AI Foundry, attach the **Azure AI Search** resource we created as a knowledge tool, and explore the data in the **agent playground**.
    
    ![](https://cdn.hashnode.com/res/hashnode/image/upload/v1755499887646/73dab9dc-2bc2-49ef-8890-9967a64e1f24.png align="center")
    

**For the search type I have selected Hybrid search (Vector+Keyword)**

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1755596100309/1934494c-34d5-4829-add5-21f551e1717d.png align="center")

**Finally, try it in the agent playground 🏐**

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1755596040250/a31a2d4c-79bc-4cfe-aac5-b452f655a80d.png align="center")

## Showcase **\- \[ 01 \] with Agents playground UI**

The response cites the expected reference (**“AI Governance whitepaper.pdf”**), which indicates that the answer was grounded in the indexed document rather than in external sources.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1755500187101/046599c2-8527-4dd1-a384-13c8ef387dae.png align="center")

> Tool calling succeeded, as confirmed within the thread status.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1755595928845/a892c670-22b4-4d80-b009-005b715d9c47.png align="center")

<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">Up to this point, we have explored how to create a RAG pattern using a low-code/no-code solution. We will now proceed with the code-first approach, which offers more flexibility and customization options for developers who want fine-grained control over their RAG implementation.</div>
</div>

---

## Approach 2: Code-First Implementation

### Azure AI Foundry SDK / AI Projects client library

We'll primarily use the <mark>AI Projects client library</mark>, which is part of the <mark>Azure AI Foundry SDK</mark> and provides easy access to resources in your Azure AI Foundry project. (The `azure.ai.agents.models` module, from the companion azure-ai-agents package, provides the agent-related model classes.)

<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">Following Microsoft’s Build announcements in May 2025, numerous enhancements were introduced to the <strong>Azure AI Foundry “Projects” library</strong>. If you're using an earlier version, upgrading to the latest release is highly recommended. Also note that I am using the new Foundry-based project path, not the legacy hub-based project path.</div>
</div>

**Install the libraries** - (Note that the dependent package [azure-ai-agents](https://pypi.org/project/azure-ai-agents/) will be installed as a result, if not already installed, to support `.agents` operations on the client.)

```python
!python -m pip install --upgrade pip
!pip install --upgrade azure-ai-projects azure-identity python-dotenv
```

The code below loads configuration values (client ID, client secret, tenant ID, and project endpoint) from an external `.env` file (`api_settings.env`) into Python variables.

```python
import os
from dotenv import load_dotenv
load_dotenv('api_settings.env')
CLIENT_ID = os.getenv("CLIENT_ID")
CLIENT_SECRET = os.getenv("CLIENT_SECRET")
TENANT_ID = os.getenv("TENANT_ID")
PROJECT_ENDPOINT = os.getenv("PROJECT_ENDPOINT")
```
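
Because `os.getenv` quietly returns `None` for a missing key, it can help to fail fast before any Azure call is made. The helper below is a hypothetical addition (the name `missing_settings` is not part of any SDK) that reports which of the four settings above are unset or empty.

```python
import os
from typing import Mapping, Optional

REQUIRED_SETTINGS = ("CLIENT_ID", "CLIENT_SECRET", "TENANT_ID", "PROJECT_ENDPOINT")


def missing_settings(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the names of required settings that are unset or empty."""
    source = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not source.get(name)]


# Example with an explicit mapping; call missing_settings() to check the real environment
example = {"CLIENT_ID": "abc", "TENANT_ID": "xyz", "PROJECT_ENDPOINT": ""}
print(missing_settings(example))  # ['CLIENT_SECRET', 'PROJECT_ENDPOINT']
```

Calling `missing_settings()` right after `load_dotenv(...)` and raising an error when the list is non-empty gives a clearer message than an authentication failure later on.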

The script below demonstrates the following steps.

* **Authentication and Setup** – The script retrieves Azure credentials (Tenant ID, Client ID, Client Secret) from environment variables and authenticates using `ClientSecretCredential`. It then initializes the `AIProjectClient` with the given project endpoint.
    

> When using the Azure AI Projects library with Microsoft Entra ID authentication, this walkthrough authenticates with a service principal (app registration). To enable this, assign the **Azure AI User** role (a built-in Azure RBAC role) on the Azure AI Foundry resource to that service principal. This gives it permission to access AI projects through Microsoft Entra ID.

* **Connection Discovery** – It queries the list of configured project connections and identifies the **Azure AI Search** resource by inspecting connection metadata, extracting the corresponding `connection_id`.
    
* **Tool Initialization** – An `AzureAISearchTool` instance is created using the connection ID and a predefined index name, enabling integration of search capabilities into the agent.
    
* **Agent Creation** – A new agent is provisioned with the GPT-4.1-mini model, custom instructions for Retrieval-Augmented Generation (RAG) queries, and the AI Search tool attached as a resource.
    
* **Thread Creation** – A dedicated conversation thread is established to maintain the dialogue state with the agent.
    
* **Interactive Execution Loop** –
    
    * User inputs are continuously read from the console.
        
    * Inputs are posted as messages to the agent.
        
    * The agent is executed (`create_and_process`), responses are retrieved, and results are displayed.
        
    * The loop terminates when the user enters `"end"`, after which the agent is deleted for cleanup.
        

```python
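import os
from dotenv import load_dotenv
from azure.ai.projects import AIProjectClient
from azure.identity import ClientSecretCredential
from azure.ai.agents.models import AzureAISearchTool

# Entry point for the script
if __name__ == "__main__":
    # Load settings from the .env file so the script also runs standalone
    load_dotenv('api_settings.env')

    # Load Azure credentials from environment variables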
    credential = ClientSecretCredential(
        tenant_id=os.getenv("TENANT_ID"),
        client_id=os.getenv("CLIENT_ID"),
        client_secret=os.getenv("CLIENT_SECRET")
    )

    # Create a client instance for interacting with the Azure AI Project
    project_client = AIProjectClient(
        credential=credential,
        endpoint=os.environ["PROJECT_ENDPOINT"]  # Project endpoint reference
    )

    # Retrieve available project connections to locate the AI Search resource
    conn_list = project_client.connections.list()
    conn_id = ""

    # Identify the Azure AI Search connection by checking metadata fields
    # Assumes only one AI Search connection is configured in the project
    for conn in conn_list:
        properties = conn.get("properties", {})
        metadata = properties.get("metadata", {})
        if metadata.get("type", "").upper() == "AZURE AI SEARCH":
            conn_id = conn["id"]
            break

    # Stop early if no AI Search connection was found, rather than
    # silently falling back to whichever connection was listed last
    if not conn_id:
        raise RuntimeError("No Azure AI Search connection found in this project")

    print(conn_id)

    # Configure the Azure AI Search tool with the discovered connection and index name
    ai_search = AzureAISearchTool(index_connection_id=conn_id, index_name="rag-1754502262882")

    # Provision an AI Agent and attach the AI Search tool for RAG-style queries
    agent = project_client.agents.create_agent(
        model="gpt-4.1-mini",
        name="my-agent-aisearch",
        instructions="You are a helpful agent to perform RAG Queries using AI Search",
        tools=ai_search.definitions,
        tool_resources=ai_search.resources,
    )
    print(f"Created agent, ID: {agent.id}")

    # Initialize a new conversation thread for maintaining dialog state
    thread = project_client.agents.threads.create()
    print(f"Created thread, ID: {thread.id}")

    # Continuous user interaction loop
    while True:
        user_input = input("User: ")
        if user_input.lower() == "end":
            # The agent is deleted once, in the cleanup step after the loop
            print("Ending the conversation.")
            break

        # Post user input as a message to the agent
        message = project_client.agents.messages.create(
            thread_id=thread.id,
            role="user",
            content=user_input,
        )

        # Run the agent and process the response
        run = project_client.agents.runs.create_and_process(thread_id=thread.id, agent_id=agent.id)
        print(f"Run finished with status: {run.status}")

        if run.status == "failed":
            print(f"Run failed: {run.last_error}")
            break

        # Fetch the thread's messages and print the last text part of each, with its role
        messages = project_client.agents.messages.list(thread_id=thread.id)
        for agent_response in messages:
            if agent_response.text_messages:
                last_text = agent_response.text_messages[-1]
                print(f"{agent_response.role}: {last_text.text.value}")

    # Final cleanup of the agent instance
    project_client.agents.delete_agent(agent.id)
    print("Conversation ended")
```

## Showcase **\- \[ 02 \] locally in Visual Studio Code**

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1755543484412/95b72106-55ad-456c-a7f9-9deb638678a3.gif align="center")

## Showcase **\- \[ 03 \] using Streamlit Frontend**

> **Streamlit** is an open-source Python framework that makes it easy to build interactive web apps for data and machine learning projects.

📤I deployed my project on **Streamlit Community Cloud** and adapted the code to work seamlessly with the **Streamlit** frontend. This enables me to access the app from anywhere through a customized, user-friendly interface.

By the way, what key information are you looking to extract from that [Microsoft whitepaper?](https://adoption.microsoft.com/files/copilot-studio/Agent-governance-whitepaper.pdf) 📄🤔

I’ve made it public 🌍, so feel free to give it a try 🚀.

**Sample questions you can try:**

* What specific governance controls do we need to implement for each type of agent creator (End Users, Makers, and Developers) in our organization?
    
* How can we effectively use Microsoft Purview's Data Loss Prevention policies to prevent our agents from accessing highly confidential SharePoint content?
    
* What's the best approach for setting up Power Platform environments and pipelines to ensure secure agent development and deployment across our teams?
    

> Link : [https://ragaifoundry.streamlit.app/](https://ragaifoundry.streamlit.app/)  
> If you’re unable to access the working site, it’s probably because my Azure free credits ran out 💳⚡. You can still find all my project assets in the References section below 📂

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1755580601209/6821edc3-1fbc-4dd3-89d2-4c7cfcf4becd.gif align="center")

## Key Takeaways

Building enterprise RAG solutions with Azure AI Foundry reveals several crucial insights: prioritize data quality over complexity, leverage hybrid search for optimal results, and choose between low-code UI approaches for rapid prototyping or code-first methods for customization. Most importantly, start small and scale gradually while continuously monitoring performance.

## Conclusion

Azure AI Foundry transforms enterprise RAG implementation from complex to accessible. Whether you prefer UI-based configuration, programmatic development, or web deployment, the platform provides enterprise-grade security and scalability without sacrificing ease of use.

The three approaches demonstrated (***Agent playground, Azure AI Foundry SDK, and Streamlit web app***) showcase the platform's flexibility for different team needs and technical requirements. As organizations increasingly need AI solutions that integrate seamlessly with existing Microsoft infrastructure, Azure AI Foundry offers both immediate value and long-term strategic advantages.

---

## References

* [Azure AI Search - Retrieval Augmented Generation Overview](https://learn.microsoft.com/en-us/azure/search/retrieval-augmented-generation-overview?tabs=docs)
    
* [Azure AI Foundry - RAG Concepts](https://learn.microsoft.com/en-us/azure/ai-foundry/concepts/retrieval-augmented-generation)
    
* [Microsoft AI Agents Governance Whitepaper](https://adoption.microsoft.com/files/copilot-studio/Agent-governance-whitepaper.pdf)
    
* [Project Assets](https://github.com/nalakan/AI_Foundry_rag_doc_search?tab=readme-ov-file#rag-based-intelligent-document-search-using-azure-ai-foundrysearch)
    

---

*Thanks for reading! If you found this guide helpful, consider sharing it with others who might benefit from implementing RAG solutions in their organizations with Azure AI Foundry.*
