Mission Statement
The goal of this project was to develop an internal chatbot powered by a client’s own company data. Some AI services retain copies of the data they process, which exposes organisations to risks such as data breaches, loss of trust, and legal consequences. To address this, our objective was to build a chatbot that could securely access and use company information without storing a copy outside the client’s environment. Employees can then query the bot internally, for example about company policies, while the data stays within the organisation’s security and compliance boundaries.
Tools Used
Source Systems
Alteryx + Auto Insights - Cleaning, joining, transforming, output
AI models, e.g. ChatGPT, Azure AI Foundry, Microsoft Copilot
Power BI - Visualisation
Detailed Solution
Data Ingestion
Company information is gathered from multiple structured and unstructured repositories, including:
Documents (Word, PDF, Excel, etc.)
SAP systems
Finance applications
SharePoint sites
Data Preparation (Alteryx)
Cleaning: Remove inconsistencies, irrelevant fields, and formatting issues.
Deduplication: Identify and eliminate duplicate records to ensure data accuracy.
Enrichment: Augment data with additional context (e.g., metadata, tags, business classifications).
Unification: Standardise data from all sources into a common data model for consistency.
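The four preparation steps above were built as an Alteryx workflow. Purely as an illustration of the logic, they can be sketched in plain Python. The `Record` model, the `FIELD_ALIASES` mapping and the `TAG_KEYWORDS` rules are hypothetical names invented for this sketch and are not taken from the client workflow.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Record:
    """Hypothetical common data model shared by every source."""
    source: str
    title: str
    body: str
    tags: tuple = ()


# Assumed mapping from source-specific field names to the common model.
FIELD_ALIASES = {"name": "title", "subject": "title", "content": "body", "text": "body"}

# Assumed keyword rules used to attach business classifications.
TAG_KEYWORDS = {"hr": ("leave", "holiday"), "finance": ("invoice", "expense")}


def clean(raw: dict) -> dict:
    """Cleaning: drop empty fields and collapse stray whitespace."""
    cleaned = {}
    for key, value in raw.items():
        if value is None:
            continue
        text = " ".join(str(value).split())
        if text:
            cleaned[key] = text
    return cleaned


def unify(source: str, raw: dict) -> Record:
    """Unification: rename source-specific fields and build a Record."""
    fields = {FIELD_ALIASES.get(k.lower(), k.lower()): v for k, v in clean(raw).items()}
    return Record(source=source, title=fields.get("title", ""), body=fields.get("body", ""))


def deduplicate(records: list) -> list:
    """Deduplication: keep the first record for each (title, body) pair, ignoring case."""
    seen = set()
    unique = []
    for record in records:
        key = (record.title.casefold(), record.body.casefold())
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def enrich(record: Record) -> Record:
    """Enrichment: tag the record when any keyword appears in its text."""
    text = f"{record.title} {record.body}".lower()
    tags = tuple(sorted(
        tag for tag, words in TAG_KEYWORDS.items() if any(word in text for word in words)
    ))
    return replace(record, tags=tags)


def prepare(raw_rows: list) -> list:
    """Run the steps in order over (source, raw_dict) pairs."""
    unified = [unify(source, raw) for source, raw in raw_rows]
    return [enrich(record) for record in deduplicate(unified)]
```

Calling `prepare` with a list of `(source, raw_dict)` pairs returns one tagged `Record` per distinct document. Deduplicating after unification means the same policy exported from two systems under different field names is still recognised as a duplicate.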
Data Storage
SQL Database: For structured datasets.
Azure Blob Storage: For scalable, cost-effective unstructured data storage.
Foundry Dataset: For seamless integration into the client’s data ecosystem.
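As a rough sketch of how prepared items might be routed between structured and unstructured storage, the example below uses an in-memory SQLite database in place of the SQL database and a small dictionary-backed class in place of Azure Blob Storage. Both stand-ins, the `policies` table and the `store` function are assumptions made for illustration; a real deployment would use the respective client libraries.

```python
import sqlite3


class InMemoryBlobStore:
    """Hypothetical stand-in for an object store such as Azure Blob Storage."""

    def __init__(self) -> None:
        self._blobs: dict = {}

    def put(self, name: str, data: bytes) -> None:
        self._blobs[name] = data

    def get(self, name: str) -> bytes:
        return self._blobs[name]


def open_structured_store() -> sqlite3.Connection:
    """Create an in-memory SQL store with an assumed example table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE policies (id TEXT PRIMARY KEY, title TEXT NOT NULL, owner TEXT)"
    )
    return conn


def store(item: dict, conn: sqlite3.Connection, blobs: InMemoryBlobStore) -> str:
    """Route one prepared item and report which store received it."""
    if isinstance(item.get("content"), bytes):
        # Unstructured payloads (PDF, Word, and so on) go to blob storage.
        blobs.put(item["id"], item["content"])
        return "blob"
    # Everything else is treated as a structured row.
    conn.execute(
        "INSERT OR REPLACE INTO policies (id, title, owner) VALUES (?, ?, ?)",
        (item["id"], item["title"], item.get("owner")),
    )
    return "sql"
```

Parameterised queries are used for the insert so that field values coming from source systems are never interpolated into SQL text.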
AI Indexing
Vector Database: Generate embeddings from the prepared data and index them for semantic search and retrieval.
Candidate options: Pinecone, FAISS, or Foundry’s native vector services.
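To make the indexing step concrete, here is a minimal sketch of an embed, index and search loop. It is not the Pinecone, FAISS or Foundry API. The `embed` function is a toy term-frequency vector that only matches on shared words, standing in for a real embedding model, and `VectorIndex` is a hypothetical brute-force index written for this example.

```python
import math
import re
from collections import Counter


def embed(text: str) -> dict:
    """Toy embedding: L2-normalised term frequencies (a real model would go here)."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {term: c / norm for term, c in counts.items()} if norm else {}


def cosine(a: dict, b: dict) -> float:
    """Cosine similarity of two already-normalised sparse vectors."""
    if len(a) > len(b):
        a, b = b, a
    return sum(weight * b.get(term, 0.0) for term, weight in a.items())


class VectorIndex:
    """Hypothetical brute-force index; production stores use approximate search."""

    def __init__(self) -> None:
        self._items = []  # list of (doc_id, text, vector)

    def add(self, doc_id: str, text: str) -> None:
        self._items.append((doc_id, text, embed(text)))

    def search(self, query: str, k: int = 3) -> list:
        """Return up to k (score, doc_id, text) tuples, best match first."""
        q = embed(query)
        scored = [(cosine(q, vec), doc_id, text) for doc_id, text, vec in self._items]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:k]
```

In use, each prepared document (or chunk of one) is passed to `add`, and `search` returns the closest matches for an employee's question. Swapping `embed` for a real embedding model is what turns this word-overlap lookup into semantic search.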
Query Layer
LLM Integration:
Azure OpenAI Service: Secure and scalable generative AI.
Copilot Studio: For building conversational interfaces.
Foundry LLM Integration: Tight coupling with enterprise data governance and compliance.
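The query layer can be pictured as retrieval-grounded prompting: the passages returned by the index are placed in the prompt and the model is told to answer only from them. The sketch below assumes the model is reachable through a plain callable, `llm`, that takes a prompt string and returns text. That callable, `SYSTEM_RULES`, `build_prompt` and `answer` are hypothetical names for this illustration and do not correspond to the Azure OpenAI, Copilot Studio or Foundry interfaces.

```python
from typing import Callable, Sequence

# Assumed instruction text; the wording is illustrative only.
SYSTEM_RULES = (
    "Answer using only the numbered context passages. "
    "If the context does not contain the answer, say you do not know."
)


def build_prompt(question: str, passages: Sequence) -> str:
    """Assemble a grounded prompt from (source_id, text) passages."""
    context = "\n".join(
        f"[{i}] ({source_id}) {text}"
        for i, (source_id, text) in enumerate(passages, start=1)
    )
    return f"{SYSTEM_RULES}\n\nContext:\n{context}\n\nQuestion: {question}\nAnswer:"


def answer(question: str, passages: Sequence, llm: Callable[[str], str]) -> dict:
    """Return the model's reply together with the sources it was shown."""
    if not passages:
        # Nothing was retrieved, so do not call the model at all.
        return {
            "answer": "I do not know; no matching company content was found.",
            "sources": [],
        }
    reply = llm(build_prompt(question, passages))
    return {"answer": reply.strip(), "sources": [source_id for source_id, _ in passages]}
```

Returning the source identifiers alongside the answer lets the chat interface show employees which documents a reply was based on, and skipping the model call when nothing is retrieved avoids ungrounded answers.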
Output & Consumption
Delivery Channels:
Microsoft Teams: Chat-based Q&A for employees.
Dashboards: Insights embedded in BI platforms.
Copilots: Tailored digital assistants for specific business functions.
Impact
Business-ready, actionable answers delivered directly within existing workflows.
Natural-language questions answered against the indexed company data.
Fast, context-aware retrieval of information at query time.
A single source of truth built on cleaned, enterprise-ready data.