SqlDatabaseVectorSearch
A Blazor Web App and Minimal API for performing RAG (Retrieval Augmented Generation) and vector search using the native VECTOR type in Azure SQL Database and Azure OpenAI.
SQL Database Vector Search Sample
Table of Contents
- Overview
- Screenshots
- Prerequisites
- Project Structure
- Setup
- Supported Features
- How to Use
- Limitations & FAQ
- Contributing
- License
Overview
This application allows you to:
- Load documents (PDF, DOCX, TXT, MD)
- Generate embeddings and save them as vectors in Azure SQL Database
- Perform semantic search and RAG using Azure OpenAI
- Interact via a Blazor Web App or programmatically via Minimal API
Embeddings and chat completion are powered by Semantic Kernel.
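As an illustration of what semantic search over embeddings means, the sketch below ranks text chunks by cosine distance to a query vector. It is a toy, in-memory version: in the application the vectors are stored in Azure SQL Database using the VECTOR type, and the names `cosine_distance`, `top_k` and `CHUNKS` are hypothetical, not taken from the project.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query, chunks, k=2):
    """Return the texts of the k chunks closest to the query vector."""
    ranked = sorted(chunks, key=lambda chunk: cosine_distance(query, chunk[1]))
    return [text for text, _ in ranked[:k]]


# Toy 3-dimensional "embeddings" (assumed values, for illustration only).
# Real embeddings have hundreds or thousands of dimensions.
CHUNKS = [
    ("Mars is covered in iron oxide dust.", [0.9, 0.1, 0.0]),
    ("Jupiter is a gas giant.", [0.1, 0.9, 0.0]),
    ("Rust gives Mars its red color.", [0.8, 0.2, 0.1]),
]
```

In a RAG flow, the chunks returned by a search like this are what gets passed to the chat model as context for the answer.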
Screenshots
Web App

Web API

Prerequisites
- .NET 10 SDK
- Azure SQL Database
- Azure OpenAI resource and API keys
Project Structure
- SqlDatabaseVectorSearch/: Main Blazor Web App and API
  - Components/: Blazor UI components
  - Data/: EF Core context, migrations, and entities
  - Endpoints/: Minimal API endpoints
  - Services/: Business logic and integration services
  - TextChunkers/: Text splitting utilities
  - Settings/: Configuration classes
Setup
- Clone the repository
    git clone https://github.com/marcominerva/SqlDatabaseVectorSearch.git
- Configure the database and OpenAI settings
  - Edit SqlDatabaseVectorSearch/appsettings.json and set your Azure SQL connection string and OpenAI settings.
  - Important: The ModelId values for both ChatCompletion and Embedding are used for token counting via Microsoft.ML.Tokenizers. These values must be valid model identifiers supported by the tokenizer library (e.g., gpt-4o, gpt-4, gpt-3.5-turbo, text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002). The ModelId may differ from the actual deployment name you're using in Azure OpenAI. For example, for gpt-4.1 and gpt-5 models, set the ModelId to gpt-4o for proper token counting.
  - If using embedding models with shortening (e.g., text-embedding-3-small or text-embedding-3-large), set the Dimensions property accordingly. For text-embedding-3-large, you must specify a value <= 1998, because its default output size exceeds the maximum number of dimensions the VECTOR type can store.
  - If you change the VECTOR size, update both the ApplicationDbContext and the Initial Migration.
- Run the application
    dotnet run --project SqlDatabaseVectorSearch/SqlDatabaseVectorSearch.csproj
- Access the Web App
  - Navigate to https://localhost:5001 (or the port shown in the console)
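The two configuration rules in the Setup steps (which ModelId to use for token counting, and the upper bound on Dimensions) can be sketched as a small pre-flight check. This is only an illustration and not part of the project: the helper names are hypothetical, the list of supported identifiers is limited to the examples given in the Setup steps, and matching newer models by name prefix is an assumption.

```python
# Upper bound on Dimensions stated in the setup notes.
MAX_VECTOR_DIMENSIONS = 1998

# Model identifiers listed in the setup notes as valid for token counting.
# The list there is introduced with "e.g.", so it is not exhaustive.
TOKENIZER_MODEL_IDS = {
    "gpt-4o",
    "gpt-4",
    "gpt-3.5-turbo",
    "text-embedding-3-small",
    "text-embedding-3-large",
    "text-embedding-ada-002",
}

# Newer model families that the setup notes map to gpt-4o.
# Prefix matching (e.g. "gpt-5-mini") is an assumption of this sketch.
TOKENIZER_FALLBACKS = {"gpt-4.1": "gpt-4o", "gpt-5": "gpt-4o"}


def tokenizer_model_id(model_name):
    """Return the ModelId to configure for token counting."""
    if model_name in TOKENIZER_MODEL_IDS:
        return model_name
    for prefix, fallback in TOKENIZER_FALLBACKS.items():
        if model_name.startswith(prefix):
            return fallback
    raise ValueError(f"no known tokenizer ModelId for {model_name!r}")


def validate_dimensions(dimensions):
    """Return dimensions unchanged if it fits the VECTOR column, else raise."""
    if not 1 <= dimensions <= MAX_VECTOR_DIMENSIONS:
        raise ValueError(
            f"Dimensions must be between 1 and {MAX_VECTOR_DIMENSIONS}, got {dimensions}"
        )
    return dimensions
```

For example, `tokenizer_model_id("gpt-4.1")` returns `"gpt-4o"`, and `validate_dimensions(3072)` raises because the value does not fit.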
Supported Features
- Conversation History with Question Reformulation: The application keeps track of each conversation and reformulates questions for clarity before answering them. Both the original and the reformulated question are returned with the response.
- Information about Token Usage: Each response reports the tokens consumed for question reformulation, for the embedding, and for answering the question, so users can monitor their consumption.
- Response Streaming: Answers can be streamed in real time, so users receive text as it is generated instead of waiting for the complete answer.
- Citations: The application provides citations for the sources used to justify each answer, so users can verify the information and trace it back to the original document.
How to Use
- Web App: Use the Blazor interface to upload documents, search, and chat with RAG.
- API: Import documents via POST /api/documents and ask questions via POST /api/ask or POST /api/ask-streaming.
Example API Request
POST /api/ask
Content-Type: application/json
{
"conversationId": "3d0bd178-499d-433a-b2bc-c35e488d9e2c",
"text": "Why is Mars called the red planet?"
}
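The request above can also be sent from a script. The following is a minimal sketch using only the Python standard library. The function names and the `streaming` flag are hypothetical. The endpoint paths and body fields are the ones shown in this README.

```python
import json
import urllib.request


def build_ask_request(base_url, conversation_id, text, streaming=False):
    """Build a POST request for /api/ask (or /api/ask-streaming)."""
    path = "/api/ask-streaming" if streaming else "/api/ask"
    body = json.dumps({"conversationId": conversation_id, "text": text})
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(base_url, conversation_id, text):
    """Send the question and return the parsed JSON response."""
    request = build_ask_request(base_url, conversation_id, text)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `ask("https://localhost:5001", "3d0bd178-499d-433a-b2bc-c35e488d9e2c", "Why is Mars called the red planet?")` needs the application to be running. On a local machine, the HTTPS request may fail certificate verification unless the development certificate is trusted.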
Example API Response
{
"originalQuestion": "why is mars called the red planet?",
"reformulatedQuestion": "Why is the planet Mars called the red planet?",
"answer": "Mars is called the Red Planet because its surface has an orange-red color due to being covered in iron(III) oxide dust, also known as rust. This iron oxide gives Mars its distinctive reddish appearance when observed from Earth and is the origin of its well-known nickname",
"streamState": "End",
"tokenUsage": {
"reformulation": {
"promptTokens": 812,
"completionTokens": 11,
"totalTokens": 823
},
"embeddingTokenCount": 10,
"question": {
"promptTokens": 31708,
"completionTokens": 227,
"totalTokens": 31935
}
},
"citations": [
{
"documentId": "b1870ad7-4685-42a3-576a-08ddb01159d5",
"chunkId": "749aba1e-0db5-4033-cfa6-08ddb0115da3",
"fileName": "Mars.pdf",
"quote": "surface of Mars is orange-red because it is covered in iron(III) oxide",
"pageNumber": 1,
"indexOnPage": 0
},
{
"documentId": "b1870ad7-4685-42a3-576a-08ddb01159d5",
"chunkId": "215e7197-513f-4fbe-cfa8-08ddb0115da3",
"fileName": "Mars.pdf",
"quote": "Martian surface is caused by ferric oxide, or rust",
"pageNumber": 3,
"indexOnPage": 0
}
]
}
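A client will typically want the total token count and a readable list of sources from a response like the one above. The sketch below shows one way to extract them. The helper names are hypothetical, and `SAMPLE` is an abbreviated copy of the example response.

```python
def total_tokens(token_usage):
    """Add up reformulation, embedding and question tokens; null parts count as 0."""
    total = 0
    for key in ("reformulation", "question"):
        part = token_usage.get(key)
        if part:
            total += part["totalTokens"]
    total += token_usage.get("embeddingTokenCount") or 0
    return total


def format_citations(citations):
    """Render each citation as: file name, page number and quoted text."""
    return [
        f'{c["fileName"]}, page {c["pageNumber"]}: "{c["quote"]}"'
        for c in citations or []
    ]


# Abbreviated copy of the example response (only the fields used here).
SAMPLE = {
    "tokenUsage": {
        "reformulation": {"promptTokens": 812, "completionTokens": 11, "totalTokens": 823},
        "embeddingTokenCount": 10,
        "question": {"promptTokens": 31708, "completionTokens": 227, "totalTokens": 31935},
    },
    "citations": [
        {
            "fileName": "Mars.pdf",
            "quote": "surface of Mars is orange-red because it is covered in iron(III) oxide",
            "pageNumber": 1,
        },
        {
            "fileName": "Mars.pdf",
            "quote": "Martian surface is caused by ferric oxide, or rust",
            "pageNumber": 3,
        },
    ],
}

print(total_tokens(SAMPLE["tokenUsage"]))  # 823 + 10 + 31935 = 32768
for line in format_citations(SAMPLE["citations"]):
    print(line)
```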
How response streaming works
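When using the /api/ask-streaming endpoint, the answer is streamed incrementally, as in a typical streaming response from OpenAI. The response is a JSON array of chunks, each with a streamState. The first chunk ("Start") carries the original and reformulated questions, together with the token usage for reformulation and embedding. The following chunks ("Append") each carry a fragment of the answer. The last chunk ("End") carries the token usage for the question and the citations. The format of the response is as follows: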
[
{
"originalQuestion": "why is mars called the red planet?",
"reformulatedQuestion": "Why is the planet Mars known as the red planet?",
"answer": null,
"streamState": "Start",
"tokenUsage": {
"reformulation": {
"promptTokens": 541,
"completionTokens": 12,
"totalTokens": 553
},
"embeddingTokenCount": 11,
"question": null
},
"citations": null
},
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": "Mars",
"streamState": "Append",
"tokenUsage": null,
"citations": null
},
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": " is",
"streamState": "Append",
"tokenUsage": null,
"citations": null
},
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": " known",
"streamState": "Append",
"tokenUsage": null,
"citations": null
},
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": " as",
"streamState": "Append",
"tokenUsage": null,
"citations": null
},
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": " the",
"streamState": "Append",
"tokenUsage": null,
"citations": null
},
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": " red",
"streamState": "Append",
"tokenUsage": null,
"citations": null
},
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": " planet",
"streamState": "Append",
"tokenUsage": null,
"citations": null
},
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": " because",
"streamState": "Append",
"tokenUsage": null,
"citations": null
},
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": " its",
"streamState": "Append",
"tokenUsage": null,
"citations": null
},
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": " surface",
"streamState": "Append",
"tokenUsage": null,
"citations": null
},
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": " is",
"streamState": "Append",
"tokenUsage": null,
"citations": null
},
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": " covered",
"streamState": "Append",
"tokenUsage": null,
"citations": null
},
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": " in",
"streamState": "Append",
"tokenUsage": null,
"citations": null
},
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": " iron",
"streamState": "Append",
"tokenUsage": null,
"citations": null
},
/// ...
{
"originalQuestion": null,
"reformulatedQuestion": null,
"answer": null,
"streamState": "End",
"tokenUsage": {
"reformulation": null,
"embeddingTokenCount": null,
"question": {
"promptTokens": 30949,
"completionTokens": 221,
"totalTokens": 31170
}
},
"citations": [
{
"documentId": "b1870ad7-4685-42a3-576a-08ddb01159d5",
"chunkId": "
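On the client side, the chunk sequence shown in the streaming example can be folded back into a single result. The sketch below assumes the chunks have already been parsed into dictionaries. `assemble_stream` is a hypothetical helper, not part of the project, and it relies only on the Start / Append / End states and fields visible in the streaming example.

```python
def assemble_stream(chunks):
    """Combine streamed chunks into one response-like dictionary."""
    result = {
        "originalQuestion": None,
        "reformulatedQuestion": None,
        "answer": "",
        "tokenUsage": {},
        "citations": [],
    }
    for chunk in chunks:
        state = chunk.get("streamState")
        if state == "Start":
            # The first chunk carries the questions.
            result["originalQuestion"] = chunk.get("originalQuestion")
            result["reformulatedQuestion"] = chunk.get("reformulatedQuestion")
        elif state == "Append":
            # Each Append chunk carries the next fragment of the answer.
            result["answer"] += chunk.get("answer") or ""
        elif state == "End":
            # The last chunk carries the citations.
            result["citations"] = chunk.get("citations") or []
        # Token usage is split across Start and End; keep the non-null parts.
        for key, value in (chunk.get("tokenUsage") or {}).items():
            if value is not None:
                result["tokenUsage"][key] = value
    return result
```

A user interface would normally display each Append fragment as soon as it arrives and only use the assembled result once the End chunk has been received.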