CREATE VECTOR INDEX (Transact-SQL)

2025-05-28

SQL Server 2025 (17.x) Preview

Create an approximate index on a vector column to improve performances of nearest neighbors search. To learn more about how vector indexing and vector search works, and the differences between exact and approximate search, refer to Vectors in the SQL Database Engine.

Preview feature

Note

This feature in preview and is subject to change. Make sure to read preview usage terms in Service Level Agreements (SLA) for Online Services.

This feature is in preview. In order to use this feature you must enable the following trace flags:

DBCC TRACEON(466, 474, 13981, -1)

Make sure to check out the current limitations before using it.

Syntax

Transact-SQL syntax conventions

CREATE VECTOR INDEX index_name
ON object ( vector_column )  
[ WITH (
    [,] METRIC = { 'cosine' | 'dot' | 'euclidean' }
    [ [,] TYPE = 'DiskANN' ]
    [ [,] MAXDOP = max_degree_of_parallelism ]
    [ [,] DROP_EXISTING = { ON | OFF } ]
) ]
[ ON { filegroup_name | "default" } ]
[;]

Arguments

index_name

The name of the index. Index names must be unique within a table but don't have to be unique within a database. Index names must follow the rules of identifiers.

object

Table on which the index is created. It must be a base table. Views, temporary tables, both local and global, aren't supported.

vector_column

Column to use to create the vector index. It must be of vector type.

METRIC = { 'cosine' | 'dot' | 'euclidean' }

A string with the name of the distance metric to use to calculate the distance between the two given vectors. The following distance metrics are supported:

cosine - Cosine distance
euclidean - Euclidean distance
dot - (Negative) Dot product

TYPE = 'DiskANN'

The type of ANN algorithm used to build the index. Only DiskANN is currently supported. DiskANN is the default value.

MAXDOP = max_degree_of_parallelism

Overrides the max degree of parallelism configuration option for the index operation. For more information, see Max degree of parallelism. Use MAXDOP to limit the degree of parallelism and the resulting resource consumption for an index build operation.

max_degree_of_parallelism can be:

1

Suppresses parallel plan generation.
>1

Restricts the maximum degree of parallelism used in a parallel index operation to the specified number or less based on the current system workload.
0 (default)

Uses the degree of parallelism specified at the server, database, or workload group level, unless reduced based on the current system workload.

For more information, see Configure Parallel Index Operations.

Note

Parallel index operations aren't available in every edition of Microsoft SQL Server. For a list of features that are supported by the editions of SQL Server, see Editions and supported features of SQL Server 2022 or Editions and supported features of SQL Server 2025 Preview.

DROP_EXISTING = { ON | OFF }

Is an option to drop and rebuild the existing vector index with modified specifications, and keep the same name for the index. The default is OFF.

ON

Specifies to drop and rebuild the existing index, which must have the same name as the parameter index_name.
OFF

Specifies not to drop and rebuild the existing index. SQL Server displays an error if the specified index name already exists.

Limitations

The current preview has the following limitations:

No partition support. Vector index can't be partitioned.
The table must have a single integer non-nullable column clustered index.
During and for all the time needed for vector index creation to complete, an SCH-M lock is acquired on the table. As a result, the lock prevents any access to the table or its metadata.
Once a vector index is created on a table, the table becomes read-only. No data modification is allowed while the vector index is present on the table.
Vector index are not replicated to subscribers.

Permissions

The user must have ALTER permission on the table.

Examples

Details of the database used in the sample can be found here: Download and import the Wikipedia Article with Vector Embeddings.

Examples assume the existence of a table named wikipedia_articles with a column title_vector of type vector that stores title's embeddings of Wikipedia articles. title_vector is assumed to be an embedding generated with an embedding model like text-embedding-ada-002 or text-embedding-3-small , which returns vectors with 1,536 dimensions.

For more examples, including end-to-end solutions, go to the Azure SQL Database Vector Search Samples GitHub repo.

Example 1

The following example creates a vector index on the title_vector column using the cosine metric.

CREATE VECTOR INDEX vec_idx ON [dbo].[wikipedia_articles]([title_vector]) 
WITH (METRIC = 'cosine', TYPE = 'diskann');

Example 2

The following example creates a vector index on the title_vector column using the (negative) dot product metric, limiting the parallelism to 8 and storing the vector in the SECONDARY filegroup.

CREATE VECTOR INDEX vec_idx ON [dbo].[wikipedia_articles]([title_vector]) 
WITH (METRIC = 'cosine', TYPE = 'diskann', MAXDOP = 8)
ON [SECONDARY]

Example 3

A basic end-to-end example using CREATE VECTOR INDEX and the related VECTOR_SEARCH function. The embeddings are mocked. In a real world scenario, embeddings are generated using an embedding model and AI_GENERATE_EMBEDDINGS, or an external library such as OpenAI SDK.

The following code block creates mock embeddings with the following steps:

Enables the trace flag, necessary in the current preview.
Create a sample table dbo.Articles with a column embedding with data type vector(5).
Insert sample data with mock embedding data.
Create a vector index on dbo.Articles.embedding.
Demonstrate the vector similarity search with the VECTOR_SEARCH() function.

-- Step 0: Enable Preview Feature
DBCC TRACEON(466, 474, 13981, -1);
GO

-- Step 1: Create a sample table with a VECTOR(5) column
CREATE TABLE dbo.Articles 
(
    id INT PRIMARY KEY,
    title NVARCHAR(100),
    content NVARCHAR(MAX),
    embedding VECTOR(5) -- mocked embeddings
);

-- Step 2: Insert sample data
INSERT INTO Articles (id, title, content, embedding)
VALUES
(1, 'Intro to AI', 'This article introduces AI concepts.', '[0.1, 0.2, 0.3, 0.4, 0.5]'),
(2, 'Deep Learning', 'Deep learning is a subset of ML.', '[0.2, 0.1, 0.4, 0.3, 0.6]'),
(3, 'Neural Networks', 'Neural networks are powerful models.', '[0.3, 0.3, 0.2, 0.5, 0.1]'),
(4, 'Machine Learning Basics', 'ML basics for beginners.', '[0.4, 0.5, 0.1, 0.2, 0.3]'),
(5, 'Advanced AI', 'Exploring advanced AI techniques.', '[0.5, 0.4, 0.6, 0.1, 0.2]');

-- Step 3: Create a vector index on the embedding column
CREATE VECTOR INDEX vec_idx ON Articles(embedding)
WITH (metric = 'cosine', type = 'diskann');

-- Step 4: Perform a vector similarity search
DECLARE @qv VECTOR(5) = '[0.3, 0.3, 0.3, 0.3, 0.3]';
SELECT
    t.id,
    t.title,
    t.content,
    s.distance
FROM
    VECTOR_SEARCH(
        table = Articles AS t,
        column = embedding,
        similar_to = @qv,
        metric = 'cosine',
        top_n = 3
    ) AS s
ORDER BY s.distance, t.title;

Share via