
QdrantVectorstore

pip install datapizza-ai-vectorstores-qdrant

datapizza.vectorstores.qdrant.QdrantVectorstore

Bases: Vectorstore

datapizza-ai implementation of a Qdrant vectorstore.

__init__

__init__(host=None, port=6333, api_key=None, **kwargs)

Initialize the QdrantVectorstore.

Parameters:

Name Type Description Default
host str

The host to use for the Qdrant client. Defaults to None.

None
port int

The port to use for the Qdrant client. Defaults to 6333.

6333
api_key str

The API key to use for the Qdrant client. Defaults to None.

None
**kwargs

Additional keyword arguments to pass to the Qdrant client.

{}

a_search

a_search(
    collection_name,
    query_vector,
    k=10,
    vector_name=None,
    **kwargs,
)

Search for chunks in a collection by their query vector.
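If `a_search` is a coroutine, as its `a_` prefix suggests (an assumption; the reference above does not say so explicitly), several queries could be issued concurrently with `asyncio.gather`. The helper name `search_many` is hypothetical and not part of the library; this is a sketch only.

```python
import asyncio


async def search_many(vectorstore, collection_name, query_vectors, k=10):
    """Run one a_search call per query vector concurrently.

    Assumes vectorstore.a_search is awaitable and takes the documented
    arguments; search_many itself is a hypothetical helper.
    """
    tasks = [
        vectorstore.a_search(
            collection_name=collection_name,
            query_vector=vector,
            k=k,
        )
        for vector in query_vectors
    ]
    # gather preserves input order: results[i] belongs to query_vectors[i]
    return await asyncio.gather(*tasks)
```

From synchronous code this could be driven with `asyncio.run(search_many(vectorstore, "documents", vectors))`.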

add

add(chunk, collection_name=None)

Add a single chunk or list of chunks to the vectorstore.

Parameters:

Name Type Description Default
chunk Chunk | list[Chunk]

The chunk or list of chunks to add.

required
collection_name str

The name of the collection to add the chunks to. Defaults to None.

None
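Because `add` accepts a list of chunks, a large ingest can be split into fixed-size batches. The sketch below relies only on the `add(chunk, collection_name=None)` signature shown above; `add_in_batches` and its `batch_size` default are hypothetical, chosen for illustration.

```python
def add_in_batches(vectorstore, chunks, collection_name, batch_size=64):
    """Add chunks in slices of batch_size and return how many were sent.

    Hypothetical helper: only vectorstore.add() comes from the library.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    added = 0
    for start in range(0, len(chunks), batch_size):
        batch = chunks[start:start + batch_size]
        vectorstore.add(batch, collection_name=collection_name)
        added += len(batch)
    return added
```

Smaller batches keep each request light at the cost of more round trips; the right size depends on vector dimensions and payload size.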

create_collection

create_collection(collection_name, vector_config, **kwargs)

Create a new collection in Qdrant with the specified vector configurations, if it does not already exist.

Parameters:

Name Type Description Default
collection_name str

Name of the collection to create

required
vector_config list[VectorConfig]

List of vector configurations specifying dimensions and distance metrics

required
**kwargs

Additional arguments to pass to Qdrant's create_collection

{}

delete_collection

delete_collection(collection_name, **kwargs)

Delete a collection in Qdrant.

dump_collection

dump_collection(
    collection_name, page_size=100, with_vectors=False
)

Dump all points from a collection, fetching them in batches of page_size and yielding each one as a chunk.

Parameters:

Name Type Description Default
collection_name str

Name of the collection to dump.

required
page_size int

Number of points to retrieve per batch.

100
with_vectors bool

Whether to include vectors in the dumped chunks.

False

Yields:

Name Type Description
Chunk Chunk

A chunk object from the collection.
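Since `dump_collection` yields chunks one at a time, a collection can be streamed to disk without holding it all in memory. A minimal sketch, assuming each yielded chunk exposes `id`, `text`, and `metadata` attributes (the fields used when constructing `Chunk` in the Examples section) and that the metadata is JSON-serializable; `dump_to_jsonl` is a hypothetical helper name.

```python
import json


def dump_to_jsonl(vectorstore, collection_name, path, page_size=100):
    """Write every chunk of a collection to a JSON Lines file.

    Hypothetical helper: assumes chunks have id, text and metadata
    attributes and that metadata can be serialized to JSON.
    """
    count = 0
    with open(path, "w", encoding="utf-8") as handle:
        for chunk in vectorstore.dump_collection(
            collection_name, page_size=page_size
        ):
            record = {
                "id": chunk.id,
                "text": chunk.text,
                "metadata": chunk.metadata,
            }
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count
```

Vectors are left out here, matching the `with_vectors=False` default.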

get_collections

get_collections()

Get all collections in Qdrant.

remove

remove(collection_name, ids, **kwargs)

Remove chunks from a collection by their IDs.

Parameters:

Name Type Description Default
collection_name str

The name of the collection to remove the chunks from.

required
ids list[str]

The IDs of the chunks to remove.

required
**kwargs

Additional keyword arguments to pass to the Qdrant client.

{}

retrieve

retrieve(collection_name, ids, **kwargs)

Retrieve chunks from a collection by their IDs.

Parameters:

Name Type Description Default
collection_name str

The name of the collection to retrieve the chunks from.

required
ids list[str]

The IDs of the chunks to retrieve.

required
**kwargs

Additional keyword arguments to pass to the Qdrant client.

{}

Returns:

Type Description
list[Chunk]

The list of chunks retrieved from the collection.
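`retrieve` and `remove` take the same `collection_name` and `ids` arguments, so they can be combined to delete only the IDs that are actually stored and report which ones were found. This sketch assumes `retrieve` skips unknown IDs rather than raising, and that returned chunks carry an `id` attribute; `remove_existing` is a hypothetical helper.

```python
def remove_existing(vectorstore, collection_name, ids):
    """Remove the chunks that exist and return their IDs.

    Hypothetical helper: assumes retrieve() omits IDs that are not in
    the collection and that each returned chunk has an id attribute.
    """
    found = vectorstore.retrieve(collection_name, ids)
    found_ids = [chunk.id for chunk in found]
    if found_ids:
        vectorstore.remove(collection_name, found_ids)
    return found_ids
```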

search

search(
    collection_name,
    query_vector,
    k=10,
    vector_name=None,
    **kwargs,
)

Search for chunks in a collection by their query vector.

Parameters:

Name Type Description Default
collection_name str

The name of the collection to search in.

required
query_vector list[float]

The query vector to search for.

required
k int

The number of results to return. Defaults to 10.

10
vector_name str

The name of the vector to search for. Defaults to None.

None
**kwargs

Additional keyword arguments to pass to the Qdrant client.

{}

Returns:

Type Description
list[Chunk]

The list of chunks found in the collection.
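The returned chunks can be narrowed further on the client side, for example by metadata. The sketch below uses only the documented `search` arguments and the `metadata` attribute shown in the Examples section; `search_with_metadata` and its `required` parameter are hypothetical. The filter runs after the top-`k` lookup, so it may return fewer than `k` chunks.

```python
def search_with_metadata(vectorstore, collection_name, query_vector,
                         required, k=10):
    """Search, then keep chunks whose metadata matches every required pair.

    Hypothetical helper: filtering happens in Python after the search,
    not inside Qdrant.
    """
    results = vectorstore.search(
        collection_name=collection_name,
        query_vector=query_vector,
        k=k,
    )
    return [
        chunk
        for chunk in results
        if all(
            (chunk.metadata or {}).get(key) == value
            for key, value in required.items()
        )
    ]
```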

Usage

from datapizza.vectorstores.qdrant import QdrantVectorstore

# Connect to Qdrant server
vectorstore = QdrantVectorstore(
    host="localhost",
    port=6333,
    api_key="your-api-key"  # Optional
)

# Or use in-memory/file storage
vectorstore = QdrantVectorstore(
    location=":memory:"  # Or path to file
)
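Connection settings are often kept out of source code. As a sketch, the constructor arguments shown above could be assembled from environment variables; the variable names `QDRANT_HOST`, `QDRANT_PORT`, and `QDRANT_API_KEY` and the helper `qdrant_settings_from_env` are assumptions made for this example, not conventions defined by the library.

```python
import os


def qdrant_settings_from_env(env=None):
    """Build keyword arguments for QdrantVectorstore from the environment.

    Hypothetical helper; the environment variable names are illustrative.
    """
    env = os.environ if env is None else env
    settings = {
        "host": env.get("QDRANT_HOST", "localhost"),
        "port": int(env.get("QDRANT_PORT", "6333")),
    }
    api_key = env.get("QDRANT_API_KEY")
    if api_key:
        # Only pass api_key when one is configured
        settings["api_key"] = api_key
    return settings
```

The result can then be unpacked into the constructor, as in `QdrantVectorstore(**qdrant_settings_from_env())`.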

Features

  • Connect to Qdrant server or use local storage
  • Support for both dense and sparse embeddings
  • Named vector configurations for multi-vector collections
  • Batch operations for efficient processing
  • Collection management (create, delete, list)
  • Chunk-based operations with metadata preservation
  • Async support for all operations
  • Point-level operations (add, update, remove, retrieve)

Examples

Basic Setup and Collection Creation

from datapizza.core.vectorstore import Distance, VectorConfig
from datapizza.type import EmbeddingFormat
from datapizza.vectorstores.qdrant import QdrantVectorstore

vectorstore = QdrantVectorstore(location=":memory:")

# Create collection with vector configuration
vector_config = [
    VectorConfig(
        name="text_embeddings",
        dimensions=3,
        format=EmbeddingFormat.DENSE,
        distance=Distance.COSINE
    )
]

vectorstore.create_collection(
    collection_name="documents",
    vector_config=vector_config
)

# Add chunks and search

import uuid
from datapizza.type import Chunk, DenseEmbedding
from datapizza.vectorstores.qdrant import QdrantVectorstore

# Create chunks with embeddings
chunks = [
    Chunk(
        id=str(uuid.uuid4()),
        text="First document content",
        metadata={"source": "doc1.txt"},
        embeddings=[DenseEmbedding(name="text_embeddings", vector=[0.1, 0.2, 0.3])]
    ),
    Chunk(
        id=str(uuid.uuid4()),
        text="Second document content",
        metadata={"source": "doc2.txt"},
        embeddings=[DenseEmbedding(name="text_embeddings", vector=[0.4, 0.5, 0.6])]
    )
]

# Add chunks to collection
vectorstore.add(chunks, collection_name="documents")

# Search for similar chunks
query_vector = [0.1, 0.2, 0.3]
results = vectorstore.search(
    collection_name="documents",
    query_vector=query_vector,
    k=5
)

for chunk in results:
    print(f"Text: {chunk.text}")
    print(f"Metadata: {chunk.metadata}")
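To make the `Distance.COSINE` setting concrete, the snippet below computes cosine similarity in plain Python for the two example vectors and the query. It illustrates the metric only and is not how Qdrant computes scores internally; the `cosine_similarity` function is written for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


query = [0.1, 0.2, 0.3]
stored = {
    "doc1.txt": [0.1, 0.2, 0.3],
    "doc2.txt": [0.4, 0.5, 0.6],
}

# Rank the stored vectors from most to least similar to the query
ranking = sorted(
    stored,
    key=lambda name: cosine_similarity(query, stored[name]),
    reverse=True,
)
print(ranking)
```

The query is identical to the first vector, so that document has the maximum similarity of 1 and ranks first; the second vector points in a slightly different direction and scores lower.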