ImageRAGPrompt

Specialized prompting utilities for Retrieval-Augmented Generation (RAG) with image content. The ImageRAGPrompt class uses Jinja templates to build a conversation memory from retrieved chunks and a user query, for use with vision-language models.

datapizza.modules.prompt.ImageRAGPrompt

Bases: Prompt

Create a memory for an image RAG system.

__init__

__init__(
    user_prompt_template,
    image_prompt_presentation,
    each_image_template,
)

Parameters:

Name	Type	Description	Default
`user_prompt_template`	`str`	Jinja template for the user prompt.	required
`image_prompt_presentation`	`str`	Jinja template for the image prompt.	required
`each_image_template`	`str`	Jinja template applied to each image.	required

format

format(chunks, user_query, retrieval_query, memory=None)

Creates a new memory object that includes:

- Existing memory messages
- User's query
- Function call retrieval results
- Chunks retrieval results

Parameters:

Name	Type	Description	Default
`chunks`	`list[Chunk]`	The chunks to add to the memory.	required
`user_query`	`str`	The user's query.	required
`retrieval_query`	`str`	The query to search the vectorstore for.	required
`memory`	`Memory \| None`	The memory object to add the new messages to.	`None`

Returns:

Name	Type	Description
`memory`	`Memory`	A new memory object with the new messages.
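
To make the behaviour described above concrete, here is a minimal, self-contained sketch of how a `format`-style method could assemble a new memory without modifying the one passed in. It does not use datapizza: `SketchChunk`, `SketchMemory`, `format_sketch`, the role names, and the order in which turns are appended are all hypothetical stand-ins chosen for illustration, and image handling is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SketchChunk:
    """Hypothetical stand-in for a retrieved chunk (not datapizza's Chunk)."""
    text: str


@dataclass
class SketchMemory:
    """Hypothetical stand-in for Memory: an ordered list of (role, content) turns."""
    turns: list = field(default_factory=list)


def format_sketch(chunks, user_query, retrieval_query, memory=None):
    """Return a NEW SketchMemory; the memory passed in is left untouched."""
    # 1. Start from a copy of the existing messages, if any.
    existing = list(memory.turns) if memory is not None else []
    new_memory = SketchMemory(turns=existing)
    # 2. Append the user's query.
    new_memory.turns.append(("user", user_query))
    # 3. Record the retrieval call made for this query (assumed representation).
    new_memory.turns.append(("function_call", f"search({retrieval_query!r})"))
    # 4. Append each retrieved chunk as a result of that call.
    for chunk in chunks:
        new_memory.turns.append(("function_result", chunk.text))
    return new_memory


history = SketchMemory(turns=[("user", "Hi"), ("assistant", "Hello!")])
result = format_sketch(
    chunks=[SketchChunk("Q3 revenue chart"), SketchChunk("Q4 revenue chart")],
    user_query="What does this chart show?",
    retrieval_query="revenue chart",
    memory=history,
)
```

In this sketch `history` keeps its two original turns while `result` holds them plus the four new ones, which mirrors the "new memory object" wording in the Returns table. How the real class renders the three Jinja templates and attaches images is not covered here.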

Overview

from datapizza.modules.prompt.image_rag import ImageRAGPrompt

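# Initialize the image RAG prompt handler; the three arguments are placeholder variables holding Jinja template strings you define
image_rag = ImageRAGPrompt(
    user_prompt_template=user_prompt_template,
    image_prompt_presentation=image_prompt_presentation,
    each_image_template=each_image_template,
)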

Features:

- Image-aware RAG prompt construction
- Multimodal content integration
- Context preservation for image-text interactions
- Optimized prompting for vision-language models

Usage Examples

Basic Image RAG Usage

from datapizza.modules.prompt.image_rag import ImageRAGPrompt

# Initialize the image RAG prompt with the three required Jinja templates
# (placeholder variables holding template strings you define)
image_rag = ImageRAGPrompt(
    user_prompt_template=user_prompt_template,
    image_prompt_presentation=image_prompt_presentation,
    each_image_template=each_image_template,
)

# Build a new memory with the documented format() method.
# `chunks` is a placeholder for the list[Chunk] returned by your retriever.
memory = image_rag.format(
    chunks=chunks,
    user_query="What does this chart show?",
    retrieval_query="What does this chart show?",
)