Partnerships


Unleashing Video Intelligence: A Tutorial for Integrating TwelveLabs Multimodal Embed API with Oracle Database 23ai

James Le, Manish Maheshwari, Danny Nicolopoulos, Alex Owen, Will Brennan


TwelveLabs Embed API's integration with Oracle Database 23ai marks a significant milestone in video understanding and search capabilities. By combining our advanced multimodal embeddings with Oracle's robust vector search functionality, we enable developers to build sophisticated video applications that process and understand video content with human-like perception.



Apr 3, 2025


15 Min



We'd like to give a huge thanks to Maxwell Bauman, Douglas Hood, Sean Stacey, Malu Castellanos, and other members from the Oracle team for working closely with us on the integration and the blog post!


The convergence of multimodal video understanding and enterprise-grade vector search capabilities represents a significant advancement in Generative AI applications. This tutorial explores the integration of TwelveLabs' cutting-edge Embed API with Oracle Database 23ai's AI Vector Search, providing developers with a powerful solution for storing, analyzing, and retrieving video content through semantic understanding.

By following this guide, you'll learn how to harness these complementary technologies to build sophisticated video search and analysis applications that capture the richness of visual, audio, and contextual elements while benefiting from a robust database infrastructure.


1 - Introduction


TwelveLabs Embed API

Our Embed API revolutionizes video understanding by transforming video data into sophisticated vector representations (i.e., embeddings). Unlike traditional approaches that focus on single modalities, our multimodal embedding technology seamlessly captures visual expressions, spoken words, body language, and contextual relationships. We treat video as a native medium with temporal dynamics, moving beyond simple frame-by-frame processing.

Our unified vector space technology consolidates multiple modalities into a cohesive representation that preserves the rich interactions between speech, visuals, and context. This enables nuanced search capabilities while significantly reducing processing time for large video libraries.

We also provide flexible video segmentation options, allowing you to create multiple embeddings from different video segments or a single embedding for an entire video. This enables precise retrieval of specific moments within videos.


Oracle 23ai Vector Search Features

Our partner Oracle incorporates AI Vector Search as a core feature of its converged database strategy with Oracle Database 23ai. Oracle’s approach provides native vector storage and similarity searches using specialized indexes and SQL functions. Unlike standalone vector databases that create silos, Oracle integrates vector functionality directly into the relational database, allowing seamless queries across structured and unstructured data.

The system supports multiple distance metrics (Cosine, Euclidean, Euclidean Squared, Dot Product, Manhattan, Hamming and Jaccard), giving flexibility in defining similarity based on use case needs. Optimized for scale, vector searches remain efficient even with millions of records, making it production-ready.
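For intuition, a few of these metrics can be sketched with their standard definitions in plain Python. This is purely illustrative; Oracle computes these natively through its vector_distance SQL function.

```python
import math

def dot(a, b):
    """Dot product: larger when vectors point the same way and are long."""
    return sum(x * y for x, y in zip(a, b))

def euclidean(a, b):
    """Straight-line distance between the two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Sum of per-dimension absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_distance(a, b):
    """1 - cosine similarity: 0 for identical directions, up to 2 for opposite."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    return 1 - dot(a, b) / (norm_a * norm_b)

a, b = [1.0, 0.0], [0.0, 1.0]
print(cosine_distance(a, b))  # 1.0 (orthogonal vectors)
```

Cosine distance ignores vector magnitude and compares only direction, which is why it is a common default for embedding search.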

By integrating vectors within Oracle's established infrastructure, developers can use familiar SQL syntax and existing security models with cutting-edge AI functionality. This eliminates the need for separate specialized systems, reducing overall data architecture complexity.


The Business Value of Our Integration

Our integration with Oracle Database 23ai delivers significant business value through a simplified architecture that unifies vector embeddings and relational data storage. This eliminates synchronization between separate systems, reducing inconsistencies and maintenance costs. It also reduces query complexity (no separate queries to the relational database and the vector store followed by merging of results) and lowers latency, since results come back from a single system.

  • Our solution enables semantic video search where users find content based on meaning rather than keywords, improving search accuracy and supporting cross-modal queries using text or visual content.

  • By leveraging Oracle's enterprise-grade reliability, security, and scalability, we provide a robust foundation for production deployments. Organizations can implement our video understanding technology with confidence using Oracle Database technology designed for mission-critical workloads, with vector operations remaining efficient as data volumes increase.

  • Our integration also enables advanced analytics for video content, allowing organizations to discover patterns and insights impossible to detect through manual review or traditional processing.

This tutorial covers practical implementation from infrastructure setup to building sophisticated video search applications, providing comprehensive knowledge to create intelligent video applications that deliver real business value.


2 - Create an Oracle Autonomous Database

Setting Up Oracle Cloud

Sign up for an Oracle Cloud Free Tier account at https://www.oracle.com/cloud/free/. You'll get $300 in free credits for 30 days plus Always Free services. Credit card required for verification only.

Create Database

  1. Log in to https://cloud.oracle.com/db/adb

  2. Navigate to Oracle Database → Autonomous Data Warehouse

  3. Click "Create Autonomous Database"

Configure Settings

  • Display name: "VideoEmbeddingsDB" (note that the separate database name field is limited to 14 characters)

  • Workload type: Data Warehouse

  • Deployment: Serverless

Essential Configurations

  • Database version: Select "23ai" for vector capabilities

  • Choose "Always Free" (2 OCPUs, 20GB storage)

Note: The database auto-stops after 7 days of inactivity, but it remains available for free indefinitely as long as you use it at least once within each 30-day period.

Set Admin Password

Create a strong password (12-30 chars, uppercase, lowercase, number, special character) for the ADMIN user.
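As an illustration, the stated policy can be checked with a few lines of Python. This mirrors only the rules listed above, not Oracle's exact validator, which may impose additional restrictions.

```python
import re

def meets_admin_policy(pw):
    """Check the stated ADMIN password rules: 12-30 chars, with at least one
    uppercase letter, one lowercase letter, one digit, and one special character."""
    return (12 <= len(pw) <= 30
            and bool(re.search(r"[A-Z]", pw))
            and bool(re.search(r"[a-z]", pw))
            and bool(re.search(r"\d", pw))
            and bool(re.search(r"[^A-Za-z0-9]", pw)))

print(meets_admin_policy("Str0ngPass!word"))  # True
print(meets_admin_policy("weakpassword"))     # False (no uppercase, digit, or special)
```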

Network Access

For development: Select "Allow secure access from everywhere"

Create Database

Click "Create Autonomous Database". Provisioning takes 2-5 minutes. Wait for status "Available".

Get Connection Wallet

  • Go to database details page

  • Click "Database Connection" → "Download Wallet"

  • Set wallet password

  • Save wallet ZIP file securely

  • Use ORACLE_DB_WALLET_PATH in your application

Your database is now ready for integrating with Twelve Labs Embed API. Next, we'll set up the required environment variables.


3 - Prerequisites and Environment Variables

Before integrating TwelveLabs Embed API with Oracle Database 23ai, ensure you have the necessary tools and credentials set up. This section covers the required software installation and environment configuration.


Required Software Installation

To successfully complete this tutorial, you'll need:

  • Oracle Database 23.7 or later with AI Vector Search capabilities (the tutorial scripts check for this minimum version)

  • Python 3.8+ installed on your development machine

  • Oracle Client libraries for database connectivity

  • TwelveLabs API key for accessing the Embed API

Install the required Python packages:

pip install oracledb twelvelabs


The oracledb package provides Python connectivity to Oracle Database, while the twelvelabs package offers a convenient interface to the Twelve Labs API services.


Environment Configuration

Set up the following environment variables to securely store your connection credentials:

export ORACLE_DB_USERNAME=your_username
export ORACLE_DB_PASSWORD=your_password
export ORACLE_DB_CONNECT_STRING=your_connect_string
export ORACLE_DB_WALLET_PATH=/path/to/wallet
export TWELVE_LABS_API_KEY=your_api_key

Replace the placeholder values with your actual credentials:

  • your_username: The database username (typically ADMIN for a new Autonomous Database)

  • your_password: The password you created during database provisioning

  • your_connect_string: The service name from the tnsnames.ora file in your wallet

  • /path/to/wallet: The directory path where you extracted your Oracle wallet files

  • your_api_key: Your TwelveLabs API key obtained from the developer portal


4 - Set Up the Database Schema

In this section, we'll create the necessary database schema to store and query video embeddings from TwelveLabs. We'll use a Python script to establish the connection and create our table structure with an appropriate vector index.


Connecting to Oracle Database

The create_schema_video_embeddings.py script handles the connection to your Oracle Database instance using the environment variables you set in the previous section. Let's examine the key components of this script:

import os

import oracledb

# Read the connection details from the environment variables set earlier
db_username = os.environ["ORACLE_DB_USERNAME"]
db_password = os.environ["ORACLE_DB_PASSWORD"]
db_connect_string = os.environ["ORACLE_DB_CONNECT_STRING"]
db_wallet_path = os.environ["ORACLE_DB_WALLET_PATH"]

# Connect to Oracle Database 23ai
with oracledb.connect(
    user=db_username,
    password=db_password,
    dsn=db_connect_string,
    config_dir=db_wallet_path,
    wallet_location=db_wallet_path,
    wallet_password=db_password
) as connection:
    ...  # schema and index creation steps go here


Creating Tables for Video Embeddings

The script creates a video_embeddings table with the following structure:

CREATE TABLE video_embeddings (
    id VARCHAR2(100) PRIMARY KEY,
    video_file VARCHAR2(1000),
    start_time NUMBER,
    end_time NUMBER,
    embedding_vector VECTOR(1024, float64)
)

This table includes:

  • id: A unique identifier for each embedding

  • video_file: The source video filename or path

  • start_time and end_time: Timestamp markers for the video segment

  • embedding_vector: A 1024-dimensional vector using float64 precision to store the Twelve Labs embedding


Setting Up Vector Indexes

Once table creation is complete, we call create_vector_index to create a vector index for similarity search. We use cosine similarity (DISTANCE COSINE) with a target accuracy of 95%.

def create_vector_index(cursor):
    cursor.execute("""
        CREATE VECTOR INDEX video_embeddings_idx 
        ON video_embeddings(embedding_vector) 
        ORGANIZATION NEIGHBOR PARTITIONS
        DISTANCE COSINE
        WITH TARGET ACCURACY 95
    """)

Running the Schema Creation Script

Execute the schema creation script:

python3 create_schema_video_embeddings.py

The script will:

  1. Connect to your Oracle database

  2. Drop the existing table if it exists

  3. Create a new video_embeddings table

  4. Create the specified vector index

  5. Confirm successful creation with console output

Once the schema setup is complete, we have the foundation for storing and querying TwelveLabs video embeddings in our Oracle Database 23ai instance.


5 - Store Video Embeddings

After setting up the database schema, the next step is to process videos through TwelveLabs Embed API and store the resulting embeddings in Oracle Database. The store_video_embeddings.py script handles this process entirely, managing both embedding generation and database storage.


Understanding the Script Workflow

The script performs several key operations:

  • Connects to Oracle Database using your environment variables

  • Initializes the TwelveLabs client with your API key

  • Creates embeddings for videos using TwelveLabs' Marengo model

  • Stores the embeddings in your database table

  • Maintains a record of processed videos to avoid redundant processing


Key Functions

Creating Embeddings

def create_video_embeddings(client, video_file):
    """Create embeddings for a video file using Twelve Labs Marengo"""
    task = client.embed.task.create(
        model_name=EMBEDDING_MODEL,
        video_file=video_file,
        video_clip_length=SEGMENT_DURATION
    )
    print(f"Created task: id={task.id} model_name={EMBEDDING_MODEL} status={task.status}")

    status = task.wait_for_done(
        sleep_interval=2,
        callback=on_task_update
    )
    print(f"Embedding done: {status}")
    
    return task.id

This function submits a video to TwelveLabs, which processes it using the Marengo 2.7 model. The video is segmented into 6-second clips (configurable via the SEGMENT_DURATION constant), and each segment receives its own embedding vector. The function returns a task ID that's used to retrieve the embeddings once processing completes.
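The resulting clip boundaries can be sketched as follows. This is a hypothetical illustration of how a video's duration maps to segment offsets, not TwelveLabs' internal logic.

```python
SEGMENT_DURATION = 6  # seconds, matching the tutorial's setting

def segment_boundaries(video_length_sec, clip_length=SEGMENT_DURATION):
    """Return (start, end) offsets for each clip; the last clip may be shorter."""
    segments = []
    start = 0
    while start < video_length_sec:
        end = min(start + clip_length, video_length_sec)
        segments.append((start, end))
        start = end
    return segments

print(segment_boundaries(20))  # [(0, 6), (6, 12), (12, 18), (18, 20)]
```

Each (start, end) pair corresponds to one embedding vector, which is what makes moment-level retrieval possible later.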


Storing Embeddings

def store_embeddings_in_db(connection, task_id, video_file):
    """Store video embeddings in Oracle DB"""
    # Retrieve the completed embeddings from the task
    task = twelvelabs_client.embed.task.retrieve(task_id)
    
    if not task.video_embedding or not task.video_embedding.segments:
        print("No embeddings found")
        return
    
    insert_sql = """
    INSERT INTO video_embeddings (
        id, video_file, start_time, end_time, embedding_vector
    ) VALUES (
        :1, :2, :3, :4, :5
    )"""
    
    BATCH_SIZE = 1000
    data_batch = []
    
    # Process in batches of 1000 for efficiency
    with connection.cursor() as cursor:
        for idx, segment in enumerate(task.video_embedding.segments):
            id = f"{task_id}_{idx}"
            vector = array.array("d", segment.embeddings_float)  # float64, matching the VECTOR(1024, float64) column
            
            data_batch.append([
                id,
                video_file,
                segment.start_offset_sec,
                segment.end_offset_sec,
                vector
            ])
            
            # Execute and commit every BATCH_SIZE rows
            if len(data_batch) >= BATCH_SIZE:
                print("insert data")
                cursor.executemany(insert_sql, data_batch)
                connection.commit()
                data_batch = []
        
        # Insert any remaining rows
        if data_batch:
            print("insert data final")
            cursor.executemany(insert_sql, data_batch)
            connection.commit()

    print(f"Stored {len(task.video_embedding.segments)} embeddings in database")

This function retrieves the completed embeddings from TwelveLabs and stores them in Oracle. Each embedding is stored with metadata including:

  • A unique ID combining the task ID and segment index

  • The source video filename

  • Start and end timestamps for the video segment

  • The embedding vector itself (1024 dimensions)

The function processes embeddings in batches of 1000 for optimal database performance.
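The batching pattern above generalizes beyond this script. Here is a standalone sketch, with a hypothetical insert_rows callback standing in for cursor.executemany plus commit.

```python
BATCH_SIZE = 1000

def insert_in_batches(rows, insert_rows, batch_size=BATCH_SIZE):
    """Flush accumulated rows every batch_size records; returns the number of flushes."""
    batch, flushes = [], 0
    for row in rows:
        batch.append(row)
        if len(batch) >= batch_size:
            insert_rows(batch)  # e.g., executemany + commit
            flushes += 1
            batch = []
    if batch:  # insert any remaining rows
        insert_rows(batch)
        flushes += 1
    return flushes

sizes = []
insert_in_batches(range(2500), lambda b: sizes.append(len(b)), batch_size=1000)
print(sizes)  # [1000, 1000, 500]
```

Batching amortizes round trips to the database, which matters once a long video yields thousands of segment embeddings.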

Task ID Management

def load_task_ids():
    """Load existing task IDs from JSON file"""
    try:
        with open('video_task_ids.json', 'r') as f:
            return json.load(f)
    except FileNotFoundError:
        return {}

The script maintains a video_task_ids.json file that maps video paths to their TwelveLabs task IDs. This allows the script to skip reprocessing videos that have already been embedded, saving both time and API costs. If you need to re-embed a video, simply delete its entry from this file or delete the file entirely.
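For completeness, a save counterpart would look something like the following. The tutorial shows only the loader, so this save_task_ids helper is a hypothetical sketch; the actual script's version may differ.

```python
import json

def save_task_ids(task_ids, path="video_task_ids.json"):
    """Persist the video-path -> task-ID mapping so already-processed
    videos are skipped on the next run."""
    with open(path, "w") as f:
        json.dump(task_ids, f, indent=2)
```

Calling this after each successful embedding task keeps the cache current even if a long batch run is interrupted partway through.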


Running the Script

You can use the script in two ways:

For a single video file:

python3 store_video_embeddings.py /path/to/video.mp4

For a directory of videos:

python3 store_video_embeddings.py /path/to/video_directory

When processing a directory, the script automatically filters for common video file extensions (.mp4, .avi, .mov, .mkv, .webm) and processes each matching file.


Monitoring Progress

The script provides real-time updates on:

  • Connection status to Oracle Database

  • Task creation and status updates from TwelveLabs

  • Embedding storage progress, including batch insert confirmations

  • Completion summaries showing the number of embeddings stored

Once the script completes, your video embeddings are stored in the video_embeddings table and ready for similarity searches, which we'll explore in the next section.


6 - Query Video Embeddings

After storing your video embeddings in Oracle Database, the real power comes from being able to search through your video content using natural language. The query_video_embeddings.py script enables semantic searching across your video repository, finding moments that match the meaning of your queries rather than just keywords.


Basic Similarity Search

The core of our search functionality relies on the following SQL query within the script:

SELECT video_file, start_time, end_time
FROM video_embeddings
ORDER BY vector_distance(embedding_vector, :1, COSINE)
FETCH FIRST :2 ROWS ONLY

This query utilizes Oracle's native vector_distance function with the COSINE similarity metric to compare your query embedding against all stored video embeddings. The results are ordered by similarity and limited to the top 2 matches.


Key Functions Explained

The script contains several important functions that work together to provide efficient video search:

Text-to-Embedding Conversion

def similarity_search(connection, query_text):
    # Create embedding for query
    embedding = twelvelabs_client.embed.create(
        model_name=EMBEDDING_MODEL,
        text=query_text,
        text_truncate="start",
    )
    
    if len(embedding.text_embedding.segments) > 1:
        print(f"Warning: Query generated {len(embedding.text_embedding.segments)} segments. Using only the first segment.")
    
    query_vector = array.array("d", embedding.text_embedding.segments[0].embeddings_float)
    
    # Search query
    search_sql = """
    SELECT video_file, start_time, end_time
    FROM video_embeddings
    ORDER BY vector_distance(embedding_vector, :1, COSINE)
    FETCH FIRST :2 ROWS ONLY
    """
    
    results = []
    cursor = connection.cursor()
    cursor.execute(search_sql, [query_vector, TOP_K])
    for row in cursor:
        results.append({
            'video_file': row[0],
            'start_time': row[1],
            'end_time': row[2]
        })
    cursor.close()
    return results

This function sends your text query to Twelve Labs' API, which returns an embedding vector that captures the semantic meaning of your text. The script automatically handles truncation for longer queries by preserving the beginning of the text (using text_truncate="start").

Multi-Query Processing

def similarity_search_multiple(connection, query_texts, batch_size=1000):
    """Perform multiple similarity searches using a list of query texts in batches"""
    results_by_query = {}
    
    # Process queries in batches
    for i in range(0, len(query_texts), batch_size):
        batch_queries = query_texts[i:i + batch_size]
        print(f"\nProcessing batch {i//batch_size + 1} ({len(batch_queries)} queries)")
        
        # Create embeddings for batch queries
        embeddings = []
        for query_text in batch_queries:
            embedding = twelvelabs_client.embed.create(
                model_name=EMBEDDING_MODEL,
                text=query_text,
                text_truncate="start",
            )
            
            if len(embedding.text_embedding.segments) > 1:
                print(f"Warning: Query '{query_text}' generated {len(embedding.text_embedding.segments)} segments. Using only the first segment.")
            
            query_vector = array.array("d", embedding.text_embedding.segments[0].embeddings_float)
            embeddings.append(query_vector)
            
        # Search query
        search_sql = """
        SELECT video_file, start_time, end_time
        FROM video_embeddings
        ORDER BY vector_distance(embedding_vector, :1, COSINE)
        FETCH FIRST :2 ROWS ONLY
        """
        
        with connection.cursor() as cursor:
            for query_text, query_vector in zip(batch_queries, embeddings):
                results = []
                for row in cursor.execute(search_sql, [query_vector, TOP_K]):
                    results.append({
                        'video_file': row[0],
                        'start_time': row[1],
                        'end_time': row[2]
                    })
                results_by_query[query_text] = results
    
    return results_by_query

For efficiency when searching with multiple queries, this function processes them in batches, reducing the number of database connections and API calls. It maintains a dictionary mapping each query to its results for easy retrieval and display.

Database Connection Management

def query_video_embeddings(query_text):
    connection = oracledb.connect(
        user=db_username,
        password=db_password,
        dsn=db_connect_string,
        config_dir=db_wallet_path,
        wallet_location=db_wallet_path,
        wallet_password=db_password
    )
    
    # Verify DB version
    db_version = tuple(int(s) for s in connection.version.split("."))[:2]
    if db_version < (23, 7):
        sys.exit("This example requires Oracle Database 23.7 or later")
    print("Connected to Oracle Database")
    
    print("\nSearching for relevant video segments...")
    results = similarity_search(connection, query_text)
    
    print("\nResults:")
    print("========")
    for r in results:
        print(f"Video: {r['video_file']}")
        print(f"Segment: {r['start_time']:.1f}s to {r['end_time']:.1f}s\n")

The script handles all the database connection details using your environment variables, ensuring secure and efficient access to your Oracle Database instance.


Running the Script

To search for video segments matching your queries, run the script with one or more text queries as arguments:

python3 query_video_embeddings.py "people dancing at a party" "someone explaining AI concepts"

The script accepts any number of queries and processes them efficiently. Each query is:

  1. Converted to an embedding vector using Twelve Labs' Marengo-2.7 model

  2. Compared against all video segment embeddings using cosine similarity

  3. Matched with the top 2 most semantically similar video segments


Understanding the Results

The script outputs results in an easy-to-read format:

Connected to Oracle Database

Searching for relevant video segments...

Results:
========

Query: 'people dancing at a party'
----------------------------------
Video: birthday_celebration.mp4
Segment: 15.0s to 21.0s

Video: summer_festival.mp4
Segment: 45.5s to 51.5s

Query: 'someone explaining AI concepts'
--------------------------------------
Video: tech_lecture.mp4
Segment: 120.0s to 126.0s

Video: developer_conference.mp4
Segment: 75.5s to 81.5s

For each query, you'll see the most relevant video segments listed with their filenames and precise timestamps. This allows you to quickly locate and view the specific parts of videos that match your search criteria.

The cosine similarity metric ensures that results are based on semantic meaning rather than exact keyword matches. This means you can find relevant content even when the exact words in your query don't appear in the video's image frames, speech or captions.


7 - Conclusion

Our Embed API's integration with Oracle Database 23ai marks a significant milestone in video understanding and search capabilities. By combining our advanced multimodal embeddings with Oracle's robust vector search functionality, we enable developers to build sophisticated video applications that process and understand video content with human-like perception.


Business Value

Our solution delivers several key benefits for businesses:

  • Enhanced Video Search: Our semantic search capabilities enable users to find relevant video segments based on meaning rather than keywords, improving user engagement and content discovery.

  • Unified Infrastructure: Working with Oracle's converged database strategy, we help eliminate the need for separate specialized databases, reducing complexity and costs while enhancing scalability and security.

  • Innovation and Leadership: Through our cutting-edge AI technologies, organizations can differentiate themselves in the market and establish leadership in video content management and analysis.


Technical Advantages

Our integration provides developers with:

  • Streamlined Development: Access to our advanced video understanding capabilities through a unified platform for both relational and vector data, simplifying development workflows.

  • High-Performance Search: Our state-of-the-art embeddings, combined with Oracle's vector indexes, enable fast and accurate similarity searches across large video libraries.

  • Scalability and Reliability: Our video foundation models, trained on Oracle Cloud Infrastructure (OCI), ensure enterprise-grade reliability and scalability.


The "Better Together" Story

Our partnership with Oracle runs deep: we've leveraged Oracle Cloud Infrastructure to train our video foundation models, enabling us to develop models that understand videos the way humans do. By integrating our Embed API with Oracle Database 23ai, we're demonstrating how our complementary technologies create powerful solutions for our customers.

In conclusion, our Embed API's integration with Oracle Database 23ai represents a breakthrough for businesses and developers looking to unlock the full potential of video content. This partnership exemplifies how our advanced video understanding technology, combined with Oracle's enterprise-grade infrastructure, can transform industries.


Resources

We'd like to give a huge thanks to Maxwell Bauman, Douglas Hood, Sean Stacey, Malu Castellanos, and other members from the Oracle team for working closely with us on the integration and the blog post!


The convergence of multimodal video understanding and enterprise-grade vector search capabilities represents a significant advancement in Generative AI applications. This tutorial explores the integration of TwelveLabs' cutting-edge Embed API with Oracle Database 23ai's AI Vector Search, providing developers with a powerful solution for storing, analyzing, and retrieving video content through semantic understanding.

By following this guide, you'll learn how to harness these complementary technologies to build sophisticated video search and analysis applications that capture the richness of visual, audio, and contextual elements while benefiting from a robust database infrastructure.


1 - Introduction


TwelveLabs Embed API

Our Embed API revolutionizes video understanding by transforming video data into sophisticated vector representations (i.e., embeddings). Unlike traditional approaches that focus on single modalities, our multimodal embedding technology seamlessly captures visual expressions, spoken words, body language, and contextual relationships. We treat video as a native medium with temporal dynamics, moving beyond simple frame-by-frame processing.

Our unified vector space technology consolidates multiple modalities into a cohesive representation that preserves the rich interactions between speech, visuals, and context. This enables nuanced search capabilities while significantly reducing processing time for large video libraries.

We also provide flexible video segmentation options, allowing you to create multiple embeddings from different video segments or a single embedding for an entire video. This enables precise retrieval of specific moments within videos.


Oracle 23ai Vector Search Features

Our partner Oracle incorporates AI Vector Search as a core feature of its converged database strategy with Oracle Database 23ai. Oracle’s approach provides native vector storage and similarity searches using specialized indexes and SQL functions. Unlike standalone vector databases that create silos, Oracle integrates vector functionality directly into the relational database, allowing seamless queries across structured and unstructured data.

The system supports multiple distance metrics (Cosine, Euclidean, Euclidean Squared, Dot Product, Manhattan, Hamming and Jaccard), giving flexibility in defining similarity based on use case needs. Optimized for scale, vector searches remain efficient even with millions of records, making it production-ready.

By integrating vectors within Oracle's established infrastructure, developers can use familiar SQL syntax and existing security models with cutting-edge AI functionality. This eliminates the need for separate specialized systems, reducing overall data architecture complexity.


The Business Value of Our Integration

Our integration with Oracle Database 23ai delivers significant business value through simplified architecture that unifies vector embeddings and relational data storage. This eliminates synchronization between separate systems, reducing inconsistencies and maintenance costs. Additionally, it reduces query complexity (separate queries to the relational DB and the vector store and then merging the results), and latency as results come faster.

  • Our solution enables semantic video search where users find content based on meaning rather than keywords, improving search accuracy and supporting cross-modal queries using text or visual content.

  • By leveraging Oracle's enterprise-grade reliability, security, and scalability, we provide a robust foundation for production deployments. Organizations can implement our video understanding technology with confidence using Oracle Database technology designed for mission-critical workloads, with vector operations remaining efficient as data volumes increase.

  • Our integration also enables advanced analytics for video content, allowing organizations to discover patterns and insights impossible to detect through manual review or traditional processing.

This tutorial covers practical implementation from infrastructure setup to building sophisticated video search applications, providing comprehensive knowledge to create intelligent video applications that deliver real business value.


2 - Create an Oracle Autonomous Database

Setting Up Oracle Cloud

Sign up for an Oracle Cloud Free Tier account at https://www.oracle.com/cloud/free/. You'll get $300 in free credits for 30 days plus Always Free services. Credit card required for verification only.

Create Database

  1. Log in to https://cloud.oracle.com/db/adb

  2. Navigate to Oracle Database → Autonomous Data Warehouse

  3. Click "Create Autonomous Database"

Configure Settings

  • Display name: "VideoEmbeddingsDB" (14 chars max)

  • Workload type: Data Warehouse

  • Deployment: Serverless

Essential Configurations

  • Database version: Select "23ai" for vector capabilities

  • Choose "Always Free" (2 OCPUs, 20GB storage)

Note: Database auto-stops after 7 days of inactivity, but will exist forever (for free) if you continue to use it within a 30 day period.

Set Admin Password

Create a strong password (12-30 chars, uppercase, lowercase, number, special character) for the ADMIN user.

Network Access

For development: Select "Allow secure access from everywhere"

Create Database

Click "Create Autonomous Database". Provisioning takes 2-5 minutes. Wait for status "Available".

Get Connection Wallet

  • Go to database details page

  • Click "Database Connection" → "Download Wallet"

  • Set wallet password

  • Save wallet ZIP file securely

  • Use ORACLE_DB_WALLET_PATH in your application

Your database is now ready for integrating with Twelve Labs Embed API. Next, we'll set up the required environment variables.


3 - Prerequisites and Set Environment Variables

Before integrating TwelveLabs Embed API with Oracle Database 23ai, ensure you have the necessary tools and credentials set up. This section covers the required software installation and environment configuration.


Required Software Installation

To successfully complete this tutorial, you'll need:

  • Oracle Database 23.4 or later with AI Vector Search capabilities (the query script later in this tutorial checks for 23.7 or later)

  • Python 3.8+ installed on your development machine

  • Oracle Client libraries for database connectivity

  • TwelveLabs API key for accessing the Embed API

Install the required Python packages:

pip install oracledb twelvelabs


The oracledb package provides Python connectivity to Oracle Database, while the twelvelabs package offers a convenient interface to the Twelve Labs API services.


Environment Configuration

Set up the following environment variables to securely store your connection credentials:

export ORACLE_DB_USERNAME=your_username
export ORACLE_DB_PASSWORD=your_password
export ORACLE_DB_CONNECT_STRING=your_connect_string
export ORACLE_DB_WALLET_PATH=/path/to/wallet
export TWELVE_LABS_API_KEY=your_api_key

Replace the placeholder values with your actual credentials:

  • your_username: The database username (typically ADMIN for a new Autonomous Database)

  • your_password: The password you created during database provisioning

  • your_connect_string: The service name from the tnsnames.ora file in your wallet

  • /path/to/wallet: The directory path where you extracted your Oracle wallet files

  • your_api_key: Your TwelveLabs API key obtained from the developer portal
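Before running the scripts in the following sections, it is worth confirming the variables are actually set. A minimal illustrative check (the variable names match the exports above; the helper itself is not part of the tutorial scripts):

```python
import os

# Names must match the export statements above.
REQUIRED_VARS = [
    "ORACLE_DB_USERNAME",
    "ORACLE_DB_PASSWORD",
    "ORACLE_DB_CONNECT_STRING",
    "ORACLE_DB_WALLET_PATH",
    "TWELVE_LABS_API_KEY",
]

def missing_env_vars(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]

# Report anything missing up front instead of failing mid-script later.
for name in missing_env_vars():
    print(f"Warning: {name} is not set")
```

Running a check like this before the schema script surfaces configuration mistakes immediately rather than as opaque connection errors.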


4 - Setup Database Schema

In this section, we'll create the necessary database schema to store and query video embeddings from TwelveLabs. We'll use a Python script to establish the connection and create our table structure with an appropriate vector index.


Connecting to Oracle Database

The create_schema_video_embeddings.py script handles the connection to your Oracle Database instance using the environment variables you set in the previous section. Let's examine the key components of this script:

import os
import oracledb

# Read connection details from the environment variables set in the previous section
db_username = os.environ["ORACLE_DB_USERNAME"]
db_password = os.environ["ORACLE_DB_PASSWORD"]
db_connect_string = os.environ["ORACLE_DB_CONNECT_STRING"]
db_wallet_path = os.environ["ORACLE_DB_WALLET_PATH"]

# Connect to Oracle Database 23ai
with oracledb.connect(
    user=db_username,
    password=db_password,
    dsn=db_connect_string,
    config_dir=db_wallet_path,
    wallet_location=db_wallet_path,
    wallet_password=db_password
) as connection:
    ...  # schema creation operations go here


Creating Tables for Video Embeddings

The script creates a video_embeddings table with the following structure:

CREATE TABLE video_embeddings (
    id VARCHAR2(100) PRIMARY KEY,
    video_file VARCHAR2(1000),
    start_time NUMBER,
    end_time NUMBER,
    embedding_vector VECTOR(1024, float64)
)

This table includes:

  • id: A unique identifier for each embedding

  • video_file: The source video filename or path

  • start_time and end_time: Timestamp markers for the video segment

  • embedding_vector: A 1024-dimensional vector using float64 precision to store the Twelve Labs embedding
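Because the column is declared as VECTOR(1024, float64), an insert with the wrong dimensionality fails at the database layer. A small illustrative guard (not part of the tutorial scripts) that packs an embedding into the array type python-oracledb accepts for VECTOR columns:

```python
import array

def to_db_vector(values, dim=1024):
    """Pack floats into a float64 array sized for a VECTOR(1024, float64) column."""
    vec = array.array("d", values)
    if len(vec) != dim:
        raise ValueError(f"Expected {dim} dimensions, got {len(vec)}")
    return vec
```

Validating dimensions in Python gives a clearer error message than the ORA- error the database would raise on a mismatched insert.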


Setting Up Vector Indexes

Once table creation is complete, we call create_vector_index to create a vector index for similarity search. We use cosine similarity (DISTANCE COSINE) with a target accuracy of 95%.

def create_vector_index(cursor):
    cursor.execute("""
        CREATE VECTOR INDEX video_embeddings_idx 
        ON video_embeddings(embedding_vector) 
        ORGANIZATION NEIGHBOR PARTITIONS
        DISTANCE COSINE
        WITH TARGET ACCURACY 95
    """)

Running the Schema Creation Script

Execute the schema creation script:

python3 create_schema_video_embeddings.py

The script will:

  1. Connect to your Oracle database

  2. Drop the existing table if it exists

  3. Create a new video_embeddings table

  4. Create the specified vector index

  5. Confirm successful creation with console output

Once the schema setup is complete, we have the foundation for storing and querying TwelveLabs video embeddings in our Oracle Database 23ai instance.


5 - Store Video Embeddings

After setting up the database schema, the next step is to process videos through TwelveLabs Embed API and store the resulting embeddings in Oracle Database. The store_video_embeddings.py script handles this workflow end to end, managing both embedding generation and database storage.


Understanding the Script Workflow

The script performs several key operations:

  • Connects to Oracle Database using your environment variables

  • Initializes the TwelveLabs client with your API key

  • Creates embeddings for videos using TwelveLabs' Marengo model

  • Stores the embeddings in your database table

  • Maintains a record of processed videos to avoid redundant processing


Key Functions

Creating Embeddings

def create_video_embeddings(client, video_file):
    """Create embeddings for a video file using Twelve Labs Marengo"""
    task = client.embed.task.create(
        model_name=EMBEDDING_MODEL,
        video_file=video_file,
        video_clip_length=SEGMENT_DURATION
    )
    print(f"Created task: id={task.id} model_name={EMBEDDING_MODEL} status={task.status}")

    status = task.wait_for_done(
        sleep_interval=2,
        callback=on_task_update
    )
    print(f"Embedding done: {status}")
    
    return task.id

This function submits a video to TwelveLabs, which processes it using the Marengo 2.7 model. The video is segmented into 6-second clips (configurable via the SEGMENT_DURATION constant), and each segment receives its own embedding vector. The function returns a task ID that's used to retrieve the embeddings once processing completes.
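Since segments tile the video at the configured clip length, the number of embeddings per video is easy to estimate up front; a 45-second video at 6-second clips yields 8 segments, the last one shorter. A back-of-the-envelope helper (illustrative, not from the script):

```python
import math

def expected_segment_count(duration_sec, clip_length_sec=6):
    """Rough number of fixed-length clips a video will be split into."""
    return math.ceil(duration_sec / clip_length_sec)

print(expected_segment_count(45))  # -> 8
```

Multiplying this by your library size gives a quick estimate of how many 1024-dimensional rows the video_embeddings table will hold.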


Storing Embeddings

def store_embeddings_in_db(connection, task_id, video_file):
    """Store video embeddings in Oracle DB"""
    # Retrieve the completed embedding task from TwelveLabs
    task = twelvelabs_client.embed.task.retrieve(task_id)
    
    # Verify the task produced segment embeddings
    if not task.video_embedding or not task.video_embedding.segments:
        print("No embeddings found")
        return
    
    insert_sql = """
    INSERT INTO video_embeddings (
        id, video_file, start_time, end_time, embedding_vector
    ) VALUES (
        :1, :2, :3, :4, :5
    )"""
    
    BATCH_SIZE = 1000
    data_batch = []
    
    # Process in batches of 1000 for efficiency
    with connection.cursor() as cursor:
        for idx, segment in enumerate(task.video_embedding.segments):
            id = f"{task_id}_{idx}"
            vector = array.array("f", segment.embeddings_float)
            
            data_batch.append([
                id,
                video_file,
                segment.start_offset_sec,
                segment.end_offset_sec,
                vector
            ])
            
            # Execute and commit every BATCH_SIZE rows
            if len(data_batch) >= BATCH_SIZE:
                print("insert data")
                cursor.executemany(insert_sql, data_batch)
                connection.commit()
                data_batch = []
        
        # Insert any remaining rows
        if data_batch:
            print("insert data final")
            cursor.executemany(insert_sql, data_batch)
            connection.commit()

    print(f"Stored {len(task.video_embedding.segments)} embeddings in database")

This function retrieves the completed embeddings from TwelveLabs and stores them in Oracle. Each embedding is stored with metadata including:

  • A unique ID combining the task ID and segment index

  • The source video filename

  • Start and end timestamps for the video segment

  • The embedding vector itself (1024 dimensions)

The function processes embeddings in batches of 1000 for optimal database performance.
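The batch-and-commit cadence above can be isolated into a small generator, which makes the logic easy to test without a live database (a sketch; the actual script inlines this loop):

```python
def batch_rows(rows, batch_size=1000):
    """Yield lists of at most batch_size rows, mirroring the insert loop above."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) >= batch_size:
            yield batch
            batch = []
    if batch:  # final partial batch
        yield batch
```

Each yielded batch corresponds to one executemany call followed by a commit.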

Task ID Management

def load_task_ids():
    """Load existing task IDs from JSON file"""
    try:
        with open('video_task_ids.json', 'r') as f:
            return json.load(f)
    except FileNotFoundError:
        return {}

The script maintains a video_task_ids.json file that maps video paths to their TwelveLabs task IDs. This allows the script to skip reprocessing videos that have already been embedded, saving both time and API costs. If you need to re-embed a video, simply delete its entry from this file or delete the file entirely.
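load_task_ids implies a matching writer that persists the map after each successful embedding. A likely shape, shown here as a sketch (the function name is an assumption, not taken from the script):

```python
import json

def save_task_ids(task_ids, path="video_task_ids.json"):
    """Persist the video-path -> task-ID map used to skip reprocessing."""
    with open(path, "w") as f:
        json.dump(task_ids, f, indent=2)
```

Writing the file after every video, rather than once at the end, means an interrupted run loses at most one task ID.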


Running the Script

You can use the script in two ways:

For a single video file:

python3 store_video_embeddings.py /path/to/video.mp4

For a directory of videos:

python3 store_video_embeddings.py /path/to/video_directory

When processing a directory, the script automatically filters for common video file extensions (.mp4, .avi, .mov, .mkv, .webm) and processes each matching file.
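The directory filter described above can be sketched as follows (the helper name is illustrative; the extension list comes from the text):

```python
from pathlib import Path

# Extensions listed above, matched case-insensitively.
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv", ".webm"}

def find_video_files(directory):
    """Return sorted video files directly inside a directory."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in VIDEO_EXTENSIONS
    )
```

Lower-casing the suffix ensures files like CLIP.MOV are picked up as well.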


Monitoring Progress

The script provides real-time updates on:

  • Connection status to Oracle Database

  • Task creation and status updates from TwelveLabs

  • Embedding storage progress, including batch insert confirmations

  • Completion summaries showing the number of embeddings stored

Once the script completes, your video embeddings are stored in the video_embeddings table and ready for similarity searches, which we'll explore in the next section.


6 - Query Video Embeddings

After storing your video embeddings in Oracle Database, the real power comes from being able to search through your video content using natural language. The query_video_embeddings.py script enables semantic searching across your video repository, finding moments that match the meaning of your queries rather than just keywords.


Basic Similarity Search

The core of our search functionality relies on the following SQL query within the script:

SELECT video_file, start_time, end_time
FROM video_embeddings
ORDER BY vector_distance(embedding_vector, :1, COSINE)
FETCH FIRST :2 ROWS ONLY

This query uses Oracle's native vector_distance function with the COSINE metric to compare your query embedding against all stored video embeddings. The results are ordered by similarity; the :2 bind variable, set from the script's TOP_K constant, limits output to the closest matches (2 in this tutorial).
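For intuition, COSINE distance is 1 minus cosine similarity: 0 for vectors pointing the same way, 1 for orthogonal vectors, 2 for opposite ones. A pure-Python equivalent of what the database computes (illustrative only; Oracle evaluates this natively and accelerates it with the vector index):

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity, the metric used by vector_distance(..., COSINE)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # orthogonal -> 1.0
```

Because cosine distance depends only on direction, not magnitude, it suits embedding comparisons where vector length carries no meaning.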


Key Functions Explained

The script contains several important functions that work together to provide efficient video search:

Text-to-Embedding Conversion

def similarity_search(connection, query_text):
    # Create embedding for query
    embedding = twelvelabs_client.embed.create(
        model_name=EMBEDDING_MODEL,
        text=query_text,
        text_truncate="start",
    )
    
    if len(embedding.text_embedding.segments) > 1:
        print(f"Warning: Query generated {len(embedding.text_embedding.segments)} segments. Using only the first segment.")
    
    query_vector = array.array("d", embedding.text_embedding.segments[0].embeddings_float)
    
    # Search query
    search_sql = """
    SELECT video_file, start_time, end_time
    FROM video_embeddings
    ORDER BY vector_distance(embedding_vector, :1, COSINE)
    FETCH FIRST :2 ROWS ONLY
    """
    
    results = []
    cursor = connection.cursor()
    cursor.execute(search_sql, [query_vector, TOP_K])
    for row in cursor:
        results.append({
            'video_file': row[0],
            'start_time': row[1],
            'end_time': row[2]
        })
    cursor.close()
    return results

This function sends your text query to Twelve Labs' API, which returns an embedding vector that captures the semantic meaning of your text. The script automatically handles truncation for longer queries by preserving the beginning of the text (using text_truncate="start").

Multi-Query Processing

def similarity_search_multiple(connection, query_texts, batch_size=1000):
    """Perform multiple similarity searches using a list of query texts in batches"""
    results_by_query = {}
    
    # Process queries in batches
    for i in range(0, len(query_texts), batch_size):
        batch_queries = query_texts[i:i + batch_size]
        print(f"\nProcessing batch {i//batch_size + 1} ({len(batch_queries)} queries)")
        
        # Create embeddings for batch queries
        embeddings = []
        for query_text in batch_queries:
            embedding = twelvelabs_client.embed.create(
                model_name=EMBEDDING_MODEL,
                text=query_text,
                text_truncate="start",
            )
            
            if len(embedding.text_embedding.segments) > 1:
                print(f"Warning: Query '{query_text}' generated {len(embedding.text_embedding.segments)} segments. Using only the first segment.")
            
            query_vector = array.array("d", embedding.text_embedding.segments[0].embeddings_float)
            embeddings.append(query_vector)
            
        # Search query
        search_sql = """
        SELECT video_file, start_time, end_time
        FROM video_embeddings
        ORDER BY vector_distance(embedding_vector, :1, COSINE)
        FETCH FIRST :2 ROWS ONLY
        """
        
        with connection.cursor() as cursor:
            for query_text, query_vector in zip(batch_queries, embeddings):
                results = []
                for row in cursor.execute(search_sql, [query_vector, TOP_K]):
                    results.append({
                        'video_file': row[0],
                        'start_time': row[1],
                        'end_time': row[2]
                    })
                results_by_query[query_text] = results
    
    return results_by_query

For efficiency when searching with multiple queries, this function processes them in batches, reusing a single database cursor per batch to cut connection overhead (each query still requires its own embedding API call). It maintains a dictionary mapping each query to its results for easy retrieval and display.

Database Connection Management

def query_video_embeddings(query_text):
    connection = oracledb.connect(
        user=db_username,
        password=db_password,
        dsn=db_connect_string,
        config_dir=db_wallet_path,
        wallet_location=db_wallet_path,
        wallet_password=db_password
    )
    
    # Verify DB version
    db_version = tuple(int(s) for s in connection.version.split("."))[:2]
    if db_version < (23, 7):
        sys.exit("This example requires Oracle Database 23.7 or later")
    print("Connected to Oracle Database")
    
    print("\nSearching for relevant video segments...")
    results = similarity_search(connection, query_text)
    
    print("\nResults:")
    print("========")
    for r in results:
        print(f"Video: {r['video_file']}")
        print(f"Segment: {r['start_time']:.1f}s to {r['end_time']:.1f}s\n")

The script handles all the database connection details using your environment variables, ensuring secure and efficient access to your Oracle Database instance.


Running the Script

To search for video segments matching your queries, run the script with one or more text queries as arguments:

python3 query_video_embeddings.py "people dancing at a party" "someone explaining AI concepts"

The script accepts any number of queries and processes them efficiently. Each query is:

  1. Converted to an embedding vector using Twelve Labs' Marengo-2.7 model

  2. Compared against all video segment embeddings using cosine similarity

  3. Matched with the top 2 most semantically similar video segments


Understanding the Results

The script outputs results in an easy-to-read format:

Connected to Oracle Database

Searching for relevant video segments...

Results:
========

Query: 'people dancing at a party'
----------------------------------
Video: birthday_celebration.mp4
Segment: 15.0s to 21.0s

Video: summer_festival.mp4
Segment: 45.5s to 51.5s

Query: 'someone explaining AI concepts'
--------------------------------------
Video: tech_lecture.mp4
Segment: 120.0s to 126.0s

Video: developer_conference.mp4
Segment: 75.5s to 81.5s

For each query, you'll see the most relevant video segments listed with their filenames and precise timestamps. This allows you to quickly locate and view the specific parts of videos that match your search criteria.

The cosine similarity metric ensures that results are based on semantic meaning rather than exact keyword matches. This means you can find relevant content even when the exact words in your query don't appear in the video's image frames, speech or captions.


7 - Conclusion

Our Embed API's integration with Oracle Database 23ai marks a significant milestone in video understanding and search capabilities. By combining our advanced multimodal embeddings with Oracle's robust vector search functionality, we enable developers to build sophisticated video applications that process and understand video content with human-like perception.


Business Value

Our solution delivers several key benefits for businesses:

  • Enhanced Video Search: Our semantic search capabilities enable users to find relevant video segments based on meaning rather than keywords, improving user engagement and content discovery.

  • Unified Infrastructure: Working with Oracle's converged database strategy, we help eliminate the need for separate specialized databases, reducing complexity and costs while enhancing scalability and security.

  • Innovation and Leadership: Through our cutting-edge AI technologies, organizations can differentiate themselves in the market and establish leadership in video content management and analysis.


Technical Advantages

Our integration provides developers with:

  • Streamlined Development: Access to our advanced video understanding capabilities through a unified platform for both relational and vector data, simplifying development workflows.

  • High-Performance Search: Our state-of-the-art embeddings, combined with Oracle's vector indexes, enable fast and accurate similarity searches across large video libraries.

  • Scalability and Reliability: Our video foundation models, trained on Oracle Cloud Infrastructure (OCI), ensure enterprise-grade reliability and scalability.


The "Better Together" Story

Our partnership with Oracle runs deep: we've leveraged Oracle Cloud Infrastructure to train our video foundation models, enabling us to develop models that understand videos as humans do, and beyond. By integrating our Embed API with Oracle Database 23ai, we're demonstrating how our complementary technologies create powerful solutions for our customers.

In conclusion, our Embed API's integration with Oracle Database 23ai represents a breakthrough for businesses and developers looking to unlock the full potential of video content. This partnership exemplifies how our advanced video understanding technology, combined with Oracle's enterprise-grade infrastructure, can transform industries.



We'd like to give a huge thanks to Maxwell Bauman, Douglas Hood, Sean Stacey, Malu Castellanos, and other members from the Oracle team for working closely with us on the integration and the blog post!


The convergence of multimodal video understanding and enterprise-grade vector search capabilities represents a significant advancement in Generative AI applications. This tutorial explores the integration of TwelveLabs' cutting-edge Embed API with Oracle Database 23ai's AI Vector Search, providing developers with a powerful solution for storing, analyzing, and retrieving video content through semantic understanding.

By following this guide, you'll learn how to harness these complementary technologies to build sophisticated video search and analysis applications that capture the richness of visual, audio, and contextual elements while benefiting from a robust database infrastructure.


1 - Introduction


TwelveLabs Embed API

Our Embed API revolutionizes video understanding by transforming video data into sophisticated vector representations (i.e., embeddings). Unlike traditional approaches that focus on single modalities, our multimodal embedding technology seamlessly captures visual expressions, spoken words, body language, and contextual relationships. We treat video as a native medium with temporal dynamics, moving beyond simple frame-by-frame processing.

Our unified vector space technology consolidates multiple modalities into a cohesive representation that preserves the rich interactions between speech, visuals, and context. This enables nuanced search capabilities while significantly reducing processing time for large video libraries.

We also provide flexible video segmentation options, allowing you to create multiple embeddings from different video segments or a single embedding for an entire video. This enables precise retrieval of specific moments within videos.


Oracle 23ai Vector Search Features

Our partner Oracle incorporates AI Vector Search as a core feature of its converged database strategy with Oracle Database 23ai. Oracle’s approach provides native vector storage and similarity searches using specialized indexes and SQL functions. Unlike standalone vector databases that create silos, Oracle integrates vector functionality directly into the relational database, allowing seamless queries across structured and unstructured data.

The system supports multiple distance metrics (Cosine, Euclidean, Euclidean Squared, Dot Product, Manhattan, Hamming and Jaccard), giving flexibility in defining similarity based on use case needs. Optimized for scale, vector searches remain efficient even with millions of records, making it production-ready.

By integrating vectors within Oracle's established infrastructure, developers can use familiar SQL syntax and existing security models with cutting-edge AI functionality. This eliminates the need for separate specialized systems, reducing overall data architecture complexity.


The Business Value of Our Integration

Our integration with Oracle Database 23ai delivers significant business value through simplified architecture that unifies vector embeddings and relational data storage. This eliminates synchronization between separate systems, reducing inconsistencies and maintenance costs. Additionally, it reduces query complexity (separate queries to the relational DB and the vector store and then merging the results), and latency as results come faster.

  • Our solution enables semantic video search where users find content based on meaning rather than keywords, improving search accuracy and supporting cross-modal queries using text or visual content.

  • By leveraging Oracle's enterprise-grade reliability, security, and scalability, we provide a robust foundation for production deployments. Organizations can implement our video understanding technology with confidence using Oracle Database technology designed for mission-critical workloads, with vector operations remaining efficient as data volumes increase.

  • Our integration also enables advanced analytics for video content, allowing organizations to discover patterns and insights impossible to detect through manual review or traditional processing.

This tutorial covers practical implementation from infrastructure setup to building sophisticated video search applications, providing comprehensive knowledge to create intelligent video applications that deliver real business value.


2 - Create an Oracle Autonomous Database

Setting Up Oracle Cloud

Sign up for an Oracle Cloud Free Tier account at https://www.oracle.com/cloud/free/. You'll get $300 in free credits for 30 days plus Always Free services. Credit card required for verification only.

Create Database

  1. Log in to https://cloud.oracle.com/db/adb

  2. Navigate to Oracle Database → Autonomous Data Warehouse

  3. Click "Create Autonomous Database"

Configure Settings

  • Display name: "VideoEmbeddingsDB" (14 chars max)

  • Workload type: Data Warehouse

  • Deployment: Serverless

Essential Configurations

  • Database version: Select "23ai" for vector capabilities

  • Choose "Always Free" (2 OCPUs, 20GB storage)

Note: Database auto-stops after 7 days of inactivity, but will exist forever (for free) if you continue to use it within a 30 day period.

Set Admin Password

Create a strong password (12-30 chars, uppercase, lowercase, number, special character) for the ADMIN user.

Network Access

For development: Select "Allow secure access from everywhere"

Create Database

Click "Create Autonomous Database". Provisioning takes 2-5 minutes. Wait for status "Available".

Get Connection Wallet

  • Go to database details page

  • Click "Database Connection" → "Download Wallet"

  • Set wallet password

  • Save wallet ZIP file securely

  • Use ORACLE_DB_WALLET_PATH in your application

Your database is now ready for integrating with Twelve Labs Embed API. Next, we'll set up the required environment variables.


3 - Prerequisites and Set Environment Variables

Before integrating TwelveLabs Embed API with Oracle Database 23ai, ensure you have the necessary tools and credentials set up. This section covers the required software installation and environment configuration.


Required Software Installation

To successfully complete this tutorial, you'll need:

  • Oracle Database 23.4 or later with AI Vector Search capabilities

  • Python 3.8+ installed on your development machine

  • Oracle Client libraries for database connectivity

  • TwelveLabs API key for accessing the Embed API

Install the required Python packages:


The oracledb package provides Python connectivity to Oracle Database, while the twelvelabs package offers a convenient interface to the Twelve Labs API services.


Environment Configuration

Set up the following environment variables to securely store your connection credentials:

export ORACLE_DB_USERNAME=your_username
export ORACLE_DB_PASSWORD=your_password
export ORACLE_DB_CONNECT_STRING=your_connect_string
export ORACLE_DB_WALLET_PATH=/path/to/wallet
export TWELVE_LABS_API_KEY

Replace the placeholder values with your actual credentials:

  • your_username: The database username (typically ADMIN for a new Autonomous Database)

  • your_password: The password you created during database provisioning

  • your_connect_string: The service name from the tnsnames.ora file in your wallet

  • /path/to/wallet: The directory path where you extracted your Oracle wallet files

  • your_api_key: Your TwelveLabs API key obtained from the developer portal


4 - Setup Database Schema

In this section, we'll create the necessary database schema to store and query video embeddings from TwelveLabs. We'll use a Python script to establish the connection and create our table structure with an appropriate vector index.


Connecting to Oracle Database

The create_schema_video_embeddings.py script handles the connection to your Oracle Database instance using the environment variables you set in the previous section. Let's examine the key components of this script:

import oracledb

# Connect to Oracle Database 23.7
with oracledb.connect(
    user=db_username,
    password=db_password,
    dsn=db_connect_string,
    config_dir=db_wallet_path,
    wallet_location=db_wallet_path,
    wallet_password=db_password
) as connection:
    # Script operations will go here


Creating Tables for Video Embeddings

The script creates a video_embeddings table with the following structure:

CREATE TABLE video_embeddings (
    id VARCHAR2(100) PRIMARY KEY,
    video_file VARCHAR2(1000),
    start_time NUMBER,
    end_time NUMBER,
    embedding_vector VECTOR(1024, float64)
)

This table includes:

  • id: A unique identifier for each embedding

  • video_file: The source video filename or path

  • start_time and end_time: Timestamp markers for the video segment

  • embedding_vector: A 1024-dimensional vector using float64 precision to store the Twelve Labs embedding


Setting Up Vector Indexes

Once table creation is complete, we call create_vector_index to create a vector index for similarity search. We use cosine similarity (DISTANCE COSINE) with a target accuracy of 95%.

def create_vector_index(cursor):
    cursor.execute("""
        CREATE VECTOR INDEX video_embeddings_idx 
        ON video_embeddings(embedding_vector) 
        ORGANIZATION NEIGHBOR PARTITIONS
        DISTANCE COSINE
        WITH TARGET ACCURACY 95
    """)

Running the Schema Creation Script

Execute the schema creation script:

The script will:

  1. Connect to your Oracle database

  2. Drop the existing table if it exists

  3. Create a new video_embeddings table

  4. Create the specified vector index

  5. Confirm successful creation with console output

Once the schema setup is complete, we have the foundation for storing and querying TwelveLabs video embeddings in our Oracle Database 23ai instance.


5 - Store Video Embeddings

After setting up the database schema, the next step is to process videos through TwelveLabs Embed API and store the resulting embeddings in Oracle Database. The store_video_embeddings.py script handles this process entirely, managing both embedding generation and database storage.


Understanding the Script Workflow

The script performs several key operations:

  • Connects to Oracle Database using your environment variables

  • Initializes the TwelveLabs client with your API key

  • Creates embeddings for videos using TwelveLabs' Marengo model

  • Stores the embeddings in your database table

  • Maintains a record of processed videos to avoid redundant processing


Key Functions

Creating Embeddings

def create_video_embeddings(client, video_file):
    """Create embeddings for a video file using Twelve Labs Marengo"""
    task = client.embed.task.create(
        model_name=EMBEDDING_MODEL,
        video_file=video_file,
        video_clip_length=SEGMENT_DURATION
    )
    print(f"Created task: id={task.id} model_name={EMBEDDING_MODEL} status={task.status}")

    status = task.wait_for_done(
        sleep_interval=2,
        callback=on_task_update
    )
    print(f"Embedding done: {status}")
    
    return task.id

This function submits a video to TwelveLabs, which processes it using the Marengo 2.7 model. The video is segmented into 6-second clips (configurable via the SEGMENT_DURATION constant), and each segment receives its own embedding vector. The function returns a task ID that's used to retrieve the embeddings once processing completes.


Storing Embeddings

def store_embeddings_in_db(connection, task_id, video_file):
    """Store video embeddings in Oracle DB"""
    # Get embeddings from the task
    task = twelvelabs_client.embed.task.retrieve(task_id)
    
    # Get embeddings from the task
    if not task.video_embedding or not task.video_embedding.segments:
        print("No embeddings found")
        return
    
    insert_sql = """
    INSERT INTO video_embeddings (
        id, video_file, start_time, end_time, embedding_vector
    ) VALUES (
        :1, :2, :3, :4, :5
    )"""
    
    BATCH_SIZE = 1000
    data_batch = []
    
    # Process in batches of 1000 for efficiency
    with connection.cursor() as cursor:
        for idx, segment in enumerate(task.video_embedding.segments):
            id = f"{task_id}_{idx}"
            vector = array.array("f", segment.embeddings_float)
            
            data_batch.append([
                id,
                video_file,
                segment.start_offset_sec,
                segment.end_offset_sec,
                vector
            ])
            
            # Execute and commit every BATCH_SIZE rows
            if len(data_batch) >= BATCH_SIZE:
                print("insert data")
                cursor.executemany(insert_sql, data_batch)
                connection.commit()
                data_batch = []
        
        # Insert any remaining rows
        if data_batch:
            print("insert data final")
            cursor.executemany(insert_sql, data_batch)
            connection.commit()

    print(f"Stored {len(task.video_embedding.segments)} embeddings in database")

This function retrieves the completed embeddings from TwelveLabs and stores them in Oracle. Each embedding is stored with metadata including:

  • A unique ID combining the task ID and segment index

  • The source video filename

  • Start and end timestamps for the video segment

  • The embedding vector itself (1024 dimensions)

The function processes embeddings in batches of 1000 for optimal database performance.

Task ID Management

def load_task_ids():
    """Load existing task IDs from JSON file"""
    try:
        with open('video_task_ids.json', 'r') as f:
            return json.load(f)
    except FileNotFoundError:
        return {}

The script maintains a video_task_ids.json file that maps video paths to their TwelveLabs task IDs. This allows the script to skip reprocessing videos that have already been embedded, saving both time and API costs. If you need to re-embed a video, simply delete its entry from this file or delete the file entirely.


Running the Script

You can use the script in two ways:

For a single video file:

For a directory of videos:

When processing a directory, the script automatically filters for common video file extensions (.mp4, .avi, .mov, .mkv, .webm) and processes each matching file.
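The directory filter described above can be sketched as follows (the function name find_video_files is illustrative, not necessarily the one used in the script):

```python
from pathlib import Path

# Extensions matched by the script's directory mode
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv", ".webm"}

def find_video_files(directory):
    """Return the video files in a directory, matching the script's filter."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in VIDEO_EXTENSIONS
    )
```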


Monitoring Progress

The script provides real-time updates on:

  • Connection status to Oracle Database

  • Task creation and status updates from TwelveLabs

  • Embedding storage progress, including batch insert confirmations

  • Completion summaries showing the number of embeddings stored

Once the script completes, your video embeddings are stored in the video_embeddings table and ready for similarity searches, which we'll explore in the next section.


6 - Query Video Embeddings

After storing your video embeddings in Oracle Database, the real power comes from being able to search through your video content using natural language. The query_video_embeddings.py script enables semantic searching across your video repository, finding moments that match the meaning of your queries rather than just keywords.


Basic Similarity Search

The core of our search functionality relies on the following SQL query within the script:

SELECT video_file, start_time, end_time
FROM video_embeddings
ORDER BY vector_distance(embedding_vector, :1, COSINE)
FETCH FIRST :2 ROWS ONLY

This query utilizes Oracle's native vector_distance function with the COSINE similarity metric to compare your query embedding against every stored video embedding. Results are ordered by ascending distance (most similar first) and limited to the top K rows, where K is supplied through the :2 bind variable (the TOP_K constant in the script).
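Under the COSINE metric, vector_distance returns 1 minus the cosine similarity, so smaller distances mean semantically closer content. A pure-Python equivalent of the distance computation, shown here for intuition only since the database computes it natively:

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity, mirroring vector_distance(x, y, COSINE)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)
```

Identical directions score 0, orthogonal vectors score 1, and magnitude is ignored, which is why cosine distance works well for comparing embeddings.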


Key Functions Explained

The script contains several important functions that work together to provide efficient video search:

Text-to-Embedding Conversion

def similarity_search(connection, query_text):
    # Create embedding for query
    embedding = twelvelabs_client.embed.create(
        model_name=EMBEDDING_MODEL,
        text=query_text,
        text_truncate="start",
    )
    
    if len(embedding.text_embedding.segments) > 1:
        print(f"Warning: Query generated {len(embedding.text_embedding.segments)} segments. Using only the first segment.")
    
    query_vector = array.array("d", embedding.text_embedding.segments[0].embeddings_float)
    
    # Search query
    search_sql = """
    SELECT video_file, start_time, end_time
    FROM video_embeddings
    ORDER BY vector_distance(embedding_vector, :1, COSINE)
    FETCH FIRST :2 ROWS ONLY
    """
    
    results = []
    cursor = connection.cursor()
    cursor.execute(search_sql, [query_vector, TOP_K])
    for row in cursor:
        results.append({
            'video_file': row[0],
            'start_time': row[1],
            'end_time': row[2]
        })
    cursor.close()
    return results

This function sends your text query to Twelve Labs' API, which returns an embedding vector that captures the semantic meaning of your text. The script automatically handles truncation for longer queries by preserving the beginning of the text (using text_truncate="start").

Multi-Query Processing

def similarity_search_multiple(connection, query_texts, batch_size=1000):
    """Perform multiple similarity searches using a list of query texts in batches"""
    results_by_query = {}
    
    # Process queries in batches
    for i in range(0, len(query_texts), batch_size):
        batch_queries = query_texts[i:i + batch_size]
        print(f"\nProcessing batch {i//batch_size + 1} ({len(batch_queries)} queries)")
        
        # Create embeddings for batch queries
        embeddings = []
        for query_text in batch_queries:
            embedding = twelvelabs_client.embed.create(
                model_name=EMBEDDING_MODEL,
                text=query_text,
                text_truncate="start",
            )
            
            if len(embedding.text_embedding.segments) > 1:
                print(f"Warning: Query '{query_text}' generated {len(embedding.text_embedding.segments)} segments. Using only the first segment.")
            
            query_vector = array.array("d", embedding.text_embedding.segments[0].embeddings_float)
            embeddings.append(query_vector)
            
        # Search query
        search_sql = """
        SELECT video_file, start_time, end_time
        FROM video_embeddings
        ORDER BY vector_distance(embedding_vector, :1, COSINE)
        FETCH FIRST :2 ROWS ONLY
        """
        
        with connection.cursor() as cursor:
            for query_text, query_vector in zip(batch_queries, embeddings):
                results = []
                for row in cursor.execute(search_sql, [query_vector, TOP_K]):
                    results.append({
                        'video_file': row[0],
                        'start_time': row[1],
                        'end_time': row[2]
                    })
                results_by_query[query_text] = results
    
    return results_by_query

For efficiency when searching with multiple queries, this function processes them in batches over a single database connection, reusing one cursor per batch. It returns a dictionary mapping each query to its results for easy retrieval and display.
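The returned dictionary can then be rendered in the report format shown in the results section below, for example with a small helper like this (hypothetical, not part of the script):

```python
def print_results(results_by_query):
    """Print each query's matches in a readable report format."""
    for query, results in results_by_query.items():
        header = f"Query: '{query}'"
        print(header)
        print("-" * len(header))
        for r in results:
            print(f"Video: {r['video_file']}")
            print(f"Segment: {r['start_time']:.1f}s to {r['end_time']:.1f}s\n")
```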

Database Connection Management

def query_video_embeddings(query_text):
    connection = oracledb.connect(
        user=db_username,
        password=db_password,
        dsn=db_connect_string,
        config_dir=db_wallet_path,
        wallet_location=db_wallet_path,
        wallet_password=db_password
    )
    
    # Verify DB version
    db_version = tuple(int(s) for s in connection.version.split("."))[:2]
    if db_version < (23, 7):
        sys.exit("This example requires Oracle Database 23.7 or later")
    print("Connected to Oracle Database")
    
    print("\nSearching for relevant video segments...")
    results = similarity_search(connection, query_text)
    
    print("\nResults:")
    print("========")
    for r in results:
        print(f"Video: {r['video_file']}")
        print(f"Segment: {r['start_time']:.1f}s to {r['end_time']:.1f}s\n")

The script handles all the database connection details using your environment variables, ensuring secure and efficient access to your Oracle Database instance.


Running the Script

To search for video segments matching your queries, run the script with one or more text queries as arguments:

python3 query_video_embeddings.py "people dancing at a party" "someone explaining AI concepts"

The script accepts any number of queries and processes them efficiently. Each query is:

  1. Converted to an embedding vector using Twelve Labs' Marengo-2.7 model

  2. Compared against all video segment embeddings using cosine similarity

  3. Matched with the TOP_K most semantically similar video segments (two per query in the sample output below)


Understanding the Results

The script outputs results in an easy-to-read format:

Connected to Oracle Database

Searching for relevant video segments...

Results:
========

Query: 'people dancing at a party'
----------------------------------
Video: birthday_celebration.mp4
Segment: 15.0s to 21.0s

Video: summer_festival.mp4
Segment: 45.5s to 51.5s

Query: 'someone explaining AI concepts'
--------------------------------------
Video: tech_lecture.mp4
Segment: 120.0s to 126.0s

Video: developer_conference.mp4
Segment: 75.5s to 81.5s

For each query, you'll see the most relevant video segments listed with their filenames and precise timestamps. This allows you to quickly locate and view the specific parts of videos that match your search criteria.
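To jump straight to a result in a player, the second-based offsets convert easily to a minutes:seconds display (a small illustrative helper, not part of the script):

```python
def to_timestamp(seconds):
    """Format a second offset as M:SS for display or player seeking."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}:{secs:02d}"
```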

The cosine similarity metric ensures that results are based on semantic meaning rather than exact keyword matches, so you can find relevant content even when the words in your query never appear in the video's image frames, speech, or captions.


7 - Conclusion

Our Embed API's integration with Oracle Database 23ai marks a significant milestone in video understanding and search capabilities. By combining our advanced multimodal embeddings with Oracle's robust vector search functionality, we enable developers to build sophisticated video applications that process and understand video content with human-like perception.


Business Value

Our solution delivers several key benefits for businesses:

  • Enhanced Video Search: Our semantic search capabilities enable users to find relevant video segments based on meaning rather than keywords, improving user engagement and content discovery.

  • Unified Infrastructure: Working with Oracle's converged database strategy, we help eliminate the need for separate specialized databases, reducing complexity and costs while enhancing scalability and security.

  • Innovation and Leadership: Through our cutting-edge AI technologies, organizations can differentiate themselves in the market and establish leadership in video content management and analysis.


Technical Advantages

Our integration provides developers with:

  • Streamlined Development: Access to our advanced video understanding capabilities through a unified platform for both relational and vector data, simplifying development workflows.

  • High-Performance Search: Our state-of-the-art embeddings, combined with Oracle's vector indexes, enable fast and accurate similarity searches across large video libraries.

  • Scalability and Reliability: Our video foundation models, trained on Oracle Cloud Infrastructure (OCI), ensure enterprise-grade reliability and scalability.


The "Better Together" Story

Our partnership with Oracle runs deep - we've leveraged Oracle Cloud Infrastructure to train our video foundation models, enabling us to develop models that understand videos like humans do and beyond. By integrating our Embed API with Oracle Database 23ai, we're demonstrating how our complementary technologies create powerful solutions for our customers.

In conclusion, our Embed API's integration with Oracle Database 23ai represents a breakthrough for businesses and developers looking to unlock the full potential of video content. This partnership exemplifies how our advanced video understanding technology, combined with Oracle's enterprise-grade infrastructure, can transform industries.


Resources