Author
James Le, Manish Maheshwari
Date Published
July 26, 2024
Tags
Embeddings
Partnership
API Tutorial
Vector Database
Embed API
TLDR: Learn how to create a powerful semantic video search application by combining Twelve Labs' advanced multimodal embeddings with MongoDB Atlas Vector Search. This guide walks you through setup, embedding generation, data storage, index creation, and performing vector searches to unlock valuable insights from your video content. Big thanks to the MongoDB team (Soumya Pradhan, Ashwin Gangadhar, Rafael Liou, and Gregory Maxson) for collaborating with us on this integration guide.

1 - Introduction

In today's data-driven world, effective video search is crucial. Traditional search methods struggle with the complex nature of video data, which includes visual cues, body language, spoken words, and context. Semantic video search addresses this challenge.

Using advanced foundation models, semantic video search interprets video content deeply, enabling accurate and relevant search results. By employing embeddings—numerical representations of video content—it captures rich, multifaceted information, allowing for tasks like classification, clustering, and personalized recommendations.
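To make "similarity between embeddings" concrete, here is a minimal, self-contained sketch of cosine similarity, the metric used later when configuring the vector search index. The tiny 3-dimensional vectors are toy stand-ins for real 1024-dimensional video embeddings:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": vectors pointing in similar directions score near 1.0
clip_a = [0.9, 0.1, 0.0]
clip_b = [0.8, 0.2, 0.1]
print(cosine_similarity(clip_a, clip_b))  # close to 1.0 -> semantically similar
```

Vector search works by ranking stored embeddings according to exactly this kind of score against a query embedding.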

Twelve Labs' Embed API provides cutting-edge multimodal embeddings that capture video content interactions over time. Paired with MongoDB Atlas Vector Search, a scalable vector database, developers can create powerful semantic video search applications to gain insights from video data.

This guide explains how to build a semantic video search app using Twelve Labs Embed API and MongoDB Atlas Vector Search. You'll learn to set up your environment, generate video embeddings, store them in MongoDB, create a vector search index, and perform vector searches to retrieve relevant video content. By the end, you'll have a robust framework for enhancing video search workflows.

2 - Prerequisites

Before you begin building your semantic video search application using Twelve Labs Embed API and MongoDB Atlas Vector Search, ensure you have the following prerequisites in place:

  • Twelve Labs Account and API Key: Sign up for an account on Twelve Labs' platform and obtain your API key, which will be used to authenticate requests to the Embed API. This feature is currently in limited Beta and accessible exclusively to a select group of users. Please register on this waitlist to request access.
  • MongoDB Atlas Account and Cluster: Create an account on MongoDB Atlas and set up a cluster where you will store and manage your video embeddings.
  • Programming Environment Setup: Set up your development environment with the necessary tools and libraries. We recommend using Python for this implementation. Ensure you have Python installed, along with the following libraries:
    • twelvelabs: The official Twelve Labs Python SDK.
    • pymongo: The official MongoDB Python driver.
    • python-dotenv: For managing environment variables (optional but recommended).

3 - Setting Up the Environment

In this section, we will guide you through setting up your programming environment to interact with Twelve Labs Embed API and MongoDB Atlas.

3.1 - Installing Required Libraries

First, install the required Python libraries. You can do this using pip:

pip install twelvelabs pymongo python-dotenv

This will install the Twelve Labs SDK and the MongoDB Python driver, allowing you to interact with both services programmatically.

3.2 - Configuring Twelve Labs API Client

Next, configure your Twelve Labs API client. Create a .env file in your project directory to securely store your API key:

TL_API_KEY=your_twelve_labs_api_key

Then, create a Python script (e.g., config.py) to load the API key from the .env file:

import os
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

# Retrieve Twelve Labs API key
TL_API_KEY = os.getenv("TL_API_KEY")

Create a Twelve Labs client in a new Python script (e.g., twelvelabs_client.py):

from twelvelabs import TwelveLabs
from config import TL_API_KEY

# Initialize the Twelve Labs client
tl_client = TwelveLabs(api_key=TL_API_KEY)

3.3 - Connecting to MongoDB Atlas Cluster

To connect to your MongoDB Atlas cluster, you need the connection string provided by MongoDB. Store this connection string in your .env file:

MONGODB_URI=your_mongodb_connection_string

Then, update your config.py to load the MongoDB URI:

# Retrieve MongoDB URI
MONGODB_URI = os.getenv("MONGODB_URI")

Create a MongoDB client and connect to your cluster in a new Python script (e.g., mongo_client.py):

from pymongo import MongoClient
from config import MONGODB_URI

# Create a MongoDB client
mongo_client = MongoClient(MONGODB_URI)

# Connect to your database
db = mongo_client.your_database_name

3.4 - Verifying the Setup

To verify that everything is set up correctly, run a simple script to print the names of collections in your MongoDB database:

from mongo_client import db

# Print the names of collections in the database
print(db.list_collection_names())

If the script runs without errors and prints the collection names, your environment is set up correctly, and you are ready to proceed with uploading and creating video embeddings.

4 - Uploading and Creating Video Embeddings

In this section, we will guide you through the process of uploading videos to Twelve Labs and creating video embeddings using the Embed API. These embeddings will capture the rich multimodal context of your video content.

4.1 - Uploading a Video to Twelve Labs

To begin, you need to upload your video to Twelve Labs. Here’s how you can do it:

  1. Prepare the Video File: Ensure your video file is in a supported format and is accessible from your local machine or a URL.
  2. Create an Embedding Task: Use the Twelve Labs Embed API to create a new video embedding task. This task will handle the uploading and processing of your video to generate embeddings.

Here is a sample Python script to create an embedding task:

from twelvelabs import TwelveLabs
from config import TL_API_KEY
from twelvelabs.models.embed import EmbeddingsTask

# Initialize the Twelve Labs client
tl_client = TwelveLabs(api_key=TL_API_KEY)

# Create a video embedding task for the uploaded video
task = tl_client.embed.task.create(
    engine_name="Marengo-retrieval-2.6",
    video_url="your-video-url"
)
print(
    f"Created task: id={task.id} engine_name={task.engine_name} status={task.status}"
)

This creates the embedding task and starts processing the video at the specified URL.

4.2 - Monitoring Task Progress

To ensure that the video embedding task is completed, you need to monitor its status. Use the following code to periodically check the status of the embedding task:

# Monitor the status of the video embedding task
def on_task_update(task: EmbeddingsTask):
    print(f"  Status={task.status}")

status = task.wait_for_done(
    sleep_interval=2,
    callback=on_task_update
)
print(f"Embedding done: {status}")

The wait_for_done method polls the task every 2 seconds (as set by sleep_interval) and invokes the callback with each status update until the task completes or fails.
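If you prefer explicit control over polling (for example, to add logging or a timeout), the same behavior can be sketched as a manual loop built on the retrieve call shown in the next section. The terminal status strings "ready" and "failed" are assumptions here; check the SDK documentation for the exact values:

```python
import time

def poll_until_done(retrieve, task_id, sleep_seconds=2, max_polls=300):
    """Poll an embedding task until it reaches a terminal state.

    `retrieve` is any callable returning an object with a `.status`
    attribute -- e.g. tl_client.embed.task.retrieve. The terminal
    status names ("ready", "failed") are assumptions.
    """
    for _ in range(max_polls):
        task = retrieve(task_id)
        print(f"  Status={task.status}")
        if task.status in ("ready", "failed"):
            return task.status
        time.sleep(sleep_seconds)
    raise TimeoutError(f"Task {task_id} did not finish in time")
```

Passing the retrieve function in as an argument also makes the loop easy to unit-test with a stub client.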

5 - Retrieving and Storing Video Embeddings

After the video embedding task is completed, you can retrieve the embeddings and store them in MongoDB Atlas for efficient vector search and retrieval.

5.1 - Fetching Completed Embeddings

Use the following code to fetch the completed embeddings from Twelve Labs:

# Retrieve the video embeddings once the task is done
task_result = tl_client.embed.task.retrieve(task.id)

# Collect the returned embeddings into a list of MongoDB-ready documents
records = []
for video_embedding in task_result.video_embeddings:
    record = dict(video_embedding)
    record["embedding"] = record["embedding"].float
    records.append(record)

5.2 - Structuring Embedding Data for MongoDB

Prepare the embedding data for insertion into MongoDB. Each embedding can be stored as a document in a MongoDB collection:

from pymongo import MongoClient
from config import MONGODB_URI

# Create a MongoDB client and connect to the database
mongo_client = MongoClient(MONGODB_URI)
db = mongo_client["mydatabase"]
collection = db["embeddings"]

# Insert the embeddings into the MongoDB collection
collection.insert_many(records)
num_embeddings = collection.count_documents({})
print(f"Inserted {num_embeddings} embeddings into MongoDB.")

5.3 - Verifying Data Storage

To verify that the embeddings have been stored correctly, you can query the MongoDB collection and print some sample documents:

# Retrieve and print a sample document
sample_document = collection.find_one()
print("Sample document from MongoDB:", sample_document)

If the sample document contains an embedding array along with its metadata, the data has been stored correctly and is ready for indexing.

6 - Creating a Vector Search Index in Atlas

To enable efficient retrieval of video embeddings, you need to create a vector search index in MongoDB Atlas. This index will allow you to perform similarity searches on the stored embeddings.

6.1 - Defining the Index Configuration

First, define the configuration for your vector search index. MongoDB Atlas requires specific fields to be indexed for vector search. Here’s an example configuration:

{
  "fields": [{
    "type": "vector",
    "path": "embedding",
    "numDimensions": 1024,
    "similarity": "cosine"
  }]
}

This configuration specifies that the embedding field will be indexed for vector search, with a vector dimension of 1024.
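Because Atlas will only match vectors whose length equals numDimensions, it can be worth validating your records before insertion. A small, hypothetical helper (the field and constant names are illustrative):

```python
EXPECTED_DIMS = 1024  # must match "numDimensions" in the index definition

def find_bad_records(records, expected_dims=EXPECTED_DIMS):
    """Return the indices of records whose embedding length is wrong."""
    return [
        i for i, record in enumerate(records)
        if len(record.get("embedding", [])) != expected_dims
    ]

records = [
    {"embedding": [0.0] * 1024},
    {"embedding": [0.0] * 512},   # wrong dimensionality
]
print(find_bad_records(records))  # -> [1]
```

Running a check like this before insert_many avoids discovering dimensionality mismatches only at query time.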

6.2 - Creating the Index Using MongoDB Atlas UI

You can create the index using the MongoDB Atlas UI by following the documentation:

  1. Navigate to Your Cluster: Go to the MongoDB Atlas dashboard and select your cluster.
  2. Access Collections: Click on the "Collections" tab.
  3. Select Database and Collection: Choose the database and collection where your embeddings are stored.
  4. Create Index: Click on the "Indexes" tab and then "Create Index".
  5. Configure Index: Enter the index configuration as shown above and create the index.

6.3 - Creating the Index Programmatically

Alternatively, you can create the index programmatically using Python:

from pymongo import MongoClient
from pymongo.operations import SearchIndexModel
from config import MONGODB_URI

# Connect to MongoDB Atlas
mongo_client = MongoClient(MONGODB_URI)
db = mongo_client["mydatabase"]
collection = db["embeddings"]

# Define the index configuration
vector_index_model = SearchIndexModel(
    definition={
        "fields": [
            {
                "type": "vector",
                "path": "embedding",
                "numDimensions": 1024,
                "similarity": "cosine"
            }
        ]
    },
    name="vector_index",
    type="vectorSearch",
)

# Create the vector search index
collection.create_search_index(model=vector_index_model)
print("Vector search index created successfully.")

This script connects to your MongoDB Atlas cluster, selects the database and collection, and creates a vector search index on the embedding field. The numDimensions parameter should match the dimensionality of your embeddings, and the similarity parameter specifies the similarity metric to use (e.g., cosine).

7 - Performing Vector Search

With the vector search index in place, you can now perform vector searches to find similar video embeddings. This section will guide you through generating a query embedding, constructing a vector search query, and retrieving relevant results.

7.1 - Generating a Query Embedding

To perform a vector search, you first need a query embedding. Use the Twelve Labs Embed API to generate an embedding for your query video or text. The code below shows how to do it for text (check out our doc):

from twelvelabs import TwelveLabs
from config import TL_API_KEY

# Initialize the Twelve Labs client
tl_client = TwelveLabs(api_key=TL_API_KEY)

# Create a text embedding task for the text
embedding = tl_client.embed.create(
  engine_name="Marengo-retrieval-2.6",
  text="your-text"
)

print("Created a text embedding")
print(f" Engine: {embedding.engine_name}")
print(f" Embedding: {embedding.text_embedding.float}")

# Extract the query embedding
query_embedding = embedding.text_embedding.float
print("Query embedding generated successfully.")

7.2 - Constructing a Vector Search Query and Performing Vector Search

First, construct a vector search query using the generated query embedding. MongoDB Atlas supports vector searches through the $vectorSearch aggregation operator.

Then, execute the vector search query and retrieve the results from MongoDB Atlas:

# Perform the vector search
search_results = collection.aggregate([
    {
        "$vectorSearch": {
            "index": "vector_index",
            "path": "embedding",
            "queryVector": query_embedding,
            "numCandidates": 3,
            "limit": 2
        }
    }
])

# Process and display the search results
for result in search_results:
    print(result['embedding'])
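To see how similar each hit actually is, Atlas exposes the similarity score through {"$meta": "vectorSearchScore"} in a $project stage. A sketch of building such a pipeline (field and index names mirror the query above):

```python
def build_scored_pipeline(query_embedding, index_name="vector_index",
                          num_candidates=3, limit=2):
    """Vector search pipeline that also returns each match's score."""
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": "embedding",
                "queryVector": query_embedding,
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        # Project the similarity score alongside the document _id
        {"$project": {"score": {"$meta": "vectorSearchScore"}}},
    ]

# Usage: for doc in collection.aggregate(build_scored_pipeline(query_embedding)): ...
```

Projecting the score lets you rank, threshold, or display confidence for each result instead of returning raw embeddings.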

8 - Enhancing the Application

Once you have the basic semantic video search functionality in place, there are several ways to enhance your application to make it more robust and user-friendly:

  1. Implementing Pagination: For large result sets, implement pagination to improve the user experience and manage the volume of data returned in each query. This can be achieved by using MongoDB's skip and limit functions to fetch a subset of results.
  2. Adding Metadata Filters: Enhance search capabilities by adding filters based on video metadata such as duration, upload date, or tags. This allows users to refine their search results and find the most relevant content quickly.
  3. Optimizing Search Performance: Optimize the performance of your vector search by tuning index configurations and query parameters. Consider using MongoDB's aggregation framework to preprocess and filter data before performing vector searches.
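The pagination idea in point 1 can be sketched as $skip/$limit stages appended after $vectorSearch. Note that the inner "limit" bounds how many results the search stage emits, so it has to cover every page you intend to serve; the function below is an illustrative sketch, not a production pager:

```python
def paginated_search_pipeline(query_embedding, page, page_size,
                              index_name="vector_index", num_candidates=100):
    """Hypothetical pagination sketch: $vectorSearch, then $skip/$limit."""
    # The search stage must return enough results to reach the requested page
    total_needed = page_size * (page + 1)
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": "embedding",
                "queryVector": query_embedding,
                "numCandidates": max(num_candidates, total_needed),
                "limit": total_needed,
            }
        },
        {"$skip": page * page_size},   # drop the earlier pages
        {"$limit": page_size},         # return one page of results
    ]
```

Because each page re-runs the search over a growing result window, deep pagination gets more expensive; for very deep paging a cursor-based scheme may be preferable.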

9 - Best Practices and Considerations

To ensure the success and reliability of your semantic video search application, consider the following best practices and guidelines:

  1. Handling Large Video Collections: When dealing with large video collections, ensure that your system can scale efficiently. Use MongoDB Atlas's sharding capabilities to distribute data across multiple nodes, improving performance and reliability.
  2. Updating Embeddings for Modified Videos: If videos are updated or modified, regenerate and update their embeddings in the database to maintain the accuracy of your search results. Implement a versioning system to track changes and manage updates seamlessly.
  3. Securing API Keys and Database Connections: Protect your Twelve Labs API key and MongoDB connection string by storing them securely, such as in environment variables or a secrets management service. Regularly rotate keys and use least privilege access controls.
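For point 2, one lightweight way to detect that a video file has changed (and its stored embedding is stale) is to keep a content hash with each record. The field name "fingerprint" and these helpers are illustrative, not part of either SDK:

```python
import hashlib

def file_fingerprint(path, chunk_size=1 << 20):
    """SHA-256 of a file's bytes, read in chunks to handle large videos."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

def needs_reembedding(stored_record, path):
    """True if the file on disk no longer matches the stored fingerprint."""
    return stored_record.get("fingerprint") != file_fingerprint(path)
```

Storing the fingerprint at insert time lets a periodic job re-embed only the videos that actually changed.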

10 - Conclusion

In this guide, we have walked you through the process of building a semantic video search application using Twelve Labs Embed API and MongoDB Atlas Vector Search. By leveraging Twelve Labs' advanced multimodal embeddings and MongoDB's efficient vector search capabilities, you can unlock powerful insights from your video content. From setting up your environment and generating embeddings to creating a vector search index and performing searches, you now have a robust framework to enhance your video search workflow.

To further assist you in developing and optimizing your semantic video search application, explore the Twelve Labs Embed API documentation and the MongoDB Atlas Vector Search documentation. These resources will help you deepen your understanding and make the most of this powerful integration.

