TLDR: Learn how to create a powerful semantic video search application by combining Twelve Labs' advanced multimodal embeddings with MongoDB Atlas Vector Search. This guide walks you through setup, embedding generation, data storage, index creation, and performing vector searches to unlock valuable insights from your video content. Big thanks to the MongoDB team (Soumya Pradhan, Ashwin Gangadhar, Rafael Liou, and Gregory Maxson) for collaborating with us on this integration guide.
In today's data-driven world, effective video search is crucial. Traditional search methods struggle with the complex nature of video data, which includes visual cues, body language, spoken words, and context. Semantic video search addresses this challenge.
Using advanced foundation models, semantic video search interprets video content deeply, enabling accurate and relevant search results. By employing embeddings (numerical representations of video content), it captures rich, multifaceted information, allowing for tasks like classification, clustering, and personalized recommendations.
Twelve Labs' Embed API provides cutting-edge multimodal embeddings that capture video content interactions over time. Paired with MongoDB Atlas Vector Search, a scalable vector database, developers can create powerful semantic video search applications to gain insights from video data.
This guide explains how to build a semantic video search app using Twelve Labs Embed API and MongoDB Atlas Vector Search. You'll learn to set up your environment, generate video embeddings, store them in MongoDB, create a vector search index, and perform vector searches to retrieve relevant video content. By the end, you'll have a robust framework for enhancing video search workflows.
Before you begin building your semantic video search application using Twelve Labs Embed API and MongoDB Atlas Vector Search, ensure you have the following prerequisites in place:
- twelvelabs: for interacting with the Twelve Labs API via its Python SDK.
- pymongo: for interacting with MongoDB.
- python-dotenv: for managing environment variables (optional but recommended).
In this section, we will guide you through setting up your programming environment to interact with Twelve Labs Embed API and MongoDB Atlas.
First, install the required Python libraries. You can do this using pip:
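A one-line install should cover everything here; note that the dotenv module ships as the python-dotenv package:

```
pip install twelvelabs pymongo python-dotenv
```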
This will install the Twelve Labs SDK and the MongoDB Python driver, allowing you to interact with both services programmatically.
Next, configure your Twelve Labs API client. Create a .env file in your project directory to securely store your API key:
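For example, with a variable name of our choosing (any name works as long as your scripts read the same one):

```
TWELVE_LABS_API_KEY=your_twelve_labs_api_key
```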
Then, create a Python script (e.g., config.py) to load the API key from the .env file:
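A minimal sketch using python-dotenv:

```python
# config.py
import os

from dotenv import load_dotenv

load_dotenv()  # read key-value pairs from .env into the process environment

TWELVE_LABS_API_KEY = os.getenv("TWELVE_LABS_API_KEY")
```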
Create a Twelve Labs client in a new Python script (e.g., twelvelabs_client.py):
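Something like the following, assuming the official twelvelabs SDK client:

```python
# twelvelabs_client.py
from twelvelabs import TwelveLabs

from config import TWELVE_LABS_API_KEY

# A single client instance can be shared across the remaining scripts
twelvelabs_client = TwelveLabs(api_key=TWELVE_LABS_API_KEY)
```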
To connect to your MongoDB Atlas cluster, you need the connection string provided by MongoDB. Store this connection string in your .env file:
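The value follows the standard Atlas SRV format; substitute your own credentials and cluster host:

```
MONGODB_URI=mongodb+srv://<username>:<password>@<cluster-host>/?retryWrites=true&w=majority
```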
Then, update your config.py to load the MongoDB URI:
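One additional line alongside the API key, reusing the config.py sketch above:

```python
# config.py (continued)
MONGODB_URI = os.getenv("MONGODB_URI")
```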
Create a MongoDB client and connect to your cluster in a new Python script (e.g., mongo_client.py):
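A minimal sketch; the database and collection names (video_search, embeddings) are our examples, not anything the integration requires:

```python
# mongo_client.py
from pymongo import MongoClient

from config import MONGODB_URI

mongo_client = MongoClient(MONGODB_URI)
db = mongo_client["video_search"]    # example database name
collection = db["embeddings"]        # example collection name
```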
To verify that everything is set up correctly, run a simple script to print the names of collections in your MongoDB database:
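For example:

```python
# verify_setup.py
from mongo_client import db

# Listing collections forces a round-trip to Atlas, confirming the connection
print(db.list_collection_names())
```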
If the script runs without errors and prints the collection names, your environment is set up correctly, and you are ready to proceed with uploading and creating video embeddings.
In this section, we will guide you through the process of uploading videos to Twelve Labs and creating video embeddings using the Embed API. These embeddings will capture the rich multimodal context of your video content.
To begin, you need to upload your video to Twelve Labs and start an embedding task. Here is a sample Python script to create an embedding task:
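A sketch using the twelvelabs SDK; the model name below is an assumption (older SDK releases used an engine_name parameter instead of model_name), so verify it against the current Embed API reference:

```python
# create_embedding_task.py
from twelvelabs_client import twelvelabs_client

# Create an embedding task for a local video file (video_url works as well).
# The model name is an assumption; check the Twelve Labs docs.
task = twelvelabs_client.embed.task.create(
    model_name="Marengo-retrieval-2.7",
    video_file="path/to/your/video.mp4",
)
print(f"Created embedding task: id={task.id}")
```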
This command starts the embedding process for the uploaded video.
To ensure that the video embedding task is completed, you need to monitor its status. Use the following code to periodically check the status of the embedding task:
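A polling sketch; the status values ("ready", "failed") follow the Twelve Labs task lifecycle but are worth double-checking against the docs:

```python
import time

from twelvelabs_client import twelvelabs_client

def wait_for_task(task_id: str, interval: float = 2.0):
    """Poll the embedding task until it finishes, printing each status."""
    while True:
        task = twelvelabs_client.embed.task.retrieve(task_id)
        print(f"Task {task_id} status: {task.status}")
        if task.status in ("ready", "failed"):  # assumed terminal statuses
            return task
        time.sleep(interval)

# `task` comes from the creation script above
completed_task = wait_for_task(task.id)
```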
This function checks the status of the embedding task every 2 seconds and prints the status until the task is completed or fails.
After the video embedding task is completed, you can retrieve the embeddings and store them in MongoDB Atlas for efficient vector search and retrieval.
Use the following code to fetch the completed embeddings from Twelve Labs:
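A sketch of retrieving the segment embeddings; the attribute names (video_embedding, segments, embeddings_float) follow recent versions of the twelvelabs SDK and may differ in older releases:

```python
# Re-fetch the completed task; the embeddings are attached to it
task = twelvelabs_client.embed.task.retrieve(completed_task.id)

segments = task.video_embedding.segments  # one embedding per video segment
print(f"Retrieved {len(segments)} segment embeddings")
```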
Prepare the embedding data for insertion into MongoDB. Each embedding can be stored as a document in a MongoDB collection:
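One natural layout is a document per video segment, keeping the time offsets alongside each vector; the field names here are our choices:

```python
from mongo_client import collection

documents = [
    {
        "video_id": completed_task.id,
        "embedding": segment.embeddings_float,       # the embedding vector
        "start_offset_sec": segment.start_offset_sec,
        "end_offset_sec": segment.end_offset_sec,
        "embedding_scope": segment.embedding_scope,
    }
    for segment in segments
]
collection.insert_many(documents)
```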
To verify that the embeddings have been stored correctly, you can query the MongoDB collection and print some sample documents:
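For example, printing a few documents while excluding the long embedding arrays:

```python
from mongo_client import collection

for doc in collection.find({}, {"embedding": 0}).limit(5):
    print(doc)
```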
If the documents print as expected, the embeddings are stored correctly in MongoDB Atlas and are ready to be indexed for vector search.
To enable efficient retrieval of video embeddings, you need to create a vector search index in MongoDB Atlas. This index will allow you to perform similarity searches on the stored embeddings.
First, define the configuration for your vector search index. MongoDB Atlas requires specific fields to be indexed for vector search. Here's an example configuration:
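This follows the Atlas vectorSearch index definition format; the dimension count matches the 1024-dimensional Marengo embeddings stored above:

```python
index_definition = {
    "fields": [
        {
            "type": "vector",
            "path": "embedding",     # the document field holding the vector
            "numDimensions": 1024,   # must match the embedding dimensionality
            "similarity": "cosine",
        }
    ]
}
```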
This configuration specifies that the embedding field will be indexed for vector search, with a vector dimension of 1024 and cosine similarity.
You can create the index using the MongoDB Atlas UI by following the MongoDB Atlas documentation.
Alternatively, you can create the index programmatically using Python:
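A sketch using pymongo's SearchIndexModel (available in pymongo 4.6+); the index name vector_index is our choice and must match the name used in queries later:

```python
from pymongo.operations import SearchIndexModel

from mongo_client import collection

search_index_model = SearchIndexModel(
    definition=index_definition,
    name="vector_index",   # example name; reuse it in $vectorSearch queries
    type="vectorSearch",
)
collection.create_search_index(model=search_index_model)
```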
This script connects to your MongoDB Atlas cluster, selects the database and collection, and creates a vector search index on the embedding field. The numDimensions parameter should match the dimensionality of your embeddings, and the similarity parameter specifies the similarity metric to use (e.g., cosine).
With the vector search index in place, you can now perform vector searches to find similar video embeddings. This section will guide you through generating a query embedding, constructing a vector search query, and retrieving relevant results.
To perform a vector search, you first need a query embedding. Use the Twelve Labs Embed API to generate an embedding for your query video or text. The code below shows how to do it for text (check out our doc):
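A sketch of a text query embedding; as with the task creation above, the method and model names follow recent twelvelabs SDK versions and should be verified against the Embed API docs:

```python
from twelvelabs_client import twelvelabs_client

# Embed the search text with the same model family used for the videos,
# so the query vector lives in the same embedding space as the stored ones
res = twelvelabs_client.embed.create(
    model_name="Marengo-retrieval-2.7",   # assumed model name
    text="people having a picnic on the beach",
)
query_embedding = res.text_embedding.segments[0].embeddings_float
```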
First, construct a vector search query using the generated query embedding. MongoDB Atlas supports vector searches through the $vectorSearch operator.
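An example aggregation pipeline; numCandidates and limit are tuning knobs (more candidates improves recall at some latency cost):

```python
pipeline = [
    {
        "$vectorSearch": {
            "index": "vector_index",        # the index created earlier
            "path": "embedding",
            "queryVector": query_embedding,
            "numCandidates": 100,           # candidates scored before ranking
            "limit": 5,                     # results returned
        }
    },
    {
        "$project": {
            "video_id": 1,
            "start_offset_sec": 1,
            "end_offset_sec": 1,
            "score": {"$meta": "vectorSearchScore"},
        }
    },
]
```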
Then, execute the vector search query and retrieve the results from MongoDB Atlas:
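For example:

```python
from mongo_client import collection

for doc in collection.aggregate(pipeline):
    print(doc)
```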
Once you have the basic semantic video search functionality in place, there are several ways to enhance your application to make it more robust and user-friendly:
To ensure the success and reliability of your semantic video search application, consider the following best practices and guidelines:
In this guide, we have walked you through the process of building a semantic video search application using Twelve Labs Embed API and MongoDB Atlas Vector Search. By leveraging Twelve Labs' advanced multimodal embeddings and MongoDB's efficient vector search capabilities, you can unlock powerful insights from your video content. From setting up your environment and generating embeddings to creating a vector search index and performing searches, you now have a robust framework to enhance your video search workflow.
To further assist you in developing and optimizing your semantic video search application, here are some valuable resources:
These resources will help you deepen your understanding and make the most out of the powerful integration between Twelve Labs and MongoDB.