Author
Meeran Kim
Date Published
August 21, 2024
Tags
Search API
Image-to-Video Search
API Tutorial
Applications
Developers

Introduction

Have you ever wanted to pinpoint specific color shades in a video, perhaps to find a product or a particular moment that features your favorite hues? Recently, I received a personal color consulting service and discovered that berry shades suit me best.

As I combed through my collection of archived YouTube videos, I wished there was an easy way to identify products in those exact shades. Fortunately, with the power of Twelve Labs' image-to-video search technology, I was able to create an app that does just that.

In this tutorial, I'll walk you through how I built this "Shade Finder" app using the Twelve Labs API. Whether you're looking to find the perfect berry-toned lipstick or just curious about spotting specific colors in your videos, this guide will help you leverage cutting-edge AI to do so effortlessly. Let’s dive in!

📌 Check out the Demo!

Prerequisites

  • Visit the Twelve Labs Playground, sign up, and generate your API key.
  • Next, create an index and upload videos into this index. Once that's done, you're ready to dive into video searching! 
  • The app was built with JavaScript and Node.js.
  • The repository containing all the files for this app is available on GitHub.

Table of Contents

The structure of the app is straightforward and easy to follow. At a high level, it consists of three main components: index.html, script.js, and server.js.

We'll begin with a quick overview of index.html, then dive into the flow for both the server and client sides, covering how to get videos, retrieve a single video, perform image-based searches, and paginate search results using a page token.

HTML

The index.html file acts as the skeleton of the app, providing the basic structure and layout. The server.js file manages all calls to the Twelve Labs API through the SDK and returns the relevant data to the client. The script.js file holds the client-side logic: it handles user interactions, makes requests to the server, and displays the search results.

Below is the body of index.html, which lays out the core components of the app: 

  • An image carousel for displaying query images
  • A search button to initiate queries
  • A video list that presents videos from a given index
  • A search results section that displays the results after the search button is clicked

index.html


<body>
<h1 class="text-3xl text-center m-5 p-3"><i class="fa-solid fa-palette"></i> Shade Finder</h1>


 <div class="m-5 p-3">
   <p class="text-center m-5" id="color-label"></p>
   <div class="flex justify-center gap-5">
     <button id="prev"><i class="fa-solid fa-chevron-left"></i></button>
     <div class="size-40"><img id="carousel-image"></div>
     <button id="next"><i class="fa-solid fa-chevron-right"></i></button>
   </div>
   <div class="flex justify-center m-5 gap-2">
     <button id="search" class="bg-lime-400 py-2.5 px-3">Search</button>
   </div>
 </div>


 <div id="video-list-container" class="container max-w-5xl mx-auto py-4">
 <div id="video-list-loading" class="container max-w-5xl mx-auto py-4"> </div>
 <div id="video-list" class="grid grid-cols-1 sm:grid-cols-2 md:grid-cols-3 lg:grid-cols-4 gap-2 justify-items-center"></div>
 <div id="video-list-pagination" class="flex justify-center m-5 gap-2"></div>
 </div>


 <div id="search-result-container" class="container w-5/6 mx-auto py-4 hidden">
 <div id="search-result-list" class="grid grid-cols-1 md:grid-cols-4 justify-center"></div>
 </div>


 <script src="./script.js"></script>
</body>

Server

server.js is the file that manages all the API calls to the Twelve Labs API. It exposes four routes: get/paginate videos, get a single video, image-to-video search, and get search results by page token.

Four Requests for Twelve Labs API

💡 Twelve Labs provides SDKs that enable you to integrate and utilize the platform within your application. In this app, the JavaScript SDK (version 0.2.5) was used.

Set-ups

1 - Store your Twelve Labs API Key and Index Id in .env

Inside the backend folder, you will find a .env file with the keys commented out. Uncomment them and update the values. 

.env

TWELVE_LABS_API_KEY=<YOUR API KEY>
TWELVE_LABS_INDEX_ID=<YOUR_INDEX_ID>

2 - Install and Import Twelve Labs SDK 

First, install the twelvelabs-js package.

yarn add twelvelabs-js # or npm i twelvelabs-js

Then, import the required packages into your application. I’ve imported them in server.js (built with Node.js) as shown below.

server.js (line 7 - 9)

const fs = require("fs");
const path = require("path");
const { TwelveLabs } = require("twelvelabs-js");

Finally, you can instantiate the SDK client with your API key.  

server.js (line 23)

const client = new TwelveLabs({ apiKey: API_KEY });

See the full picture below. 

server.js (line 1 - 25)

"use strict";


const express = require("express");
const dotenv = require("dotenv");
const cors = require("cors");
const asyncHandler = require("express-async-handler");
const fs = require("fs");
const path = require("path");
const { TwelveLabs } = require("twelvelabs-js");


dotenv.config();


const app = express();


app.use(express.json());
app.use(cors());
app.use(express.static(path.join(__dirname, "../frontend/public")));


const PORT = 5001;
const API_KEY = process.env.TWELVE_LABS_API_KEY;
const INDEX_ID = process.env.TWELVE_LABS_INDEX_ID;


const client = new TwelveLabs({ apiKey: API_KEY });


app.listen(PORT, () => console.log(`Server running on port ${PORT}`));

Request 1. Get/Paginate Videos

To retrieve videos by page, you can use client.index.video.listPagination and pass in the index id along with the desired page. Optionally, you can include the pageLimit parameter to control how many videos are returned per page. 

💡 Tip: Parameters available in the API documentation (for all requests) can be used within the JavaScript SDK by converting them to camelCase.
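
To make the naming rule concrete, here is a minimal sketch of the conversion. The helpers toCamelCase and camelizeKeys are hypothetical names I made up for illustration; they are not part of the SDK, and the app itself simply writes the camelCase names by hand.

```javascript
// Hypothetical helpers (not part of twelvelabs-js): they only illustrate
// how a snake_case parameter name maps to its camelCase form.
function toCamelCase(name) {
  // Replace each "_x" with "X": page_limit -> pageLimit
  return name.replace(/_([a-z0-9])/g, (_, ch) => ch.toUpperCase());
}

function camelizeKeys(params) {
  // Rename every key of a flat parameter object; values are left untouched.
  return Object.fromEntries(
    Object.entries(params).map(([key, value]) => [toCamelCase(key), value])
  );
}

console.log(camelizeKeys({ page_limit: 12, adjust_confidence_level: 0.5 }));
// { pageLimit: 12, adjustConfidenceLevel: 0.5 }
```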

After receiving the videos, I extract the id and metadata for later use and return them along with the pageInfo.

💡 Check out the Guides for details on paginating videos. The official example code is super helpful as well!

server.js (line 34 - 55)

/** Get videos */
app.get(
 "/videos",
 asyncHandler(async (req, res, next) => {
   const { page_limit, page } = req.query;


   const videosResponse = await client.index.video.listPagination(INDEX_ID, {
     pageLimit: page_limit,
     page: page,
   });


   const videos = videosResponse.data.map((video) => ({
     id: video.id,
     metadata: video.metadata,
   }));


   res.json({
     videos,
     page_info: videosResponse.pageInfo,
   });
 })
);

Videos Response


videosResponse= VideoListWithPagination {  
  ...,
  data: [
    Video {
      _resource: [Video],
      _indexId: '...',
      id: '...',
      metadata: [Object],
      hls: undefined,
      source: undefined,
      indexedAt: '2024-06-27T05:11:29Z',
      createdAt: '2024-06-27T05:01:35Z',
      updatedAt: '2024-06-27T05:01:52Z'
    },
    ...
  ],
  pageInfo: {
    page: 1,
    limitPerPage: 12,
    totalPage: 3,
    totalResults: 29,
    totalDuration: 19122
  }
}
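
One detail worth noting about the /videos route above: Express delivers query-string values as strings, so page and page_limit reach the SDK call as text. This post does not confirm whether the SDK converts them, so here is a minimal sketch of parsing them first. The helper names and the default of 12 items per page are my own assumptions, not code from the repository.

```javascript
// Hypothetical helpers, not part of the original server.js.
function parsePositiveInt(value, fallback) {
  // parseInt returns NaN for undefined or non-numeric input.
  const parsed = Number.parseInt(value, 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

function parsePaging(query) {
  return {
    page: parsePositiveInt(query.page, 1),
    // 12 is an assumed default, chosen to match the sample response above.
    pageLimit: parsePositiveInt(query.page_limit, 12),
  };
}

console.log(parsePaging({ page: "2", page_limit: "12" })); // { page: 2, pageLimit: 12 }
console.log(parsePaging({})); // { page: 1, pageLimit: 12 }
```

Inside the route, the result of parsePaging(req.query) could then be passed as the second argument to listPagination.
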

Request 2. Get Video

Similar to getting/paginating videos, you can get details of a single video using client.index.video.retrieve by passing in index id and video id. 

After receiving the video, I extract only the necessary information - metadata, hls, and source - and return them. Specifically, I later use the video title from the metadata, thumbnailUrls from the HLS, and the url from the source.

server.js (line 57 - 71)

/** Get a video of an index */
app.get(
 "/videos/:videoId",
 asyncHandler(async (req, res, next) => {
   const { videoId } = req.params;


   const videoResponse = await client.index.video.retrieve(INDEX_ID, videoId);


   res.json({
     metadata: videoResponse.metadata,
     hls: videoResponse.hls,
     source: videoResponse.source,
   });
 })
);

Video Response


videoResponse= Video {
  ...,
  id: '...',
  metadata: {
    duration: 54,
    engine_ids: [ 'marengo2.6', 'pegasus1.1' ],
    filename: 'tirtir korean cushion review',
    fps: 30,
    height: 1280,
    size: 9601300,
    video_title: 'tirtir korean cushion review',
    width: 720
  },
  hls: {
    videoUrl: '... .m3u8',
    thumbnailUrls: ['... .jpg'],
    status: 'COMPLETE',
    updatedAt: '2024-05-22T02:49:49.074Z'
  },
  source: {
    type: 'youtube',
    name: 'theoliviasaurusrex',
    url: 'https://www.youtube.com/watch?v=tOabvdtTa-U'
  },
  indexedAt: '2024-05-22T03:03:53Z',
  createdAt: '2024-05-22T02:49:28Z',
  updatedAt: '2024-05-22T02:49:36Z'
}

Request 3. Search (Image to Video)

Now comes the exciting part! You can perform an image-to-video search with the client.search.query method by passing in four required parameters: indexId, queryMediaFile, queryMediaType, and options.

Passing in queryMediaFile correctly takes a few steps:

  • Path Construction: First, construct the full path to the image file. In this app, all the images are already stored in the images folder, so the path is built from the current directory (__dirname), the relative path ../frontend/public/images, and the image filename (imageSrc).
  • Existence Check: After constructing the path, check whether the image file exists at that location. If it does not, a 404 error response is returned to the client.
  • Read Stream Creation: If the file exists, a readable stream is created from the image file and passed to the Twelve Labs API as queryMediaFile.

In this app, I also included optional parameters such as threshold, pageLimit, and adjustConfidenceLevel. You can find the full list of parameters in the API documentation.

After receiving the search response, I extract the data and pageInfo for later use and return them to the client.

💡 Be sure to check out the Guides for more details on image query search. The official example code is very helpful as well!

server.js (line 73 - 105)

/** Search videos based on an image query */
app.get(
 "/search",
 asyncHandler(async (req, res, next) => {
   const { imageSrc, threshold, pageLimit, adjustConfidenceLevel } = req.query;


   const imagePath = path.join(
     __dirname,
     "../frontend/public/images",
     imageSrc
   );


   if (!fs.existsSync(imagePath)) {
     console.error("Image not found at path:", imagePath);
     return res.status(404).json({ error: "Image not found" });
   }


   const searchResponse = await client.search.query({
     indexId: INDEX_ID,
     queryMediaFile: fs.createReadStream(imagePath),
     queryMediaType: "image",
     options: ["visual"],
     threshold: threshold,
     pageLimit: pageLimit,
     adjustConfidenceLevel: adjustConfidenceLevel,
   });


   res.json({
     searchResults: searchResponse.data,
     pageInfo: searchResponse.pageInfo,
   });
 })
);

Search Response


searchResponse= SearchResult {
  ...,
  pool: {
    totalCount: 29,
    totalDuration: 19122,
    indexId: '...'
  },
  data: [
    {
      score: 84.45,
      start: 379.13333333341933,
      end: 381,
      metadata: [Array],
      videoId: '...',
      confidence: 'high',
      thumbnailUrl: '...'
    },
    ...
  ],
  pageInfo: {
    limitPerPage: 12,
    totalResults: 20,
    pageExpiredAt: '2024-08-15T04:09:46Z',
    nextPageToken: '...' // present only when there is another page of results
  }
  }
}

Request 4. Search by Page Token

Searching by page token is pretty straightforward. You can use client.search.byPageToken, passing in the pageToken you obtained from the previous request (Request 3). The response looks the same as what you received from the initial search request (Request 3). 

server.js (line 107 - 120)

/** Get search results of a specific page */
app.get(
 "/search/:pageToken",
 asyncHandler(async (req, res, next) => {
   const { pageToken } = req.params;


   let searchByPageResponse = await client.search.byPageToken(`${pageToken}`);


   res.json({
     searchResults: searchByPageResponse.data,
     pageInfo: searchByPageResponse.pageInfo,
   });
 })
);
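
To illustrate how the token chains pages together, here is a minimal sketch that keeps following nextPageToken until it is absent. The app itself loads one page at a time behind a "Show More" button, so collectSearchResults is a hypothetical helper; fetchPage stands in for a call such as client.search.byPageToken and is stubbed with made-up data so the example runs without the API.

```javascript
// Hypothetical helper: gather results across pages by following nextPageToken.
// maxPages is an assumed safety cap so the loop always terminates.
async function collectSearchResults(firstPage, fetchPage, maxPages = 10) {
  const results = [...firstPage.data];
  let token = firstPage.pageInfo.nextPageToken;
  let pagesFetched = 1;

  while (token && pagesFetched < maxPages) {
    const page = await fetchPage(token);
    results.push(...page.data);
    token = page.pageInfo.nextPageToken; // undefined on the last page
    pagesFetched += 1;
  }

  return results;
}

// Stubbed pages (made-up data) in place of real API responses.
const stubPages = {
  tokenA: { data: [{ videoId: "v2" }], pageInfo: { nextPageToken: "tokenB" } },
  tokenB: { data: [{ videoId: "v3" }], pageInfo: {} },
};
const firstPage = { data: [{ videoId: "v1" }], pageInfo: { nextPageToken: "tokenA" } };

collectSearchResults(firstPage, async (token) => stubPages[token]).then((all) =>
  console.log(all.map((result) => result.videoId))
);
```

Since the search response includes pageExpiredAt, a page token presumably stops working after that time, so a loop like this is best run soon after the initial search.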

Client

With the server fully set up, including all the necessary API endpoints, we can now focus on the client-side code. This part of the application is responsible for making requests to the server and handling the data received.

Following the flow established on the server side, we’ll first explore how the app initially displays videos from an index. Afterward, we’ll dive into the functionality for searching videos and paginating the search results.

1 - Showing videos of an index
showVideos function

One of the first functions that runs when the page renders is showVideos.

script.js (line 434 - 460)

async function showVideos(page = 1) {
 videoList.innerHTML = "";


 ...


 try {
   const { videosDetail, pageInfo } = await getVideoOfVideos(page);


   videoListLoading.removeChild(loadingSpinnerContainer);


   if (videosDetail) {
     videosDetail.forEach((video) => {
       const videoContainer = createVideoContainer(video);
       videoList.appendChild(videoContainer);
     });


     videoListLoading.classList.remove("min-h-[300px]");


     createPaginationButtons(pageInfo, page);
   }
 } catch (error) {
   console.error("Error fetching videos:", error);
 }
}

Here is what showVideos does:

  • Clears the existing video list in the DOM.
  • Calls getVideoOfVideos, which retrieves the details of every video on the given page.
  • Loops through the video details to create and append a container for each video.
  • Sets up the pagination buttons based on pageInfo.

getVideoOfVideos Function

The getVideoOfVideos function is responsible for fetching videos of a particular page and then obtaining the details of each video.

script.js (line 523 - 533)

async function getVideoOfVideos(page = 1) {
 const videosResponse = await getVideos(page);


 if (videosResponse.videos.length > 0) {
   const videosDetail = await Promise.all(
     videosResponse.videos.map((video) => getVideo(video.id))
   );


   return { videosDetail, pageInfo: videosResponse.page_info };
 }
}

  • getVideos makes a request to the server to fetch videos for the specified page (as implemented in Request 1 of the Server section).
  • If videos are found, the function loops over each video, calling getVideo to fetch the details (implemented in Request 2 of the Server section).
  • The details are then cached using a simple caching mechanism to optimize subsequent requests.
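
The caching code is not shown in this post, so the following is only a sketch of what a simple cache could look like, assuming a Map keyed by video id. getVideoCached and the stubbed fetchVideo are hypothetical names; in the app, the fetcher would be something like getVideo.

```javascript
// Hypothetical cache: stores the pending promise per video id so that
// repeated or concurrent lookups for the same id share a single request.
const videoCache = new Map();

function getVideoCached(videoId, fetchVideo) {
  if (!videoCache.has(videoId)) {
    const pending = fetchVideo(videoId).catch((error) => {
      videoCache.delete(videoId); // do not keep failed lookups
      throw error;
    });
    videoCache.set(videoId, pending);
  }
  return videoCache.get(videoId);
}

// Stubbed fetcher (made-up data) that counts how often it is called.
let calls = 0;
const fetchVideoStub = async (id) => {
  calls += 1;
  return { id, metadata: { video_title: "sample" } };
};

Promise.all([
  getVideoCached("v1", fetchVideoStub),
  getVideoCached("v1", fetchVideoStub),
]).then(() => console.log("fetch calls:", calls)); // fetch calls: 1
```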

Creating Video Containers

Once showVideos has the details for each video, it creates a video container based on these details, including the video URL, thumbnail URL, and video title.
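
createVideoContainer is not shown here, so as a rough sketch, this is how the three fields could be pulled out of the /videos/:videoId response before building the DOM nodes. toVideoCard and its fallback values are my own assumptions; the field names come from the Video Response shown in the Server section.

```javascript
// Hypothetical helper: pick the fields a video card needs.
// Optional chaining guards against hls or source being undefined.
function toVideoCard(video) {
  return {
    title:
      video.metadata?.video_title ?? video.metadata?.filename ?? "Untitled video",
    thumbnailUrl: video.hls?.thumbnailUrls?.[0] ?? null,
    url: video.source?.url ?? video.hls?.videoUrl ?? null,
  };
}

// Made-up sample shaped like the Video Response in the Server section.
const sampleVideo = {
  metadata: { video_title: "tirtir korean cushion review" },
  hls: { videoUrl: "stream.m3u8", thumbnailUrls: ["thumb.jpg"] },
  source: { type: "youtube", url: "https://www.youtube.com/watch?v=tOabvdtTa-U" },
};

console.log(toVideoCard(sampleVideo));
```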

Pagination Buttons

Finally, pagination buttons are created based on the total number of pages obtained from getVideos. Each button is set up with an event listener that calls showVideos for the respective page.

script.js (line 498 - 521)

function createPageButton(pageNumber, currentPage) {
 const pageButton = document.createElement("button");
 pageButton.textContent = pageNumber;
  
  ...


 if (pageNumber === currentPage) {
   pageButton.classList.add("bg-slate-200", "font-medium");
   pageButton.disabled = true;
 } else {
   pageButton.classList.add("bg-transparent");
   pageButton.addEventListener("click", () => showVideos(pageNumber));
 }


 return pageButton;
}
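
createPaginationButtons, which calls createPageButton for each page, is not shown above. As a rough sketch of the logic it needs, the hypothetical helper below turns pageInfo into a list of page numbers and marks the current one. It assumes the totalPage field seen in the Videos Response.

```javascript
// Hypothetical helper: derive one entry per page from pageInfo.totalPage.
function buildPageButtons(pageInfo, currentPage) {
  // Fall back to a single page when totalPage is missing or invalid.
  const total = Math.max(1, Number(pageInfo.totalPage) || 1);
  return Array.from({ length: total }, (_, index) => ({
    pageNumber: index + 1,
    isCurrent: index + 1 === currentPage,
  }));
}

console.log(buildPageButtons({ totalPage: 3 }, 2));
// [
//   { pageNumber: 1, isCurrent: false },
//   { pageNumber: 2, isCurrent: true },
//   { pageNumber: 3, isCurrent: false }
// ]
```

Each entry could then be handed to createPageButton(pageNumber, currentPage) and appended to the pagination container.
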

2 - Search videos by image

Once a user clicks the “Search” button, the handleSearchButtonClick function runs and sends the search request to the server. Let’s take a look at how it works step by step.

script.js (line 62 - 89)

async function handleSearchButtonClick() {
 toggleSearchButton(false);


 nextPageToken = null;
 searchResultContainer.innerHTML = "";
 videoListContainer.classList.add("hidden");
 searchResultContainer.classList.remove("hidden");
 searchResultList.innerHTML = "";


 const loadingSpinnerContainer = createLoadingSpinner();
 searchResultContainer.appendChild(loadingSpinnerContainer);


 try {
   const { searchResults } = await searchByImage();


   searchResultContainer.removeChild(loadingSpinnerContainer);


   if (searchResults.length > 0) {
     showSearchResults(searchResults);
   } else {
     displayNoResultsMessage();
   }
 } catch (error) {
   console.error("Error fetching search results:", error);
 } finally {
   toggleSearchButton(true);
 }
}

  • First, the search button is toggled to false to disable it during the search process.
  • Next, the function resets nextPageToken to null and clears any existing content in the searchResultContainer and searchResultList. It also hides the videoListContainer and ensures the searchResultContainer is visible.
  • A loading spinner is then created and displayed to indicate that the search is in progress.
  • The search is executed by calling searchByImage, which makes the search request to the server.
  • After the search is complete, the loading spinner is removed, and the search results are displayed using showSearchResults, or a message is shown if no results are found.
  • Finally, the search button is toggled back to true to re-enable it, allowing the user to perform another search if desired.
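
searchByImage itself is not shown in this post. As a sketch of the request it has to make, the hypothetical helper below builds the URL for the /search route using the query parameter names that server.js reads. The server address uses port 5001 from server.js, and berry.png is a made-up filename.

```javascript
// Hypothetical helper: build the URL for the server's /search route.
// Parameters that are undefined or null are left out of the query string.
function buildSearchUrl(serverUrl, options) {
  const { imageSrc, threshold, pageLimit, adjustConfidenceLevel } = options;
  const url = new URL("/search", serverUrl);
  const params = { imageSrc, threshold, pageLimit, adjustConfidenceLevel };

  for (const [key, value] of Object.entries(params)) {
    if (value !== undefined && value !== null) {
      url.searchParams.set(key, String(value));
    }
  }
  return url.toString();
}

console.log(
  buildSearchUrl("http://localhost:5001", { imageSrc: "berry.png", pageLimit: 12 })
);
// http://localhost:5001/search?imageSrc=berry.png&pageLimit=12
```

The resulting URL could be passed to fetch, with the JSON body providing the searchResults and pageInfo that the server returns.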

3 - Show Search Results by Page Token

If there is more than one page of search results (i.e., if there is a nextPageToken), the createShowMoreButton function will execute to show the user a “Show More” button. This button retrieves and displays the next page of search results. Let's break down how it works, step by step.

script.js (line 296 - 319)

function createShowMoreButton() {
 removeExistingButton();


 const showMoreButtonContainer = document.createElement("div");
 showMoreButtonContainer.id = "show-more-button";
 showMoreButtonContainer.classList.add("flex", "justify-center", "my-4");


 const showMoreButton = document.createElement("button");
 showMoreButton.innerHTML = ' Show More';


 showMoreButton.addEventListener("click", async () => {
   const loadingSpinnerContainer = createLoadingSpinner();
   searchResultContainer.appendChild(loadingSpinnerContainer);


   const nextPageResults = await getNextSearchResults(nextPageToken);


   searchResultContainer.removeChild(loadingSpinnerContainer);


   showSearchResults(nextPageResults.searchResults);
   nextPageToken = nextPageResults.pageInfo.nextPageToken || null;
 });
 showMoreButtonContainer.appendChild(showMoreButton);
 searchResultContainer.appendChild(showMoreButtonContainer);
}

  • First, any existing "Show More" button is removed to prevent duplicate buttons.
  • Next, the function creates a container and the "Show More" button to be displayed.
  • An event listener is added to the button to handle clicks: it shows a loading spinner, retrieves the next page of search results, removes the spinner, displays the new results, and updates the nextPageToken.
  • Finally, the "Show More" button and its container are appended to the searchResultContainer, making it visible to the user.

Conclusion

I hope this post has offered insights into Twelve Labs' recent image-to-video search API and its practical applications. Thank you for reading, and I look forward to seeing how you leverage these advancements in your own projects!
