Author
Meeran Kim
Date Published
October 2, 2024
Tags
API Tutorial
Applications
Developers
Image-to-Video Search
Search API
Semantic Search

Introduction

Meet Crop and Seek, the advanced video search tool that redefines how you discover visual content. Whether you're searching by text or image, Crop and Seek delivers instant results. But what sets it apart is its image cropping feature, allowing you to refine your search in real time.

For image-based queries, upload an image from your device or enter a public image URL. Instantly, the platform retrieves video clips matching the image. Your query image will appear at the top-left corner, where you can click to crop any part of it. Want to focus on a specific detail? Crop and Seek lets you search deeper, finding new results based on your selected area.

Prefer to search with text? Enter any query in the text search field, and Crop and Seek returns relevant video content.

⭐️ Check out the Demo!

Prerequisites

  • Visit the Twelve Labs Playground, sign up, and generate your API key.
  • Create an index and upload videos into the index. (❗️Make sure you check “Logo” and “Text in Video” under “More options” to maximize the search capabilities!)
  • The app was built with React and Next.js. 
  • The repository containing all the files for this app is available on GitHub.

Structure of the Application

The Crop and Seek app is built with a server and client architecture. 

On the server side, Next.js handles routing seamlessly, with all routes neatly organized under the app/api folder. Out of the six routes, all but the proxy-image route directly interact with the Twelve Labs API to power the platform’s core functionality.

On the client side, the app is composed of multiple components, which I’ll outline in more detail later. At a high level, there are three key components: SearchBar, Videos, and SearchResults. The SearchBar manages both image and text-based queries, integrating these features into a cohesive search experience.

Since image search is the core feature of Crop and Seek, this tutorial will dive deep into how the image search functionality works, guiding you through its key elements and practical applications.

‍

Server

Twelve Labs API Requests

There are 5 routes that make requests to the Twelve Labs API.

In this tutorial, I will go over how to build the imgSearch route. Because a user can submit an image in two ways (by file upload or by public URL), the server needs to handle both.

imgSearch/route.js

import { NextResponse } from "next/server";
import axios from "axios";
import FormData from "form-data";


export const runtime = "nodejs";


export async function POST(request) {
 try {
   const formData = await request.formData();
   const apiKey = process.env.TWELVELABS_API_KEY;
   const indexId = process.env.TWELVELABS_INDEX_ID;


   if (!apiKey || !indexId) {
     return NextResponse.json(
       { error: "API key or Index ID is not set" },
       { status: 500 }
     );
   }


   const searchDataForm = new FormData();
   searchDataForm.append("search_options", "visual");
   searchDataForm.append("adjust_confidence_level", "0.55");
   searchDataForm.append("group_by", "clip");
   searchDataForm.append("threshold", "medium");
   searchDataForm.append("sort_option", "score");
   searchDataForm.append("page_limit", "12");
   searchDataForm.append("index_id", indexId);
   searchDataForm.append("query_media_type", "image");


   const imgQuery = formData.get("query");
   const imgFile = formData.get("file");


   if (imgQuery) {
     searchDataForm.append("query_media_url", imgQuery);
   } else if (imgFile && imgFile instanceof Blob) {
     const buffer = Buffer.from(await imgFile.arrayBuffer());
     searchDataForm.append("query_media_file", buffer, imgFile.name);
   } else {
     return NextResponse.json(
       { error: "No query or file provided" },
       { status: 400 }
     );
   }


   const formDataHeaders = searchDataForm.getHeaders();
   const url = "https://api.twelvelabs.io/v1.2/search-v2";


   const response = await axios.post(url, searchDataForm, {
     headers: {
       ...formDataHeaders,
       accept: "application/json",
       "x-api-key": `${apiKey}`,
     },
   });


   const imageResult = response.data;


   if (!imageResult || !imageResult.data) {
     return NextResponse.json(
       { error: "Error getting response from the API" },
       { status: 500 }
     );
   }


   const searchData = imageResult.data;
   const pageInfo = imageResult.page_info || {};


   return NextResponse.json({ pageInfo, searchData });
 } catch (error) {
   console.error("Error in POST handler:", error?.response?.data || error);
   const status = error?.response?.status || 500;
   const message = error?.response?.data?.message || error.message;


   return NextResponse.json({ error: message }, { status });
 }
}

‍

Proxy-image server

You might wonder why we need a proxy-image server. It plays three roles in our application:

  1. CORS Handling: It allows cross-origin requests, which is essential for fetching and displaying images from various sources on the client side. Without this proxy server, the image cannot be fetched and shown in the image cropping area due to CORS (Cross-Origin Resource Sharing) errors.
  2. Image Hosting: The proxy server temporarily hosts the image, creating a trusted source from which our application can safely load the image content.
  3. Security: By routing image requests through our own server, we add a layer of security, preventing direct exposure of external image sources to the client.

Here's how the proxy-image server works:

  1. When a user provides an image URL, the client sends this to the proxy-image server.
  2. The server fetches the image from the original source.
  3. It then serves this image from its own domain, effectively bypassing CORS restrictions.
  4. The client can now safely load and display the image for cropping and further processing.
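
The tutorial does not list the proxy-image route itself, so here is a minimal sketch of how the four steps above could look. It is not the repository's code: the `url` query parameter, the helper names (`proxyImage`, `jsonError`, `toProxiedSrc`), and the status codes are all assumptions chosen for illustration. It relies only on the standard `fetch`, `URL`, and `Response` globals.

```javascript
// Hypothetical proxy-image handler. The "url" query parameter, the status
// codes, and the function names are assumptions for illustration only.
function jsonError(message, status) {
  return new Response(JSON.stringify({ error: message }), {
    status,
    headers: { "content-type": "application/json" },
  });
}

async function proxyImage(request, fetchImpl = fetch) {
  // Step 1: read the target image URL sent by the client
  const target = new URL(request.url).searchParams.get("url");
  if (!target) return jsonError("Missing url parameter", 400);

  let parsed;
  try {
    parsed = new URL(target);
  } catch {
    return jsonError("Invalid url parameter", 400);
  }
  // Only proxy plain web URLs
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    return jsonError("Unsupported protocol", 400);
  }

  // Step 2: fetch the image from the original source
  const upstream = await fetchImpl(parsed.href);
  if (!upstream.ok) return jsonError("Failed to fetch image", 502);

  const contentType = upstream.headers.get("content-type") || "";
  if (!contentType.startsWith("image/")) {
    return jsonError("URL does not point to an image", 415);
  }

  // Step 3: serve the same bytes from our own origin
  return new Response(await upstream.arrayBuffer(), {
    status: 200,
    headers: { "content-type": contentType },
  });
}

// Step 4 (client side): build the same-origin URL to use as the image src
function toProxiedSrc(imageUrl) {
  return `/api/proxy-image?url=${encodeURIComponent(imageUrl)}`;
}
```

In a Next.js route file, a function like this would typically be exposed with `export async function GET(request) { return proxyImage(request); }`. Checking the protocol and the upstream content type keeps the route from relaying arbitrary non-image content.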

‍

Frontend

At a high level, the frontend of Crop and Seek is straightforward. It features a persistent SearchBar at the top. If a user does not submit a text or image query, videos from a predefined index are displayed. When a user submits a text or image query, the search results are shown.
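
The choice between the two views can be pictured as a small function of the search state. This is only a sketch: `pickView` is a hypothetical helper, and the state names `imgQuery` and `textSearchSubmitted` are borrowed from the `handleImgSubmit` code in page.js; the real page may branch differently.

```javascript
// Hypothetical helper, not part of the repository. It mirrors the rule
// "no query: show Videos; text or image query: show SearchResults".
function pickView({ imgQuery, textSearchSubmitted }) {
  const hasQuery = Boolean(imgQuery) || Boolean(textSearchSubmitted);
  return hasQuery ? "SearchResults" : "Videos";
}

console.log(pickView({ imgQuery: null, textSearchSubmitted: false }));
console.log(pickView({ imgQuery: "https://example.com/cat.png", textSearchSubmitted: false }));
```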

Refer to the component design chart below for a comprehensive overview. This tutorial will focus on the image search functionality, detailing the initial image submission process and how cropping and subsequent searches are handled.

How the initial image search works 

The initial image search is managed by the `SearchByImageButtonAndModal` component. When a user clicks the "Search by image" button, a dialog appears containing an image drop zone and an image link input form.

Although the code may seem complex due to styling elements, the core functionality is straightforward: once an image is submitted either as a file or via a URL, it is processed by the `handleImgSubmit` function. `handleImgSubmit` sets the imgQuery state, which enables the imgSearch useQuery hook. This hook then invokes fetchImgSearchResults, which sends a search request to the server with the image data.

Receiving an image as a file 

We are using the useDropzone hook from the react-dropzone library to create a drag-and-drop area for uploading an image file from a user’s device. 

SearchByImageButtonAndModal.js (line 52 - 68)

/** Configures the drag-and-drop area for image uploads, handling accepted and rejected files */
 const { getRootProps, getInputProps, isDragAccept } = useDropzone({
   // Specify the types of image files that are allowed
   accept: acceptedImageTypes,
   // Set the maximum file size for uploads (5MB)
   maxSize: MAX_IMAGE_SIZE,
   // Ensure only one file can be uploaded at a time
   multiple: false,
   // Clear any existing error message when a new drag starts
   onDragEnter: () => {
     setErrorCode(undefined);
   },
   onDropAccepted: (files) => {
     handleImgSubmit(files[0]); // Submit the file
     closeModal(); // Close the modal
   },
   // Handle the event when an invalid file is dropped
   onDropRejected: (fileRejections) => {
     // Get the error code for why the file was rejected
     const code = fileRejections[0]?.errors?.[0]?.code;
     if (code) setErrorCode(code);
   },
 });
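
The snippet above references `acceptedImageTypes` and `MAX_IMAGE_SIZE` without showing them. A plausible sketch follows, with the values taken from the FAQ (JPG, JPEG, and PNG up to 5MB). The object form of `accept` assumes a recent react-dropzone version (v14 or later), and `errorMessageFor` is a hypothetical helper that turns the stored error code into text. The three dropzone codes it handles are the ones react-dropzone reports on rejected files; "invalid-url" is the code set by `handleImageUrl`.

```javascript
// Assumed values: the FAQ states JPG, JPEG, and PNG up to 5MB.
const MAX_IMAGE_SIZE = 5 * 1024 * 1024; // bytes

// Assumed shape: react-dropzone v14+ maps MIME types to file extensions.
const acceptedImageTypes = {
  "image/jpeg": [".jpg", ".jpeg"],
  "image/png": [".png"],
};

// Hypothetical helper: translate an error code into a user-facing message.
function errorMessageFor(code) {
  switch (code) {
    case "file-invalid-type":
      return "Only JPG, JPEG, and PNG images are supported.";
    case "file-too-large":
      return `The image must be ${MAX_IMAGE_SIZE / (1024 * 1024)}MB or smaller.`;
    case "too-many-files":
      return "Please upload one image at a time.";
    case "invalid-url":
      return "The URL you entered does not point to a valid image";
    default:
      // An unset code means there is nothing to show
      return code ? "Something went wrong. Please try again." : "";
  }
}
```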

‍

Receiving an image as a URL

The user's input (image URL) is captured via an Input field. The imageUrlFromInput state updates whenever the input value changes. 

SearchByImageButtonAndModal.js (line 187 - 207)

<Input
  className="h-10 border-r-0"
  fullWidth
  placeholder="Drop an image address (not link address)"
  icon={<InsertLink className="text-grey-600" fontSize="small" />}
  value={imageUrlFromInput}
  onSelect={(e) => {
    e.stopPropagation();
  }}
  onChange={(e) => {
    setImageUrlFromInput(e.target.value);
  }}
  onClear={() => setImageUrlFromInput("")}
  type="text"
/>
<Button
  type="button"
  appearance="primary"
  onClick={handleImageUrl}
  disabled={!imageUrlFromInput}
>

When the submit button is clicked, the handleImageUrl function trims and validates the URL, then calls the handleImgSubmit function.

SearchByImageButtonAndModal.js (line 70 - 88)

/** Validates the input URL and submits the image if valid */
 const handleImageUrl = () => {
   try {
     const trimmedUrl = imageUrlFromInput
       .trim()
       .replace(/(\.jpg|\.jpeg|\.png).*/i, "$1");
     const isImage =
       /\.(jpg|jpeg|png|gif|bmp|webp)$/i.test(trimmedUrl) ||
       /f=image|f=auto/.test(trimmedUrl);
     if (!isImage) {
       setErrorCode("invalid-url");
       return;
     }
     handleImgSubmit(trimmedUrl);
     closeModal();
   } catch (e) {
     setErrorCode("invalid-url");
   }
 };
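
To see what the two regular expressions do, the same logic can be pulled into a pure function and run on a few inputs. `normalizeImageUrl` is a hypothetical name used only for this sketch, and the example.com URLs are made up. The expressions themselves are copied from `handleImageUrl`.

```javascript
// Hypothetical extraction of the URL check in handleImageUrl.
// Returns the cleaned URL, or null when it does not look like an image.
function normalizeImageUrl(input) {
  const trimmedUrl = input
    .trim()
    // Drop anything after a .jpg, .jpeg, or .png extension (e.g. query strings)
    .replace(/(\.jpg|\.jpeg|\.png).*/i, "$1");
  const isImage =
    /\.(jpg|jpeg|png|gif|bmp|webp)$/i.test(trimmedUrl) ||
    /f=image|f=auto/.test(trimmedUrl);
  return isImage ? trimmedUrl : null;
}

// Query string after the extension is stripped
console.log(normalizeImageUrl("  https://example.com/cat.jpg?w=200 "));
// → "https://example.com/cat.jpg"

// A page ("link address") rather than an image address is rejected
console.log(normalizeImageUrl("https://example.com/gallery/cat"));
// → null

// No extension, but an image format hint in the query string is accepted
console.log(normalizeImageUrl("https://cdn.example.com/cat?f=auto"));
// → "https://cdn.example.com/cat?f=auto"
```

The second case is the one behind the "invalid-url" error: a link to a page that contains an image does not end in an image extension, so the check fails.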

‍

Handling a submitted image

handleImgSubmit first resets relevant states and extracts the image name from the source. Then it sets the new image query. When imgQuery is set, it enables the imgSearch useQuery hook to execute fetchImgSearchResults. 

page.js (line 28 - 40)

/** Set image name and query src */
 const handleImgSubmit = async (src) => {
   // Reset states
   setImgQuery(null);
   setUpdatedSearchData({ searchData: [], pageInfo: {} });
   setTextSearchSubmitted(false);

   // Extract image name from src (URL or File)
   setImgName(typeof src === "string" ? src.split("/").pop() : src.name);
   // Set new image query
   setImgQuery(src);
 };

fetchImgSearchResults sends a request to the server's /api/imgSearch route, which we discussed in the Server section above. 

SearchResults.js (line 27 - 48)

/** Sends a request to the server to fetch image search results */
 const fetchImgSearchResults = async (imagePath) => {
   // Create a new FormData object to send data to the server
   const formData = new FormData();


   if (imagePath instanceof File) {
     // If the image is a File, append it to FormData with the key "file"
     formData.append("file", imagePath);
   } else {
     // If it's not a File, append it with the key "query"
     formData.append("query", imagePath);
   }


   // Send a POST request to the server's image search API
   const response = await fetch("/api/imgSearch", {
     method: "POST",
     body: formData,
   });


   if (!response.ok) {
     const errorData = await response.json();
     throw new Error(errorData.error || "Network response was not ok");
   }


   return response.json();
 };
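
The link between `imgQuery` and this fetch function is the imgSearch useQuery hook, which the tutorial does not list. Assuming the hook comes from TanStack Query (React Query), its options could be assembled roughly as below. `imgSearchQueryOptions` and the query key layout are assumptions made for this sketch. The `queryKey`, `queryFn`, and `enabled` options are standard `useQuery` options.

```javascript
// Hypothetical options builder for the imgSearch query.
// imgQuery is either a URL string, a File, or null.
function imgSearchQueryOptions(imgQuery, fetchImgSearchResults) {
  // Build a serializable key part so a new image produces a new query
  let keyPart = null;
  if (typeof imgQuery === "string") {
    keyPart = imgQuery;
  } else if (imgQuery) {
    keyPart = `${imgQuery.name}:${imgQuery.size}:${imgQuery.lastModified}`;
  }

  return {
    queryKey: ["imgSearch", keyPart],
    queryFn: () => fetchImgSearchResults(imgQuery),
    // The query stays idle until handleImgSubmit sets an image query
    enabled: Boolean(imgQuery),
  };
}
```

A component would then call something like `useQuery(imgSearchQueryOptions(imgQuery, fetchImgSearchResults))`. Because `handleImgSubmit` first sets `imgQuery` to null, the query is disabled for a moment and then runs again with the new image.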

‍

How the image cropping and cropped image search works 

The image cropping is managed by the `ImageCropArea` component. When a user clicks the selected image preview in the search bar, a dialog opens displaying the image within the React Crop interface - an image cropping tool for React.

In ImageCropArea, ReactCrop handles user interaction for image cropping. When a user finishes cropping the image and clicks the "Search" button, onCropSearchClick is triggered.

getCroppedImage creates a cropped image using a canvas, and base64ToFile converts the result to a File object. The image query is then set to that file, which triggers the imgSearch useQuery hook. This hook invokes fetchImgSearchResults, which sends a search request to the server with the cropped image.

ImageCropArea.js (line 87 - 112)

/** Handles the cropping of an image, converts the crop to a File, and updates state with the cropped image */
 const onCropSearchClick = async () => {
   if (completedCrop && imgRef.current) {
     try {
       // Step 1: Get the cropped image as a base64 string
       const base64Image = await getCroppedImage(
         imgRef.current,
         completedCrop
       );


       // Step 2: Convert the base64 string to a File object
       const croppedImageFile = await base64ToFile(
         base64Image,
         `${imgName}-cropped`
       );


       // Step 3: Update state and close modal 
       if (croppedImageFile) {
         // Update the image query with the new cropped image File
         setImgQuery(croppedImageFile);
         // Update the image name to reflect the cropped version
         setImgName(croppedImageFile.name);
         // Close the crop modal
         closeDisplayModal();
       }
     } catch (error) {
       console.error("Error processing image:", error);
     }
   } else {
     console.warn("No completed crop or imgRef.current is null");
   }
 };
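
The two helpers called above, getCroppedImage and base64ToFile, are not shown in this tutorial. The sketch below is one way they could work, not the repository's implementation. It assumes the crop is given in pixels relative to the displayed image (as with React Crop's completed crop) and that the output is a JPEG data URL. `toNaturalCrop` is a hypothetical helper added to keep the scaling step readable.

```javascript
// Hypothetical: map a crop measured on the displayed image to the pixel
// coordinates of the full-resolution source image.
function toNaturalCrop(crop, image) {
  const scaleX = image.naturalWidth / image.width;
  const scaleY = image.naturalHeight / image.height;
  return {
    x: Math.round(crop.x * scaleX),
    y: Math.round(crop.y * scaleY),
    width: Math.round(crop.width * scaleX),
    height: Math.round(crop.height * scaleY),
  };
}

// Sketch of getCroppedImage (browser only): draw the selected region onto
// a canvas and export it as a base64 data URL. JPEG output is an assumption.
function getCroppedImage(image, crop) {
  const src = toNaturalCrop(crop, image);
  const canvas = document.createElement("canvas");
  canvas.width = src.width;
  canvas.height = src.height;
  const ctx = canvas.getContext("2d");
  ctx.drawImage(
    image,
    src.x, src.y, src.width, src.height, // region of the source image
    0, 0, src.width, src.height // destination on the canvas
  );
  return canvas.toDataURL("image/jpeg");
}

// Sketch of base64ToFile: decode the data URL and wrap the bytes in a File.
function base64ToFile(dataUrl, fileName) {
  const [header, base64] = dataUrl.split(",");
  const match = header.match(/^data:(.+?);base64$/);
  if (!match || !base64) throw new Error("Not a base64 data URL");
  const binary = atob(base64);
  const bytes = Uint8Array.from(binary, (char) => char.charCodeAt(0));
  return new File([bytes], fileName, { type: match[1] });
}
```

Both functions here are synchronous; awaiting them, as onCropSearchClick does, still works. Note that `toDataURL` throws a SecurityError if the canvas contains an image loaded from another origin without CORS approval, which is one reason the image is served through the proxy-image route.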

‍

Conclusion

The tutorial has walked through the key components of building this application, focusing on the image search functionality. We've covered how to handle image uploads (both file and URL), process these on the server side, and implement the image cropping feature for refined searches.

We encourage you to explore the full codebase, experiment with the features, and consider how you might extend or adapt this application for your own use cases. Happy coding!

‍

FAQ

Q: When I try to upload an image using a public URL, I get the following message: "The URL you entered does not point to a valid image"

A: When you copy the image URL, copy the "image address", not the "link address".

Q: What types of image files can I upload for search?

A: Crop and Seek (Twelve Labs Image Search) supports JPG, JPEG, and PNG images. The maximum file size for uploads is 5MB, and the image should be at least 378 x 378 px.

Q: Can I combine image and text search?

A: Crop and Seek supports either image or text search at a time. However, you can use the image cropping feature to refine your visual search, which can be a powerful way to focus on specific elements within an image.

Q: Is my uploaded image stored on the server?

A: No, your uploaded images are not permanently stored. They are only used temporarily to process your search request and are not retained after the search is complete.
