Introduction
Meet Crop and Seek, the advanced video search tool that redefines how you discover visual content. Whether you're searching by text or image, Crop and Seek delivers instant results. But what sets it apart is its image cropping feature, allowing you to refine your search in real time.
For image-based queries, upload an image from your device or enter a public image URL. Instantly, the platform retrieves video clips matching the image. Your query image will appear at the top-left corner, where you can click to crop any part of it. Want to focus on a specific detail? Crop and Seek lets you search deeper, finding new results based on your selected area.
Prefer to search with text? Enter any query in the text search field, and Crop and Seek returns relevant video content.
Create an index and upload videos into the index. (Note: make sure you check "Logo" and "Text in Video" under "More options" to maximize the search capabilities!)
The app was built with React and Next.js.
The repository containing all the files for this app is available on GitHub.
Structure of The Application
The Crop and Seek app is built with a server and client architecture.
On the server side, Next.js handles routing, with all routes organized under the app/api folder. Of the six routes, all but the proxy-image route interact directly with the Twelve Labs API to power the platform's core functionality.
On the client side, the app is composed of multiple components, which I'll outline in more detail later. At a high level, there are three key components: SearchBar, Videos, and SearchResults. The SearchBar manages both image and text-based queries, integrating these features into a cohesive search experience.
Since image search is the core feature of Crop and Seek, this tutorial will dive deep into how the image search functionality works, guiding you through its key elements and practical applications.
Server
Twelve Labs API Requests
There are 5 routes that make requests to the Twelve Labs API.
In this tutorial, I will go over how to build the imgSearch route. Because a user can submit an image in two ways (by uploading a file or by entering a public URL), the server needs to handle both.
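The branching between the two image types can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the app's actual route: the helper name `buildTwelveLabsForm` is hypothetical, and the outgoing field names (`index_id`, `query_media_type`, `query_media_file`, `query_media_url`) are assumptions about the Twelve Labs search endpoint that should be checked against the current API reference. The incoming keys `file` and `query` are the ones the client sends in fetchImgSearchResults. Any other fields the endpoint requires are omitted.

```javascript
// Hypothetical helper: converts the client's form data into the form data
// forwarded to the search API. Outgoing field names are assumptions.
function buildTwelveLabsForm(incoming, indexId) {
  const file = incoming.get("file"); // present when the user uploaded a file
  const url = incoming.get("query"); // present when the user entered a public URL

  const outgoing = new FormData();
  outgoing.append("index_id", indexId);
  outgoing.append("query_media_type", "image");

  if (file && typeof file !== "string") {
    // File upload: forward the binary unchanged
    outgoing.append("query_media_file", file, file.name ?? "query-image");
  } else if (typeof url === "string" && url.length > 0) {
    // Public URL: forward the string and let the API fetch the image
    outgoing.append("query_media_url", url);
  } else {
    throw new Error("Request must include either a file or an image URL");
  }
  return outgoing;
}
```

In a Next.js route handler, `incoming` would typically come from `await request.formData()`, and the returned form would be sent as the body of the request to the search endpoint.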
You might wonder why we need a proxy-image server. It plays three roles in our application:
CORS Handling: It allows cross-origin requests, which is essential for fetching and displaying images from various sources on the client side. Without this proxy server, the image cannot be fetched and shown in the image cropping area due to CORS (Cross-Origin Resource Sharing) errors.
Image Hosting: The proxy server temporarily hosts the image, creating a trusted source from which our application can safely load the image content.
Security: By routing image requests through our own server, we add a layer of security, preventing direct exposure of external image sources to the client.
Here's how the proxy-image server works:
When a user provides an image URL, the client sends this to the proxy-image server.
The server fetches the image from the original source.
It then serves this image from its own domain, effectively bypassing CORS restrictions.
The client can now safely load and display the image for cropping and further processing.
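The four steps above could look roughly like this. It is a sketch built on the standard `Request` and `Response` classes rather than the app's actual proxy-image route. The function name `proxyImage`, the `url` query parameter, the error messages, and the status codes are all assumptions made for illustration. The `fetchImpl` parameter exists only so the function can be exercised without network access.

```javascript
// Small helper for JSON error responses
function jsonError(message, status) {
  return new Response(JSON.stringify({ error: message }), {
    status,
    headers: { "Content-Type": "application/json" },
  });
}

// Sketch of a proxy handler. In a Next.js route file, a function like this
// would be exported as the GET handler.
async function proxyImage(request, fetchImpl = fetch) {
  // 1. Read the image URL supplied by the client (assumed parameter name: "url")
  const target = new URL(request.url).searchParams.get("url");
  if (!target) return jsonError("Missing url parameter", 400);

  // 2. Fetch the image from the original source, server to server
  const upstream = await fetchImpl(target);
  const contentType = upstream.headers.get("content-type") ?? "";
  if (!upstream.ok || !contentType.startsWith("image/")) {
    return jsonError("URL did not return an image", 502);
  }

  // 3. Serve the bytes from our own origin, so the browser loads the image
  //    from the app's domain instead of the external one
  return new Response(await upstream.arrayBuffer(), {
    status: 200,
    headers: { "Content-Type": contentType },
  });
}
```

On the client, the image element in the crop dialog would then point at the proxy route with the original URL passed as an encoded query parameter. A production version would likely also restrict which hosts the server is willing to fetch from.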
Frontend
At a high level, the frontend of Crop and Seek is straightforward. It features a persistent SearchBar at the top. If a user does not submit a text or image query, videos from a predefined index are displayed. When a user submits a text or image query, the search results are shown.
Refer to the component design chart below for a comprehensive overview. This tutorial will focus on the image search functionality, detailing the initial image submission process and how cropping and subsequent searches are handled.
How the initial image search works
The initial image search is managed by the `SearchByImageButtonAndModal` component. When a user clicks the "Search by image" button, a dialog appears containing an image drop zone and an image link input form.
Although the code may seem complex due to styling elements, the core functionality is straightforward: once an image is submitted, either as a file or via a URL, it is processed by the `handleImgSubmit` function. handleImgSubmit sets the imgQuery state, which triggers the imgSearch useQuery hook. This hook then invokes fetchImgSearchResults, making a search request to the server with the image data.
Receiving an image as a file
We use the useDropzone hook from the react-dropzone library to create a drag-and-drop area for uploading an image file from a user's device.
/** Configures the drag-and-drop area for image uploads, handling accepted and rejected files */
const { getRootProps, getInputProps, isDragAccept } = useDropzone({
// Specify the types of image files that are allowed
accept: acceptedImageTypes,
// Set the maximum file size for uploads (5MB)
maxSize: MAX_IMAGE_SIZE,
// Ensure only one file can be uploaded at a time
multiple: false,
onDragEnter: () => {
setErrorCode(undefined); // Clear any existing error messages
},
onDropAccepted: (files) => {
handleImgSubmit(files[0]); // Submit the file
closeModal(); // Close the modal
},
// Handle the event when an invalid file is dropped
onDropRejected: (fileRejections) => {
// Get the error code for why the file was rejected
const code = fileRejections[0]?.errors?.[0]?.code;
if (code) setErrorCode(code);
},
});
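The snippet above references `acceptedImageTypes` and `MAX_IMAGE_SIZE` without showing them. One plausible definition is sketched below, assuming the JPG/PNG formats and the 5MB limit mentioned in this post. The `errorMessageFor` helper and its message wording are hypothetical. The `file-*` and `too-many-files` codes are the rejection codes react-dropzone reports, and `invalid-url` is the code set by handleImageUrl.

```javascript
// Assumed values for the two constants used in the dropzone configuration
const MAX_IMAGE_SIZE = 5 * 1024 * 1024; // 5MB in bytes
const acceptedImageTypes = {
  // react-dropzone expects a map of MIME type to allowed file extensions
  "image/jpeg": [".jpg", ".jpeg"],
  "image/png": [".png"],
};

// Hypothetical helper: turns the stored error code into a user-facing message
function errorMessageFor(code) {
  switch (code) {
    case "file-invalid-type":
      return "Please upload a JPG or PNG image.";
    case "file-too-large":
      return `Image must be ${MAX_IMAGE_SIZE / (1024 * 1024)}MB or smaller.`;
    case "too-many-files":
      return "Please upload one image at a time.";
    case "invalid-url":
      return "The URL you entered does not point to a valid image";
    default:
      return ""; // no error, or a code this sketch does not cover
  }
}

console.log(errorMessageFor("file-too-large")); // "Image must be 5MB or smaller."
```

Keeping the messages in one function means the dialog only needs to store the error code in state, as the `setErrorCode` calls above do.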
Receiving an image as a URL
The user's input (image URL) is captured via an Input field. The imageUrlFromInput state updates whenever the input value changes.
/** Validates the input URL and submits the image if valid */
const handleImageUrl = () => {
try {
const trimmedUrl = imageUrlFromInput
.trim()
.replace(/(\.jpg|\.jpeg|\.png).*/i, "$1");
const isImage =
/\.(jpg|jpeg|png|gif|bmp|webp)$/i.test(trimmedUrl) ||
/f=image|f=auto/.test(trimmedUrl);
if (!isImage) {
setErrorCode("invalid-url");
return;
}
handleImgSubmit(trimmedUrl);
closeModal();
} catch (e) {
setErrorCode("invalid-url");
}
};
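To see what the two regular expressions in handleImageUrl do in practice, the same checks can be isolated from React state and run on a few sample inputs. The function name `normalizeImageUrl` and the example URLs are made up for illustration. The regular expressions are copied unchanged from the code above.

```javascript
// Same trimming and validation as handleImageUrl, without the React state
function normalizeImageUrl(raw) {
  // Drop whitespace, then cut everything after the first .jpg/.jpeg/.png
  const trimmed = raw.trim().replace(/(\.jpg|\.jpeg|\.png).*/i, "$1");
  const isImage =
    /\.(jpg|jpeg|png|gif|bmp|webp)$/i.test(trimmed) ||
    /f=image|f=auto/.test(trimmed);
  return isImage ? trimmed : null; // null corresponds to the "invalid-url" error
}

console.log(normalizeImageUrl("  https://example.com/cat.png?w=200 "));
// "https://example.com/cat.png"
console.log(normalizeImageUrl("https://cdn.example.com/img?id=7&f=auto"));
// "https://cdn.example.com/img?id=7&f=auto"
console.log(normalizeImageUrl("https://example.com/gallery/cat"));
// null
```

The third case is the situation described in the FAQ: a page link rather than an image address fails both checks. Note that the truncation applies at the first `.jpg`, `.jpeg`, or `.png` found anywhere in the string, so a URL with one of those extensions in its host or an earlier path segment would be cut short.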
Handling a submitted image
handleImgSubmit first resets the relevant states and extracts the image name from the source, then sets the new image query. Once imgQuery is set, the imgSearch useQuery hook is enabled and executes fetchImgSearchResults.
/** Sends a request to the server to fetch image search results */
const fetchImgSearchResults = async (imagePath) => {
// Create a new FormData object to send data to the server
const formData = new FormData();
if (imagePath instanceof File) {
// If the image is a File, append it to FormData with the key "file"
formData.append("file", imagePath);
} else {
// If it's not a File, append it with the key "query"
formData.append("query", imagePath);
}
// Send a POST request to the server's image search API
const response = await fetch("/api/imgSearch", {
method: "POST",
body: formData,
});
if (!response.ok) {
const errorData = await response.json();
throw new Error(errorData.error || "Network response was not ok");
}
return response.json();
};
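The post does not show how the image name is extracted or how the imgSearch hook is gated on imgQuery, so here is one way those two pieces might look. Both helper names are hypothetical, and the options object assumes useQuery comes from TanStack Query, where the `enabled` flag keeps a query idle until its input is ready.

```javascript
// Hypothetical helper: derive a display name from either kind of image source
function getImageName(src) {
  if (typeof src === "string") {
    // URL: take the last path segment and ignore any query string
    const { pathname } = new URL(src);
    return pathname.split("/").filter(Boolean).pop() ?? "image";
  }
  return src.name; // File objects carry their own name
}

// Hypothetical helper: builds the options passed to useQuery
function imgSearchQueryOptions(imgQuery, imgName, fetcher) {
  return {
    // Key on the name rather than the File itself, since query keys
    // are serialized and a File has no meaningful serialization
    queryKey: ["imgSearch", imgName],
    queryFn: () => fetcher(imgQuery),
    // Stay idle until an image has been submitted
    enabled: Boolean(imgQuery),
  };
}

// Inside a component this might be used as:
// const { data } = useQuery(
//   imgSearchQueryOptions(imgQuery, imgName, fetchImgSearchResults)
// );
```

With this shape, calling `setImgQuery` with a new File or URL changes both the key and the `enabled` flag, which is what causes the search to run again after a crop.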
How the image cropping and cropped image search works
The image cropping is managed by the `ImageCropArea` component. When a user clicks the selected image preview in the search bar, a dialog opens displaying the image within the React Crop interface, an image cropping tool for React.
In ImageCropArea, ReactCrop handles user interaction for image cropping. When a user finishes cropping the image and clicks the "Search" button, onCropSearchClick is triggered.
getCroppedImage creates a cropped image using a canvas and returns it as a base64 string, which base64ToFile then converts to a File object. The image query is then set to this File, triggering the imgSearch useQuery hook. The hook invokes fetchImgSearchResults, making a search request to the server with the image data.
/** Handles the cropping of an image, converts the crop to a File, and updates state with the cropped image */
const onCropSearchClick = async () => {
if (completedCrop && imgRef.current) {
try {
// Step 1: Get the cropped image as a base64 string
const base64Image = await getCroppedImage(
imgRef.current,
completedCrop
);
// Step 2: Convert the base64 string to a File object
const croppedImageFile = await base64ToFile(
base64Image,
`${imgName}-cropped`
);
// Step 3: Update state and close modal
if (croppedImageFile) {
// Update the image query with the new cropped image File
setImgQuery(croppedImageFile);
// Update the image name to reflect the cropped version
setImgName(croppedImageFile.name);
// Close the crop modal
closeDisplayModal();
}
} catch (error) {
console.error("Error processing image:", error);
}
} else {
console.warn("No completed crop or imgRef.current is null");
}
};
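The bodies of getCroppedImage and base64ToFile are not shown in this post. The sketch below covers the two pieces of logic they most likely need, under the assumptions that the completed crop is expressed in pixels of the displayed image and that the base64 string is a data URL. The function names `scaleCropToNatural` and `dataUrlToBlob` are hypothetical.

```javascript
// Convert a crop measured on the displayed image into pixels of the
// full-resolution image, which is what a canvas draw call needs
function scaleCropToNatural(crop, img) {
  const scaleX = img.naturalWidth / img.width;
  const scaleY = img.naturalHeight / img.height;
  return {
    x: Math.round(crop.x * scaleX),
    y: Math.round(crop.y * scaleY),
    width: Math.round(crop.width * scaleX),
    height: Math.round(crop.height * scaleY),
  };
}

// Decode a base64 data URL into a Blob carrying the same MIME type
function dataUrlToBlob(dataUrl) {
  const [header, base64] = dataUrl.split(",");
  const mime =
    header.match(/^data:(.*?);base64$/)?.[1] ?? "application/octet-stream";
  const binary = atob(base64); // one character per byte
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new Blob([bytes], { type: mime });
}

// In the browser, the pieces would fit together roughly like this:
//   const r = scaleCropToNatural(completedCrop, imgRef.current);
//   ctx.drawImage(imgRef.current, r.x, r.y, r.width, r.height, 0, 0, r.width, r.height);
//   const blob = dataUrlToBlob(canvas.toDataURL("image/png"));
//   const file = new File([blob], `${imgName}-cropped`, { type: blob.type });
```

Scaling matters because the image in the dialog is usually displayed smaller than its natural size. Cropping with the on-screen coordinates alone would select the wrong region of the full-resolution image.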
Conclusion
The tutorial has walked through the key components of building this application, focusing on the image search functionality. We've covered how to handle image uploads (both file and URL), process these on the server side, and implement the image cropping feature for refined searches.
We encourage you to explore the full codebase, experiment with the features, and consider how you might extend or adapt this application for your own use cases. Happy coding!
FAQ
Q: When I try to upload an image using a public URL, I get the following message: "The URL you entered does not point to a valid image"
A: When you copy the image URL, make sure you copy the "image address", not the "link address".
Q: What types of image files can I upload for search?
A: Crop and Seek (Twelve Labs Image Search) supports image formats including JPG, JPEG, and PNG. The maximum file size for uploads is 5MB. The image dimensions should be at least 378 x 378 px.
Q: Can I combine image and text search?
A: Crop and Seek supports either image or text search at a time. However, you can use the image cropping feature to refine your visual search, which can be a powerful way to focus on specific elements within an image.
Q: Is my uploaded image stored on the server?
A: No, your uploaded images are not permanently stored. They are only used temporarily to process your search request and are not retained after the search is complete.