Meet Crop and Seek, the advanced video search tool that redefines how you discover visual content. Whether you're searching by text or image, Crop and Seek delivers instant results. But what sets it apart is its image cropping feature, allowing you to refine your search in real time.
For image-based queries, upload an image from your device or enter a public image URL. Instantly, the platform retrieves video clips matching the image. Your query image will appear at the top-left corner, where you can click to crop any part of it. Want to focus on a specific detail? Crop and Seek lets you search deeper, finding new results based on your selected area.
Prefer to search with text? Enter any query in the text search field, and Crop and Seek returns relevant video content.
⭐️ Check out the Demo!
The Crop and Seek app is built with a server and client architecture.
On the server side, Next.js handles routing seamlessly, with all routes neatly organized under the app/api folder. Out of the six routes, all but the proxy-image route directly interact with the Twelve Labs API to power the platform’s core functionality.
On the client side, the app is composed of multiple components, which I’ll outline in more detail later. At a high level, there are three key components: Search Bar, Videos, and SearchResults. The Search Bar manages both image and text-based queries, integrating these features into a cohesive search experience.
Since image search is the core feature of Crop and Seek, this tutorial will dive deep into how the image search functionality works, guiding you through its key elements and practical applications.
There are 5 routes that make requests to the Twelve Labs API.
In this tutorial, I will go over how to build the imgSearch route. Because the user can provide an image in two ways (by file upload or by public URL), the server needs to handle both.
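One way to handle both cases is to branch on what the request actually carries. The sketch below is not the app's route code: it assumes the client sends form data with a hypothetical `file` field for uploads and a hypothetical `imgUrl` field for public URLs, and it deliberately leaves out the call to the Twelve Labs API.

```javascript
// Hypothetical helper for an image-search route: decide whether the incoming
// form data carries an uploaded file or a public image URL.
// The field names "file" and "imgUrl" are assumptions, not the app's real ones.
// `formData` can be anything with a .get(name) method, such as a FormData.
function resolveImageSource(formData) {
  const file = formData.get("file");
  const imgUrl = formData.get("imgUrl");

  // Uploaded files arrive as Blob/File-like objects rather than strings.
  if (file && typeof file === "object" && typeof file.arrayBuffer === "function") {
    return { type: "file", file };
  }

  if (typeof imgUrl === "string" && imgUrl.trim() !== "") {
    let parsed;
    try {
      // new URL() throws on malformed input, so bad URLs are rejected early.
      parsed = new URL(imgUrl.trim());
    } catch {
      return { type: "invalid", reason: "malformed URL" };
    }
    if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
      return { type: "invalid", reason: "unsupported protocol" };
    }
    return { type: "url", url: parsed.href };
  }

  return { type: "invalid", reason: "no image provided" };
}
```

In a Next.js route handler, the argument could come from `await request.formData()`. The route would then forward the file or the URL to the search request depending on the returned `type`.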
You might wonder why we need a proxy-image server. The proxy-image server plays a crucial role in our application.
Here's how the proxy-image server works:
At a high level, the frontend of Crop and Seek is straightforward. It features a persistent SearchBar at the top. If a user does not submit a text or image query, videos from a predefined index are displayed. When a user submits a text or image query, the search results are shown.
Refer to the component design chart below for a comprehensive overview. This tutorial will focus on the image search functionality, detailing the initial image submission process and how cropping and subsequent searches are handled.
The initial image search is managed by the `SearchByImageButtonAndModal` component. When a user clicks the "Search by image" button, a dialog appears containing an image drop zone and an image link input form.
Although the code may seem complex due to styling elements, the core functionality is straightforward: once an image is submitted either as a file or via a URL, it is processed by the `handleImgSubmit` function. The handleImgSubmit function sets the imgQuery state, which triggers the imgSearch useQuery hook. This hook then invokes fetchImageSearchResults, making a search request to the server with the image data.
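To make the "state triggers the hook" step concrete, here is a minimal sketch of the options object such a `useQuery` call could receive. It is an illustration under assumptions, not the code from the repository: `buildImgSearchQueryOptions` and `imgQueryKey` are hypothetical names, and the fetcher is passed in so the sketch stays independent of the real request function. `queryKey`, `queryFn`, and `enabled` are standard TanStack Query options.

```javascript
// Hypothetical helper: build a stable cache key for an image query.
function imgQueryKey(imgQuery) {
  if (typeof imgQuery === "string") return ["imgSearch", imgQuery];
  // A File serializes to "{}" under JSON.stringify, so key on fields that
  // distinguish one file from another instead of on the object itself.
  return ["imgSearch", imgQuery.name, imgQuery.size, imgQuery.lastModified];
}

// Hypothetical helper: the options a useQuery call could be given.
function buildImgSearchQueryOptions(imgQuery, fetcher) {
  return {
    queryKey: imgQuery ? imgQueryKey(imgQuery) : ["imgSearch", null],
    queryFn: () => fetcher(imgQuery),
    // The query stays idle until an image query has been set.
    enabled: Boolean(imgQuery),
  };
}
```

Inside a component this would be used as `useQuery(buildImgSearchQueryOptions(imgQuery, fetchImageSearchResults))`, so setting `imgQuery` flips `enabled` to true and the request runs.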
We are using the useDropzone hook from the react-dropzone library to create a drag-and-drop area for uploading an image file from a user’s device.
SearchByImageButtonAndModal.js (lines 52-68)
The user's input (image URL) is captured via an Input field. The imageUrlFromInput state updates whenever the input value changes.
SearchByImageButtonAndModal.js (lines 187-207)
When the submit button is clicked, the handleImageUrl function trims the URL and calls the handleImgSubmit function.
SearchByImageButtonAndModal.js (lines 70-88)
handleImgSubmit first resets relevant states and extracts the image name from the source. Then it sets the new image query. When imgQuery is set, it enables the imgSearch useQuery hook to execute fetchImgSearchResults.
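As an illustration of the "extracts the image name from the source" step, here is one way it could be done for both kinds of input. `getImageName` is a hypothetical helper written for this explanation, not the function used in the component.

```javascript
// Hypothetical helper: derive a display name from an image source that is
// either a File-like object (has a string `name`) or a URL string.
function getImageName(src) {
  if (src && typeof src === "object" && typeof src.name === "string") {
    return src.name;
  }
  if (typeof src === "string") {
    try {
      const { pathname } = new URL(src.trim());
      // Take the last non-empty path segment, e.g. "/a/b/cat.png" -> "cat.png".
      const segments = pathname.split("/").filter(Boolean);
      return segments.length > 0
        ? decodeURIComponent(segments[segments.length - 1])
        : "";
    } catch {
      // Malformed URL or malformed percent-encoding: fall back to no name.
      return "";
    }
  }
  return "";
}
```

Reading the name from the URL's path rather than from the raw string keeps query parameters out of the displayed name.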
fetchImgSearchResults sends a request to the server's /api/imgSearch route, which we discussed in the Server section above.
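A request function along these lines could look like the sketch below. Treat it as an assumption-laden illustration rather than the code in SearchResults.js: the POST method, the form field names `file` and `imgUrl`, and the `buildImgSearchRequest` helper are all hypothetical. Only the `/api/imgSearch` path comes from the app.

```javascript
// Hypothetical helper: describe the request for an image query.
// A string is treated as a public image URL, anything else as a File/Blob.
function buildImgSearchRequest(imgQuery) {
  const formData = new FormData();
  if (typeof imgQuery === "string") {
    formData.append("imgUrl", imgQuery); // assumed field name
  } else {
    formData.append("file", imgQuery, imgQuery.name || "image"); // assumed field name
  }
  return { url: "/api/imgSearch", options: { method: "POST", body: formData } };
}

// Sketch of the fetcher; fetchImpl is injectable so it can be stubbed in tests.
async function fetchImgSearchResults(imgQuery, fetchImpl = fetch) {
  const { url, options } = buildImgSearchRequest(imgQuery);
  const response = await fetchImpl(url, options);
  if (!response.ok) {
    throw new Error(`Image search failed with status ${response.status}`);
  }
  return response.json();
}
```

Leaving the `Content-Type` header unset is intentional here: when the body is a `FormData`, the runtime sets the multipart boundary itself.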
SearchResults.js (lines 27-48)
The image cropping is managed by the `ImageCropArea` component. When a user clicks the selected image preview in the search bar, a dialog opens displaying the image within the React Crop interface - an image cropping tool for React.
In ImageCropArea, ReactCrop handles user interaction for image cropping. When a user finishes cropping the image and clicks the "Search" button, onCropSearchClick is triggered.
getCroppedImage creates the cropped image using a canvas, and the result is converted to a File object. The image query is then set to that file, which triggers the imgSearch useQuery hook. The hook invokes fetchImageSearchResults, which sends a search request to the server with the image data.
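The canvas step is easiest to follow when split in two: scaling the crop rectangle from displayed pixels to the image's natural pixels, then drawing that region. The sketch below is my own illustration, not the code in ImageCropArea.js. `scaleCropToNatural` and `getCroppedImageFile` are hypothetical names, and the crop object is assumed to hold pixel values `x`, `y`, `width`, and `height` measured on the displayed image.

```javascript
// Convert a crop rectangle measured on the displayed image (CSS pixels)
// into pixel coordinates of the full-resolution source image.
function scaleCropToNatural(crop, displayed, natural) {
  const scaleX = natural.width / displayed.width;
  const scaleY = natural.height / displayed.height;
  return {
    x: Math.round(crop.x * scaleX),
    y: Math.round(crop.y * scaleY),
    width: Math.round(crop.width * scaleX),
    height: Math.round(crop.height * scaleY),
  };
}

// Browser-only part: draw the scaled region onto a canvas and wrap it in a File.
// `image` is an HTMLImageElement; the default file name is an assumption.
function getCroppedImageFile(image, crop, fileName = "cropped.png") {
  const src = scaleCropToNatural(
    crop,
    { width: image.width, height: image.height },
    { width: image.naturalWidth, height: image.naturalHeight }
  );
  const canvas = document.createElement("canvas");
  canvas.width = src.width;
  canvas.height = src.height;
  // Copy the source region (src.x, src.y, src.width, src.height) to the canvas origin.
  canvas
    .getContext("2d")
    .drawImage(image, src.x, src.y, src.width, src.height, 0, 0, src.width, src.height);
  return new Promise((resolve, reject) => {
    canvas.toBlob((blob) => {
      if (!blob) return reject(new Error("Canvas is empty"));
      resolve(new File([blob], fileName, { type: "image/png" }));
    }, "image/png");
  });
}
```

For example, a 100 x 50 crop at (10, 20) on an image displayed at 400 x 300 but stored at 800 x 600 maps to a 200 x 100 region at (20, 40) in the source.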
ImageCropArea.js (lines 87-112)
The tutorial has walked through the key components of building this application, focusing on the image search functionality. We've covered how to handle image uploads (both file and URL), process these on the server side, and implement the image cropping feature for refined searches.
We encourage you to explore the full codebase, experiment with the features, and consider how you might extend or adapt this application for your own use cases. Happy coding!
Q: When I try to upload an image using a public URL, I get the following message: "The URL you entered does not point to a valid image"
A: When you copy the image URL, make sure you copy the "image address", not the "link address".
Q: What types of image files can I upload for search?
A: Crop and Seek (Twelve Labs Image Search) supports image formats including JPG, JPEG, and PNG. The maximum file size for uploads is 5MB. The dimension of the image should be at least 378 x 378 px.
Q: Can I combine image and text search?
A: Crop and Seek supports only one search mode at a time, either image or text. However, you can use the image cropping feature to refine your visual search, which can be a powerful way to focus on specific elements within an image.
Q: Is my uploaded image stored on the server?
A: No, your uploaded images are not permanently stored. They are only used temporarily to process your search request and are not retained after the search is complete.
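The format, size, and dimension limits listed in the FAQ above can also be checked on the client before a request is sent. The following is a minimal sketch, not part of the app: `validateImage` is a hypothetical helper, "5MB" is read as 5 * 1024 * 1024 bytes (an assumption about the exact byte limit), and the caller is assumed to already know the image's MIME type, byte size, and pixel dimensions.

```javascript
// Limits taken from the FAQ; the exact byte interpretation of "5MB" is assumed.
const MAX_BYTES = 5 * 1024 * 1024;
const MIN_SIDE = 378;
// JPG and JPEG files share the MIME type image/jpeg.
const ALLOWED_TYPES = ["image/jpeg", "image/png"];

// Hypothetical helper: return a list of problems, empty when the image is acceptable.
function validateImage({ type, size, width, height }) {
  const errors = [];
  if (!ALLOWED_TYPES.includes(type)) {
    errors.push("Unsupported format: use JPG, JPEG, or PNG.");
  }
  if (size > MAX_BYTES) {
    errors.push("File is larger than 5MB.");
  }
  if (width < MIN_SIDE || height < MIN_SIDE) {
    errors.push("Image must be at least 378 x 378 px.");
  }
  return errors;
}
```

Returning every problem at once, rather than stopping at the first, lets the UI show the user everything that needs fixing in a single message.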