This tutorial is co-authored with the wonderful team at MindsDB: Zoran Pandovski (Senior Full Stack Engineer), Martyna Slawinska (Software Engineer and Technical Writer), and Minura Punchihewa (Data Integrations Engineer)!
In recent years, there have been incredible advancements in machine learning models. However, integrating and serving these models into applications can still be challenging for developers. We're excited to share a tutorial showcasing how you can make building powerful ML apps far easier by combining Twelve Labs' state-of-the-art foundation models for video understanding with MindsDB's platform for building custom AI. Video is everywhere on the internet, but unlike text, it's harder to parse, summarize, and understand. In the tutorial below, you can build a tool that lets you do all these things.
Twelve Labs offers access to powerful multimodal foundation models that can understand natural language prompts and generate human-like text from video content. MindsDB enables access to machine learning models and data sources for building customized AI solutions. Combining these two technologies opens up a world of possibilities for developers.
Whether you want to generate text, classify your content, or search for specific moments in your videos, this integration makes Twelve Labs' capabilities accessible to any developer. We hope you'll find this tutorial informative and helpful in building powerful ML apps.
Throughout this tutorial, you'll learn how to configure the Twelve Labs integration in MindsDB, deploy a Twelve Labs model within MindsDB (using the Twelve Labs summarization task), and automate the whole flow through a Slack bot that periodically posts the video summaries as announcements.
By the time you complete this tutorial, you'll have created a Slack bot that leverages the strengths of MindsDB and Twelve Labs to deliver concise video summaries posted as announcements on Slack channels.
Let's get started and explore the limitless possibilities that this integration has to offer.
To run the MindsDB Docker container, execute the following command:
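For example (the image name and port mappings below follow MindsDB's standard Docker setup, where 47334 serves the HTTP API and web interface; the container name is our choice):

```bash
docker run --name mindsdb_container -p 47334:47334 -p 47335:47335 mindsdb/mindsdb
```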
While the MindsDB container is running, access its shell by executing:
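Assuming you named the container `mindsdb_container` as above (otherwise, look up its name with `docker ps`):

```bash
docker exec -it mindsdb_container sh
```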
Inside the Docker container, install the additional dependencies for TwelveLabs and Slack integrations using the following command:
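A sketch of the install step, assuming the handler extras are published as `twelve_labs` and `slack` (check the handler READMEs for the exact extra names):

```bash
pip install "mindsdb[twelve_labs]" "mindsdb[slack]"
```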
After installation, you can access MindsDB's web interface at http://127.0.0.1:47334 using your browser.
To retrieve the Twelve Labs API key, log in to the Dashboard page, locate your API key, and select the Copy icon to copy it to your clipboard.
Twelve Labs recommends using environment variables to pass configuration to your application. However, for this tutorial, you can provide your API key as a parameter to MindsDB, as seen below, while creating an ML engine.
Before diving into the summarization task, you need to establish a connection between MindsDB and Twelve Labs. This involves creating a Machine Learning (ML) Engine that utilizes the Twelve Labs API. To initiate this integration, copy your API key and execute the following SQL command to create the Twelve Labs engine:
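A minimal sketch of the statement, assuming the handler accepts the API key via a `twelve_labs_api_key` parameter (replace the placeholder with your own key):

```sql
CREATE ML_ENGINE twelve_labs_engine
FROM twelve_labs
USING
    twelve_labs_api_key = 'your-twelve-labs-api-key';
```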
With the ML engine ready, the next phase involves creating a model to handle the summarization task:
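A sketch of the `CREATE MODEL` statement; the parameter names below are illustrative (check the Twelve Labs handler README for the exact ones), and the video URL is a placeholder:

```sql
CREATE MODEL mindsdb.demo_day_model
PREDICT summary
USING
    engine = 'twelve_labs_engine',                -- the ML engine created above
    engine_id = 'pegasus1',                       -- Twelve Labs engine for summarization
    task = 'summarization',
    engine_options = ['visual', 'conversation'],  -- indexing options
    summarization_type = 'summary',
    video_urls = ['https://example.com/demo_day.mp4'];  -- placeholder URL
```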
The parameters that we have provided in the USING statement are:

- 'twelve_labs_engine': the ML engine that you created before.
- 'pegasus1': the Twelve Labs engine used for the summarization task.
- 'visual' and 'conversation': the options used to index the video.
- 'summary': the type of summarization to generate.

Note: depending on the video size, it may take some time to create the index and store the video data (video embeddings and metadata). Once the above command is executed, you can run the DESCRIBE statement to check the model status by executing:
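For example:

```sql
DESCRIBE demo_day_model;
```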
If everything works, you should see status as complete in the STATUS column. In case of an error, check the ERROR column, which contains detailed information about the error.
Before we generate a summary from the video, we need to know the video_id. To get it, run the DESCRIBE statement on the model's indexed_videos attribute:
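For example:

```sql
DESCRIBE demo_day_model.indexed_videos;
```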
The returned result set should contain the video_id. Copy the id and query the model to get the key points for the video.
With the video ID, you can now extract the summarization:
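For example (replace the placeholder with the video_id you copied):

```sql
SELECT *
FROM mindsdb.demo_day_model
WHERE video_id = 'your-video-id';
```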
Now, alongside result_id and video_id, you should be able to see the result_summary column which contains the actual summarization text.
How cool is this? With just a few SQL statements, we were able to get a clear summary of the video. Let's take this even further and create a Slack bot that will post the summaries in a Slack channel.
Before connecting MindsDB to your Slack workspace, make sure you generate a token by following the MindsDB Slack documentation.
Once you have the token, run the following command:
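A sketch of the connection statement, using the Slack handler's token parameter (the bot token value is a placeholder):

```sql
CREATE DATABASE demo_day_bot
WITH ENGINE = 'slack',
PARAMETERS = {
    "token": "xoxb-your-slack-bot-token"
};
```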
Then, verify the integration by posting a welcome message to your Slack channel:
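For example, assuming the bot has been invited to a channel named announcement:

```sql
INSERT INTO demo_day_bot.channels (channel, text)
VALUES ("announcement", "Hello from the MindsDB and Twelve Labs demo bot!");
```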
You should now see the above message in the announcement channel.
If that works, you can use the Twelve Labs model and post the summarizations on the channel:
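A sketch that selects the summary from the model and inserts it as a Slack message (video_id placeholder as before):

```sql
INSERT INTO demo_day_bot.channels (channel, text)
SELECT "announcement" AS channel,
       result_summary AS text
FROM mindsdb.demo_day_model
WHERE video_id = 'your-video-id';
```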
This query will call the demo_day_model, get the summarization text, and use the demo_day_bot to post it on the announcement channel.
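To make the posting periodic, as mentioned earlier, you can wrap that statement in a MindsDB job; a sketch, assuming a daily schedule and a hypothetical job name:

```sql
CREATE JOB post_video_summary (
    INSERT INTO demo_day_bot.channels (channel, text)
    SELECT "announcement" AS channel,
           result_summary AS text
    FROM mindsdb.demo_day_model
    WHERE video_id = 'your-video-id'
)
EVERY 1 day;
```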
Congratulations! By completing this tutorial, you have successfully created a Slack bot that harnesses the power of MindsDB and Twelve Labs to deliver concise video summaries as announcements on Slack channels.
You learned how to:
The complete GitHub repository can be found here: https://github.com/mindsdb/mindsdb/tree/staging/mindsdb/integrations/handlers/twelve_labs_handler.
This integration showcases the ease of unlocking cutting-edge foundation models from Twelve Labs within your applications by leveraging MindsDB's AI-SQL capabilities. With just a few lines of SQL, you can access state-of-the-art video understanding and seamlessly incorporate it into a useful Slack bot.
To further explore the potential of MindsDB and Twelve Labs, check out these additional resources:
We encourage you to experiment with other Twelve Labs models and MindsDB integrations to build even more powerful video AI applications. The possibilities are endless!
If you have any questions or feedback, feel free to reach out to the MindsDB and Twelve Labs communities. Happy building!