Twelve Labs introduces a robust evaluation framework for video understanding, emphasizing both appearance and motion analysis.
Our video-language foundation model, Pegasus-1, gets an upgrade!
This blog post introduces Marengo-2.6, a new state-of-the-art multimodal embedding model capable of performing any-to-any search tasks.
This article introduces the suite of video-to-text APIs powered by our latest video-language foundation model, Pegasus-1.
This post will give a brief definition of embeddings, walk through various unimodal embeddings, explore multimodal video embeddings, and take a brief look at embeddings in production.
Applications, Principles, and Core Research Challenges in Multimodal AI
A review of how far video understanding research has come, what potential remains untapped, and where it is headed next
Capabilities and Applications of Foundation Models in Layman's Terms
A primer on foundation models: what they are, how they've evolved, and where they're going.