How a tiny startup beat the tech giants of the world and ranked #1 in video search
Author
Aiden Lee
Date Published
Mar 16, 2022
Tags
Team
Startup

Final ICCV VALUE Challenge 2021: Video Retrieval (Search) Track Result (2021.10)

One question I often got from our customers and investors is

“How does your technology compare to Google’s or Microsoft’s?”

I’m sure what they REALLY wanted to ask is…

“Is your technology better than Google’s or Microsoft’s?”

It’s a difficult question to answer. It’s even more difficult for a deep tech AI startup, especially if the founders neither have a strong publication record nor come from academia. The answer usually takes one of two routes:

  1. The bulldozer strategy: “Yes, we are better! Here is their technology’s benchmark performance and here is ours.”
    → Reaction: doubtful, questioning, and sometimes even resentful
  2. The sidestep strategy: “We provide better usability and can build out features for specific customer segments, and our customers love it!” (Talking about products and customers instead of technology)
    → Reaction: maybe persuaded, but still dissatisfied.

We always went with #2 despite having better benchmark performance than other companies. It gave us a natural segue to talk more about our customers, and more importantly, we definitely did not want to argue with the folks we were trying to turn into believers in our product and vision!

As a technical founder who’s leading the AI research and product development, I was often discouraged. Each time I heard the same question repeated, again and again, I felt powerless to the point of having nervous breakdowns. I constantly repeated the word “sorry” to my team who worked day and night to build the incredible technology I knew we had.

Feeling powerless. At least the chair was comfy.

That’s when I knew we had to participate in the ICCV VALUE (Video-And-Language-Understanding-Evaluation) Challenge hosted by Microsoft. The challenge had already started two weeks earlier, but who cares? This was the perfect opportunity to prove ourselves.

There were three reasons:

  1. The challenge task, video retrieval, was spot on for us: it evaluates the performance of video search AI models.
  2. The evaluation would be objective and complete, with four different and diverse domains of benchmark video datasets.
  3. It was hosted and joined by the most prestigious AI institutions and tech giants such as Microsoft, Tencent, and Baidu, giving us the chance to directly compete against them.

If we could win the competition, there would be much to gain: credibility, branding, PR, hiring, confidence, …and most importantly, we would have a powerful, bulldozer answer to give to our customers and investors when asked something along the lines of, “Are you better than Google?”

Despite the shiny opportunities we imagined a win would bring, the odds were obviously against us:

  1. We had limited cloud GPU resources that we could utilize for training multiple models at the same time. At the time, we only had $50K to spare for the competition. We had been given $100K worth of free AWS credit upon joining Techstars, and had already used up $50K. For a competition of this size, $50K in compute is the same as not having compute at all.
  2. We had limited human “labor”. Our entire company consisted of fewer than 10 people. 10 people minus the non-engineers minus the engineers who had to focus on product and PoC tasks with beta customers...? That only left me with 2 engineers, and that’s including myself.
  3. We had limited datasets that we could use to train our model. Unlike the tech giants who own infinite amounts of videos to pre-train their models, our only option was to utilize public video datasets that are available to everyone.

And so, I believed that we had less than a 10% chance of winning, but still decided to participate. Just like any startup at some point, we needed to take a leap of faith and arm ourselves with a winning mentality. As the startup saying goes, the odds will always be 0% if we don’t do anything.

Next Post: Part 2 — Nuts & Bolts of ICCV VALUE Challenge
