
Google’s new Gemini model: Analyzing hour-long videos, but few can use it



Published by: Daniil Bazylenko

16 February 2024, 08:50PM

In Brief

Researchers, including a Google data scientist, found a way to fix a memory bottleneck in GenAI models, allowing them to process millions of words.

Gemini 1.5 Pro is Google's latest GenAI model, surpassing its predecessor, Gemini 1.0 Pro, in data processing capacity.

While Gemini 1.5 Pro can handle about 700,000 words, the version available to most users can only process around 100,000 words at once and is in an "experimental" phase.

Larger context windows in AI models enable better understanding of conversations and more nuanced responses.

Google is the first to offer a commercially available model with a context window of up to 1 million tokens, with potential applications ranging from code analysis to video content comparison.

Last October, a Google data scientist, Databricks CTO Matei Zaharia, and UC Berkeley professor Pieter Abbeel published a research paper suggesting a way to let GenAI models, like OpenAI's GPT-4 and ChatGPT, handle much more data than before. They found that by fixing a memory bottleneck, models could process millions of words, a big jump from the previous limit of hundreds of thousands.

AI research moves quickly.

Today, Google announced Gemini 1.5 Pro, the latest addition to its Gemini family of GenAI models. It's a step up from Gemini 1.0 Pro and can process a lot more data.

Gemini 1.5 Pro can handle about 700,000 words or 30,000 lines of code — 35 times more than Gemini 1.0 Pro. It’s not just for text either; it can also handle up to 11 hours of audio or an hour of video in different languages.
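The word counts quoted here map roughly onto token counts. As a back-of-the-envelope sketch, assuming the common rule of thumb of about 0.7 English words per token (an assumption, not a figure from Google's announcement):

```python
# Rough conversion between a word count and a token count.
# Assumption: ~0.7 words per token, a common heuristic for English
# text under subword tokenizers; real ratios vary by tokenizer.
WORDS_PER_TOKEN = 0.7

def words_to_tokens(words: int) -> int:
    """Estimate how many tokens a given word count occupies."""
    return round(words / WORDS_PER_TOKEN)

# Gemini 1.5 Pro's stated capacity of ~700,000 words comes out
# to roughly the 1 million tokens mentioned later in the article.
print(words_to_tokens(700_000))  # → 1000000
```

Under that heuristic, the ~700,000-word figure and the 1 million-token maximum context window describe the same capacity.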

But there's a catch.

The version of Gemini 1.5 Pro available to most users can only process around 100,000 words at once. It’s called “experimental” and is limited to approved developers and customers for now.

Google’s Oriol Vinyals sees it as a big achievement.

“When you interact with these models, the longer and more complex your questions are, the more context the model needs to deal with,” Vinyals said. “We’ve unlocked long context in a massive way.”

Context matters a lot.

Models with small context windows forget recent information quickly, leading to off-topic responses. Larger context windows can understand conversations better and give richer responses.
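The forgetting behavior is easy to see in a toy chat loop: once the conversation exceeds the window, the oldest turns are dropped. A minimal sketch (whitespace word counting stands in for a real tokenizer, and the turn format is invented for illustration):

```python
# Sketch of why context-window size matters: a chat client typically
# drops the oldest turns once the history exceeds the window.
# count_tokens is a crude stand-in for a real tokenizer.

def count_tokens(text: str) -> int:
    return len(text.split())

def trim_history(turns: list[str], max_tokens: int) -> list[str]:
    """Keep only the most recent turns that fit within max_tokens."""
    kept, total = [], 0
    for turn in reversed(turns):
        cost = count_tokens(turn)
        if total + cost > max_tokens:
            break  # everything older than this turn is forgotten
        kept.append(turn)
        total += cost
    return list(reversed(kept))

history = [
    "user: my name is Ada",
    "bot: hi Ada",
    "user: what's my name?",
]
# With a tiny window, the turn that introduced the name falls out of
# context, so the model can no longer answer the follow-up question.
print(trim_history(history, max_tokens=8))
```

A larger window simply pushes that cutoff further back, which is why long-context models can stay on topic over long conversations.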

Others have tried large context windows before.

Magic claimed to have a model with a 5 million-token context window. And a group from Meta, MIT, and Carnegie Mellon found a way to remove the limit on context window size.

But Google is the first to offer a commercially available model with such a big context window. Gemini 1.5 Pro's max context window is 1 million tokens, and the more common version has 128,000 tokens, like OpenAI’s GPT-4 Turbo.

So, what can you do with a 1 million-token context window? Google says a lot — like analyzing code, understanding long documents, having long conversations with chatbots, and comparing content in videos.

During a demo, Gemini 1.5 Pro successfully handled tasks, but not very quickly. It took between 20 seconds and a minute for each task, much longer than a ChatGPT query.

Google says it is working on speeding the model up and is testing a version with a 10 million-token context window.

The model also brings other improvements. It uses a new architecture to match the quality of Gemini Ultra, Google’s top GenAI model.

Pricing details aren’t clear yet. During the preview, Gemini 1.5 Pro with the 1 million-token context window will be free, but later on, there will be different pricing tiers.

There are questions about how this affects other models in the Gemini family, like Gemini Ultra. Will they get similar upgrades? It’s a bit confusing right now, but Google is likely working through it.
