Newzvia

Artificial Intelligence | Google DeepMind Upgrades Gemini 1.5 Pro with Massive Context Window

Pankaj Mukherjee, Senior Technology Correspondent

Quick summary

Google DeepMind has significantly updated its Gemini 1.5 Pro model, introducing a 2-million-token context window and native audio understanding. These enhancements are set to broaden the scope of generative AI applications globally, including for developers and businesses in India who rely on advanced AI models.

Google DeepMind has expanded the context window of its Gemini 1.5 Pro model to 2 million tokens and added native audio understanding, a move intended to advance multimodal generative AI capabilities, according to the company's announcement.

What Happened / Key Details

Google DeepMind announced updates to its Gemini 1.5 Pro model that expand its 'context window' to an unprecedented 2 million tokens. According to the company, this allows the large language model (LLM) to process and analyze substantially more information in a single query, including large volumes of text, code, or data.

In addition to the expanded context window, Gemini 1.5 Pro now features native audio understanding capabilities. This means the model can directly process and analyze audio inputs alongside existing text and video formats. This integration facilitates more complex multimodal interactions, where the AI can interpret and respond to a blend of spoken language, written text, and visual information simultaneously.

Official Position / Company Statement

According to Google DeepMind, these advancements are designed to push the boundaries of multimodal generative AI, enabling the development of highly complex and sophisticated applications. The company expressed its intent for Gemini 1.5 Pro to handle more intricate, real-world scenarios by integrating diverse data types more seamlessly.

Context / Background

Generative artificial intelligence and LLMs form a highly competitive and rapidly evolving field. This update from Google DeepMind positions Gemini 1.5 Pro at the forefront of models capable of processing extensive data inputs, a critical factor for enterprise-level applications and complex research tasks. Such advancements have significant implications for the global AI ecosystem, including Indian developers and businesses exploring the potential of generative AI across various sectors.

This development follows other significant announcements in the AI space. Recently, Microsoft and OpenAI announced a deepened partnership aimed at developing AI for scientific research, including drug discovery and material science, making advanced AI tools available to researchers. Concurrently, Anthropic officially launched 'Claude 4,' its next-generation LLM, which also boasts improved complex reasoning, coding abilities, and a more sophisticated understanding of multimodal inputs, including images and video.

Key Takeaways

  • Google DeepMind enhanced its Gemini 1.5 Pro model with a 2-million-token context window.
  • The model now offers native audio understanding, allowing direct processing of audio inputs alongside text and video.
  • These updates aim to expand the potential for multimodal generative AI in complex applications.
  • The advancements add to the ongoing global competition among major AI developers, which also shapes AI adoption in India.

People Also Ask

What is a context window in an LLM?
A context window in a large language model (LLM) refers to the maximum amount of input data (like text or code) the model can consider at once to generate a response. A larger context window allows the AI to understand and process longer conversations, documents, or entire videos, maintaining coherence over extended interactions.

What does 'multimodal generative AI' mean?
Multimodal generative AI describes artificial intelligence systems that can understand, process, and generate content across multiple data types simultaneously. This includes combinations of text, images, video, and now audio, enabling more versatile and human-like interactions and content creation.

How will native audio understanding benefit AI users?
Native audio understanding allows AI models to directly interpret spoken language, environmental sounds, or music without prior transcription. This capability can enhance voice assistants, enable real-time analysis of podcasts or meetings, and improve accessibility features by allowing AI to directly respond to audio cues.

What is the significance of 2 million tokens for Gemini 1.5 Pro?
A 2-million-token context window is a significant leap, enabling Gemini 1.5 Pro to handle extremely large datasets, such as entire books, lengthy research papers, or full-length movies, within a single prompt. This vastly improves the model's ability to summarize, analyze, and generate insights from complex, extensive information.
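
For developers who want a feel for what a 2-million-token budget means in practice, the following is a minimal Python sketch that estimates whether a piece of text would fit in such a window. It assumes a rough average of four characters per token for English text; that ratio is an illustrative assumption, not the behaviour of Gemini's actual tokenizer, and the function names are hypothetical rather than part of any Google API.

```python
import math

# Context window size reported in the article.
CONTEXT_WINDOW_TOKENS = 2_000_000

# Assumption: roughly 4 characters per token for English text.
# Real tokenizers vary by language and content; this is only a heuristic.
CHARS_PER_TOKEN = 4.0


def estimate_tokens(text: str, chars_per_token: float = CHARS_PER_TOKEN) -> int:
    """Return a rough token estimate based on character count."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_window(text: str, window: int = CONTEXT_WINDOW_TOKENS) -> bool:
    """Check whether the estimated token count fits within the window."""
    return estimate_tokens(text) <= window


def split_into_windows(
    text: str,
    window: int = CONTEXT_WINDOW_TOKENS,
    chars_per_token: float = CHARS_PER_TOKEN,
) -> list[str]:
    """Split text into consecutive chunks that each fit the estimated budget."""
    max_chars = int(window * chars_per_token)
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


if __name__ == "__main__":
    sample = "word " * 1000  # 5,000 characters of placeholder text
    print(estimate_tokens(sample), fits_in_window(sample))
```

Under this heuristic, a 2-million-token window corresponds to about 8 million characters of English text. For real workloads, the token count should come from the model provider's own tokenizer or token-counting endpoint rather than a character-based estimate.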
