Newzvia

Technology | OpenAI's GPT-5: AI Now Understands Text, Images, And Sound Together

Pankaj Mukherjee, Senior Technology Correspondent · AI, startups & MeitY policy

Quick summary

OpenAI has significantly updated its GPT-5 AI model, allowing it to seamlessly understand and generate content across text, images, and audio. This advancement promises new tools for developers and businesses, including those in India, in the coming weeks.

The next jump in AI capability just arrived. OpenAI, the company behind ChatGPT, today announced a major update to its GPT-5 model. This isn't just a small tweak; it’s a significant shift in how AI works.

The core of this advancement is what’s called 'multimodal' capability. Until now, AI models often specialised: one for text, another for images. Think of it like a human who can only read or only see. With GPT-5, the AI can now understand and create using text, images, and audio all at once.

Imagine giving the AI a picture, a song, and some words, and having it process them together. The model can see a photo, hear a description of it, and then write a story about it, or even create new images and sounds based on the combined input. OpenAI says the update will roll out to developers and enterprise partners in the coming weeks. Developers use APIs (Application Programming Interfaces) to integrate such AI tools into their own apps and services.

More Than Just Words

This move makes AI much more powerful and versatile. It goes beyond just generating text replies or simple image edits. For Indian developers, this could unlock new ideas for applications that need to understand the world in a richer way – from smart assistants that truly hear and see, to creative tools that blend different media forms.

Other tech giants are pushing similar boundaries. Google DeepMind recently launched AlphaFold 3, a new AI model for scientific discovery that aims to speed up drug research. Adobe, too, has embedded advanced generative AI tools into its Creative Cloud suite, letting users create complex visuals from simple text prompts. OpenAI's GPT-5 update fits into this broader trend of AI systems becoming more capable and more deeply integrated into our digital tools.

What We Don't Know Yet

While the announcement sounds exciting, the real test will be how the model performs in practice. OpenAI has not yet shared specifics on pricing, detailed benchmarks, or a timeline for wider public access beyond developers and businesses. How well the model handles India's diverse languages and local contexts also remains to be seen. These are crucial details for Indian startups looking to build on such foundational models.

This update means developers will soon have a much more capable AI canvas. It will be interesting to see what new products and services emerge, especially from India's vibrant tech scene, as developers there get access to these advanced tools.


  • OpenAI’s GPT-5 model now uses text, images, and audio together.
  • This 'multimodal' AI can understand and create content across these different types.
  • The update will be available for developers and businesses soon.
  • Many details, like specific pricing or public access dates, are still unclear.

Quick questions

What does 'multimodal' mean for GPT-5?
The model processes and generates text, images, and audio together.
How will this help developers?
Developers can use the API to build richer AI apps that blend different media types, enabling new, more nuanced tools.
When can general users expect GPT-5?
Still unclear. OpenAI expects a developer and enterprise rollout in the coming weeks, but has given no public release date.
Does this impact India?
Global developers, including those in India, will access these tools, fostering new local AI applications. India-specific details remain unannounced.