Ivo Bozukov: The Rise of Agentic AI and Smarter Models

Generative AI once meant asking a question and getting an answer. ChatGPT could write essays and even generate complex code, but it couldn’t plan, execute multiple tasks, or learn from what happened along the way.

All that is changing. A new generation of AI systems is emerging with capabilities that look less like clever text generators and more like autonomous problem-solvers.

From Assistants to Agents

The biggest shift is something called agentic AI. Unlike earlier tools that waited for instructions, these systems work through complicated tasks by splitting them into steps and adjusting as they go. 

Microsoft’s Copilot handles email sorting and meeting notes for workers at nearly 70% of Fortune 500 companies. Newer versions also resolve IT issues, answer benefits questions, and create reports without someone checking every output.

The difference comes down to autonomy. Early AI tools were reactive. Agentic systems are proactive, anticipating what comes next and doing it.
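The split-into-steps-and-adjust pattern described above can be sketched as a simple loop. Everything here is a hypothetical stand-in: a real agent would delegate the plan_steps and execute functions to a model rather than hard-coding them.

```python
def plan_steps(task):
    # Hypothetical planner: a real agent would ask a model to decompose the task.
    return [f"{task}: step {i}" for i in range(1, 4)]

def execute(step):
    # Hypothetical executor: returns (success, observation).
    return True, f"done: {step}"

def run_agent(task):
    """Work through a task step by step, replanning when a step fails."""
    results = []
    steps = plan_steps(task)
    while steps:
        step = steps.pop(0)
        ok, observation = execute(step)
        if ok:
            results.append(observation)
        else:
            # Adjust the plan based on what happened, rather than giving up.
            steps = plan_steps(step) + steps
    return results

print(run_agent("sort inbox"))
```

The point of the loop is the feedback path: the agent observes each outcome and can revise its remaining plan, which is what separates it from a one-shot text generator.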

Models That Actually Reason

Behind these agents are models with fundamentally different capabilities. At their core, large language models predict what word should come next in a sequence. Now they’re additionally being trained to reason through problems step by step.

One technique behind this is chain-of-thought reasoning: instead of jumping straight to an answer, the model works through intermediate steps. Some systems go further, sampling several candidate solutions and keeping the one that holds up best. That’s more expensive, because each answer involves multiple inference passes, but the result is closer to reasoning than to pattern-matching.

These reasoning models need far more computing power than earlier generations.
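Sampling several answers and keeping the most consistent one (often called self-consistency sampling) can be sketched as below. The sample_answer function is a stand-in for a model call; the majority vote is one simple selection rule, and the linear cost in n_samples is exactly why this style of inference needs more compute.

```python
from collections import Counter

def sample_answer(question, seed):
    # Stand-in for one model inference; a real system would sample a
    # full chain of reasoning and return its final answer.
    answers = ["42", "42", "41"]
    return answers[seed % len(answers)]

def self_consistent_answer(question, n_samples=5):
    """Run n inference passes and keep the majority answer.

    Cost grows linearly with n_samples: five samples means five
    full forward passes instead of one.
    """
    votes = Counter(sample_answer(question, s) for s in range(n_samples))
    answer, _count = votes.most_common(1)[0]
    return answer

print(self_consistent_answer("What is 6 * 7?"))  # → 42
```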

Multimodal Capabilities Change Everything

Early models could handle only text. Multimodal systems process text, images, video, and audio together.

A customer service issue might involve a photo of a damaged product, a voice message explaining what happened, and an email with the order details. Multimodal models can look at all three simultaneously and respond with full context.

Models like GPT-5.1 and Gemini 2.5 Flash demonstrate what’s now possible. They interpret screenshots, understand voice notes, and generate responses that account for every piece of input. Ivaylo Bozoukov’s work in the energy transition involved exactly this kind of multimodal data challenge, where operational decisions require synthesizing sensor readings, visual inspections, maintenance logs, and real-time performance metrics.

As Ivo Bozukov puts it, “Multimodal AI is where things get truly useful for industry, because real operations aren’t made of text alone. When models can combine images, sensor data, logs, and human messages into one coherent picture, decisions get faster and less fragmented. That’s the difference between AI that demos well and AI that runs real systems.”
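One common way to hand mixed inputs like these to a multimodal model is a single structured message whose parts are typed by modality. The sketch below mirrors the content-list style several chat APIs use, but the field names and the build_ticket_message helper are illustrative assumptions, not any vendor’s exact schema.

```python
def build_ticket_message(photo_url, transcript, email_body):
    """Bundle an image, a voice-message transcript, and email text
    into one request payload so the model sees full context at once."""
    return {
        "role": "user",
        "content": [
            {"type": "image", "url": photo_url},
            {"type": "text", "text": f"Voice message transcript: {transcript}"},
            {"type": "text", "text": f"Order email: {email_body}"},
        ],
    }

msg = build_ticket_message(
    "https://example.com/damaged-lamp.jpg",
    "The box arrived crushed.",
    "Order #1234, one ceramic lamp.",
)
print(len(msg["content"]))  # three inputs, one message
```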

What Makes Frontier Models Different

Tech companies describe their most advanced systems as “frontier models” because they push boundaries in natural language processing, image generation, and coding. These models don’t just respond better. They remember more, reason longer, and handle complexity that would have broken earlier versions.

Memory is crucial here. Frontier models maintain context across longer interactions, building on previous exchanges rather than resetting. This enables “multi-step workflows” – tasks involving sustained planning across many actions, not just answering single questions. 
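In practice, the memory behaviour described above often amounts to carrying the full exchange history into each new model call instead of starting fresh. A minimal sketch, where the respond function is a placeholder for a real model:

```python
def respond(history):
    # Placeholder for a model call that sees the whole history.
    user_turns = len([m for m in history if m["role"] == "user"])
    return f"reply #{user_turns}"

class Session:
    """Accumulates context so each turn builds on previous exchanges."""

    def __init__(self):
        self.history = []

    def send(self, user_message):
        self.history.append({"role": "user", "content": user_message})
        reply = respond(self.history)
        self.history.append({"role": "assistant", "content": reply})
        return reply

s = Session()
s.send("Plan the quarterly report.")
print(s.send("Now draft section one."))  # → reply #2
```

The growing history is also why longer multi-step workflows demand longer context windows and more compute per call.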

The Infrastructure Reality

As Ivaylo Bozoukov had seen whilst working in the energy sector in Texas, all of this requires massive computational resources. Reasoning models use far more processing power than simple generative models because they’re running multiple inference passes. As these capabilities become standard, demand for advanced semiconductors and data center capacity continues climbing.

Organizations face a choice between off-the-shelf models for general tasks or custom models trained on specific data that perform better but cost significantly more. Most successful implementations combine both: frontier models for complex reasoning, smaller models for routine operations, and human oversight where stakes are high.
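The mixed deployment described above is often implemented as a simple router: a cheap model by default, a frontier model for hard cases, and human review where stakes are high. The thresholds and model labels below are illustrative assumptions, not a prescribed setup.

```python
def route_request(task, complexity, high_stakes):
    """Pick a handler for a task.

    complexity: rough score in [0, 1], e.g. from a lightweight classifier.
    high_stakes: whether a mistake would be costly enough to need a human.
    """
    if high_stakes:
        return "human-review"    # oversight where stakes are high
    if complexity > 0.7:
        return "frontier-model"  # expensive, strong reasoning
    return "small-model"         # cheap, fine for routine operations

print(route_request("refund dispute", complexity=0.9, high_stakes=True))    # → human-review
print(route_request("summarize email", complexity=0.2, high_stakes=False))  # → small-model
```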
