Building Real-Time RAG Systems with Gemini & the Multimodal Live API
Hi 👋🏼 I'm Olayinka Peter, a Senior ML Engineer & Google Developer Expert for Machine Learning.
Most retrieval-augmented generation systems today feel a bit stiff. You ask a question. You wait. You get an answer. It works, but it doesn't "feel" like a conversation.
When I wrote Grokking GenAI: Multimodal Reasoning with Gemini last year, multimodality felt like a breakthrough. An AI that could read text, look at images, listen to audio, and even understand code already felt futuristic. But over the past year, something important has changed.
Imagine you're trying to plan a trip to Hawaii. You've got a few pictures of beautiful beaches, a list of things you want to see, and a rough budget in mind. How do you pull it all together? You might browse travel blogs, compare prices, and even watch videos of the islands. You're using different kinds of information (pictures, text, and video) to make sense of your trip.