AI developers have long been working to realize the digital personal assistant by providing a service that is smart, easy to use, and always available. Gemini Live, announced at Made by Google earlier this week, is Google’s latest attempt to achieve this goal. So we put the AI to the test for 24 hours to see how close it gets to being truly useful.
I’m not used to chatting directly with an AI assistant beyond asking it to set a timer while I cook, but I wanted to see what benefits there are to having an open-ended conversation with an AI like Gemini. After a day of testing, I’m at least more confident in the value of talking to an AI in this way, even if I don’t have much faith in some of the answers it currently gives me.
My experience with Gemini Live is far from a formal test of its features, but the breadth of questions I asked gave me a good impression of what works and what doesn’t. I’m confident in my assessment that Gemini Live will be a good addition to the Gemini package, and a good enough reason for some free users to become paying Gemini Advanced users for $20 per month, even if it hasn’t achieved all of its goals yet.
Thursday Afternoon — Setup
Gemini Live is offered as part of the Gemini Advanced subscription, but as of this writing it is still being rolled out and not yet available to all users. Luckily, I had a Google Pixel 9 Pro XL on hand, so I was able to try it out on that. If you want to know more about this smartphone, check out our hands-on review of the Google Pixel 9 Pro and Pro XL. Here we will focus only on Gemini Live.
Another issue is that Gemini currently requires you to set your language to US English. Luckily, even after doing this, I was still able to select a UK voice called “Capella” for Gemini Live from the 10 voices on offer. They all sound very natural, with different levels of enthusiasm and pitch. Once I started asking questions, I rarely heard a particularly bad mispronunciation or oddly phrased sentence.
Thursday Night — Home
After everything was set up, my first significant interaction with Gemini Live was to ask for directions to get home. Even after I told it my preferred mode of transportation and confirmed my destination, Gemini Live initially wouldn’t tell me what it had found. After a lengthy wait, I told it to actually tell me what it found, and it walked me through the route.
I probably would have got home that way, but it wouldn’t have been a smooth journey. Gemini appears to have misidentified one of the train lines and one of the stations, failed to realise that the change it suggested technically required me to walk between two stations, and invented a train route outright. This is all strange, as Gemini claimed to have checked the information on the Transport for London website.
This is more of a problem with the underlying AI model than with Gemini Live, but having an authoritative voice (with a British accent) giving directions could lead to people unfamiliar with London’s public transport system becoming lost – in this case, you’d be better off using Google Maps.
Friday Morning — Breaking News
The next day, while getting ready for work, I asked Gemini to tell me the latest news of the day. With just one prompt, it told me a lot about the host changes at Good Morning Britain and This Morning, plus a quick mention of the recent stabbing in Leicester Square. But when I asked about tech news, things got even weirder.
Gemini initially told me that Microsoft was announcing the Surface Duo 3, but that device has never been confirmed and has in fact been rumored for months to be canceled. It also mentioned the PS5 Slim, which is real but was released last fall, and I can only assume its final comment was a reference to last month’s CrowdStrike outage.
I then asked Gemini Live to focus on iPhone rumors, but all of its initial responses were about the currently available iPhone 15 lineup. When pressed further, it also described some of the iPhone 16 camera rumors, but didn’t go into much detail.
Friday Morning — Brewing Guide
After a few hours of work, it was time for a coffee break, so I asked Gemini Live to show me how to brew a V60 pour-over.
I was hoping to get step-by-step instructions from the AI, but the problem here was that I had to continually instruct and interrupt Gemini Live, effectively forcing it to give me a step-by-step answer. However, while the transcript shows that the AI initially misheard my instructions, it continued the conversation and provided a convincing answer.
In terms of knowledge, Gemini was a mixed bag. There were some geeky tips, like filtering your water before boiling it, and the overall recipe was simple but produced a drinkable cup of coffee. However, Gemini Live gave me the amount of coffee in tablespoons rather than in grams or ounces, and tablespoons aren’t the unit typically used when brewing coffee. An extra prompt got me the amount in grams.
Friday Lunchtime — Fighting Talk
I had some free time over lunch, so I chatted a bit with Gemini Live about the game I’m currently playing the most: Street Fighter 6. It correctly told me who this year’s Street Fighter 6 champion at Evo 2024 was and who they faced, but again it offered few details at first.
I steered the conversation towards training advice (I tend to be overly reliant on certain moves) and got some suggestions on how to rethink my approach in matches – easier said than done when your opponent is hurling fireballs at you, but still valid advice.
I also tried to get advice on where I could meet other players in person, but this didn’t work out too well either. Gemini tried checking the official website for more information, but it didn’t list anything other than the official Capcom tournaments. It then found a Facebook group near me, but it didn’t provide a link in the transcript for me to access later.
Friday Afternoon — Writing Advice
“Imagine having a real-life assistant you can call on that can help you instantly, without you having to type or send a photo. That’s the idea behind Gemini Live. Gemini Live lets you talk to an AI assistant just like you would talk to a friend. I’ll be testing it out over the next 24 hours to see if it can make the dream of a personal assistant a reality.”
Gemini Live’s final intro proposal
For my final Gemini task, I decided to go meta. And I’m not talking about Llama 3. I asked for help drafting the introduction to this article.
After my earlier experiences, where Gemini had offered few details in its responses, I was amazed that it now suggested specific wordings. When I asked for more information or a change of perspective, it responded sensibly. And, as Google proudly pointed out in its Made by Google demo, Gemini Live also handles interruptions, allowing you to adjust its response on the fly.
This is Gemini Live’s best feature: It feels totally natural to work through ideas out loud, even when you’re talking into a glowing waveform on your phone. In the end, I wrote the opening of this article from scratch, but if you scroll up and compare it to what Gemini Live presented to me, you’ll see echoes of its final proposal.
Google Gemini Live: Final thoughts
You might get the impression from this article that I don’t think highly of Gemini Live, but that’s not the case. My harshest criticism is aimed at the Gemini Advanced model it’s running on, which seemed to misunderstand what I was asking of it in some of my testing scenarios. Interestingly, a recent Gemini vs Gemini Advanced showdown suggested that I might have been better off sticking with the basic Gemini.
Meanwhile, Gemini Live itself really impressed me. Being able to have an ongoing conversation with a chatbot, as long as you’re willing to be specific and interrupt when it strays, seems like a much better way to interact than via text or image prompts. You can ask a regular digital assistant follow-up questions, too, but it’s not as seamless as Gemini Live. And that seamlessness is what makes it useful, answering questions and offering guidance not just hands-free, but eyes-free, so you can focus on other things while you’re talking to the chatbot.
The big question that remains is how this compares to the upcoming ChatGPT Voice, since Gemini Live interprets voice as text and then responds, whereas ChatGPT Voice can process voice directly. Even taking into account the general caveats around AI, though, it feels like Google is on the right path in pursuing its digital personal assistant dream.