I recently had one of the strangest conversations of my life. I wasn’t talking to a human, but to an artificial intelligence model that could monitor, predict, and adjust to my mood.
EVI is a new voice assistant powered by large language models from Hume, an AI voice startup focused on bringing empathy and emotional intelligence to the chatbot space.
The company announced the new flagship product alongside a $50 million funding round, with investments from Comcast Ventures, LG and others.
EVI stands for Empathic Voice Interface, a web-based voicebot that other companies will also be able to use in their own products. The call center of the future may be powered by AI that can respond to anger with empathy and understanding.
My experience with the voicebot so far has been a mix of amazement at the impressive display of technology and utter horror at the fact that it accurately guessed I hadn’t eaten breakfast.
What is Hume EVI?
The new Empathic Voice Interface (EVI) fits into the growing voicebot space. Instead of interacting with a multimodal AI model like ChatGPT through text, you speak to it and it replies with its own synthesized voice.
To make this more effective and natural, companies have been working on ways to add emotion and natural-sounding filler words. OpenAI has achieved this with ChatGPT Voice and even the voice used in the Figure 01 robot, which sometimes drops in an “um” or an “err”.
For Hume, the goal was to integrate realistic emotions in a way that responded to, reflected, or counteracted the emotional tone of humans in conversation.
Although EVI has a public demo interface, it also offers an API that makes it incredibly easy to integrate into other apps. Its sentiment and emotion analysis is better than anything I’ve tried so far, though I can’t vouch for exactly how accurate it is.
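To give a rough sense of what such an integration could look like, here is a minimal sketch in Python. This is not Hume’s official SDK: the WebSocket endpoint, the api_key query parameter, and the message fields (user_input, assistant_message, emotions, assistant_end) are assumptions made purely for illustration, so check the official EVI documentation for the real schema.

```python
# Hypothetical sketch of talking to an EVI-style WebSocket API from Python.
# Endpoint, auth style, and message fields are assumptions for illustration only.
import asyncio
import json

import websockets  # pip install websockets


async def chat(api_key: str, text: str) -> None:
    # Assumed endpoint and query-parameter auth; the real EVI API may differ.
    url = f"wss://api.hume.ai/v0/evi/chat?api_key={api_key}"

    async with websockets.connect(url) as ws:
        # Send one user turn as text (a real integration could stream audio instead).
        await ws.send(json.dumps({"type": "user_input", "text": text}))

        # Read messages until the assistant signals the end of its reply.
        async for raw in ws:
            msg = json.loads(raw)
            if msg.get("type") == "assistant_message":
                # Hypothetical fields: the reply text plus inferred emotion scores.
                print("EVI:", msg.get("text"))
                print("Emotions:", msg.get("emotions"))
            elif msg.get("type") == "assistant_end":
                break


if __name__ == "__main__":
    asyncio.run(chat(api_key="YOUR_API_KEY", text="Good morning, EVI."))
```

In a production app you would stream microphone audio up and play the returned speech back to the user, but a text round trip like this is enough to see the conversational loop.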
Why does AI need emotions?

Hume CEO and principal researcher Alan Cowen says empathic AI is essential if we want AI to be used in ways that improve human well-being, or simply to make interacting with it feel more natural.
“The main limitation of current AI systems is that they are guided by superficial human evaluations and instructions, which are error-prone and limit AI’s ability to come up with new ways to make people happy,” he said. “It’s a failure to utilize its huge potential.”
Cowen and his team have built an AI that learns directly from proxies of human well-being. That data was used for training alongside the regular dataset that powers the multimodal AI model.
“We’re effectively teaching it to reconstruct human preferences from first principles and update that knowledge with every new person it talks to and every new application it’s incorporated into,” he explained.
What’s it like to talk to EVI?
✨ EVI has many unique empathic abilities:
1. It responds with a human-like tone of voice based on your expressions.
2. It responds to your expressions with language that addresses your needs and maximizes satisfaction.
3. EVI uses your tone of voice to know when to speak…
(March 27, 2024)
EVI is weird. It doesn’t sound human, and it’s not pretending to be human. In fact, it is clear that it is artificial intelligence. But its uncanny ability to understand emotions is fascinating.
If it weren’t for the delayed responses or mispronunciation of certain words, you might forget you’re talking to an AI. The conversation was more natural than any I’ve ever had with an artificially intelligent voicebot, but it was also creepier.
At one point, I asked the system whether it could tell from our conversation so far if I had eaten breakfast. It said my tone sounded “hungry and determined,” so I had most likely skipped it. Since the only breakfast I’d had was a strong coffee, it was 100% correct.
“If you need a virtual breakfast buddy, I’m always here to brighten up your morning routine,” it added. “I’ll have to pass on the actual coffee, though; I don’t want to short out these circuits.”
When this is combined with the inference speed of platforms like Groq and delivered through voice-only interfaces, such as Assistant replacements on Android, AI is going to be hard to spot.


