Left to right: founding engineers Michael Opara and Lauren Kim, founder and CEO Alan Cowen, and COO and chief scientist Janet Ho. Hume is building "emotionally intelligent" AI models.
Hume
Alan Cowen looks disappointed. "My dog died this morning," he said to an AI model from his startup, Hume. The model claims to detect more than 24 distinct emotional expressions in a person's voice, ranging from nostalgia to awkwardness to anxiety, and to respond accordingly.
"I'm so sorry to hear of your loss. Losing a pet is never easy," the AI replied, with sympathy and disappointment, in the voice of Hume's creative producer Matt Forte.
Cowen, a former Google researcher, founded Hume in 2021 to build "emotionally intelligent" conversational AI that can interpret emotions and generate appropriate responses based on how people speak. Since then, more than 1,000 developers and companies, including SoftBank and Lawyer.com, have used Hume's API to build AI-based applications that pick up and measure signals of the wide range of emotions embedded in human speech, through characteristics such as rhythm, tone and timbre, as well as sighs, "umms" and "ahhs."
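As a rough idea of what such an application might do with the API, the sketch below uploads a recorded call and reads back per-utterance expression scores. It is illustrative only: the endpoint URL, field names and response shape are placeholders invented for this example, not Hume's actual REST interface.

```python
# Hypothetical batch-measurement call: upload an audio file, get back
# expression scores per utterance. Endpoint and schema are assumptions.
import requests  # pip install requests

API_KEY = "YOUR_API_KEY"  # placeholder credential
MEASURE_URL = "https://api.example-expression.com/v0/batch"  # hypothetical endpoint

def measure_expressions(audio_path: str) -> list[dict]:
    """Upload an audio file and return per-utterance expression scores."""
    with open(audio_path, "rb") as f:
        resp = requests.post(
            MEASURE_URL,
            headers={"X-API-Key": API_KEY},
            files={"file": f},
            timeout=60,
        )
    resp.raise_for_status()
    # Assumed response shape:
    # {"utterances": [{"text": "...", "expressions": {"anxiety": 0.6, ...}}, ...]}
    return resp.json()["utterances"]

for utterance in measure_expressions("support_call.wav"):
    top = max(utterance["expressions"], key=utterance["expressions"].get)
    print(f'{utterance["text"][:40]!r} -> {top}')
```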
"The future of AI interfaces will be voice-based, as voice is four times faster and can convey twice as much information as typing," Cowen told Forbes. "But to take advantage of that, you really need a conversational interface that captures more than language."
On Wednesday, the New York-based startup announced it had raised a $50 million Series B funding round led by Swedish investment firm EQT Ventures, with participation from Union Square Ventures and angel investors Nat Friedman and Daniel Gross. The influx of new capital brings the startup's valuation to $219 million.
The company also announced the release of EVI, a conversational voice API built on its AI, which is meant to help developers integrate it into existing products and create apps that detect nuances in speech and text and adjust their words and tone to produce "emotionally attuned" output. For example, if the AI detects sadness or anxiety in a user's voice, it responds with hints of sympathy and "empathic pain" in its own verbal replies.
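For a developer, that kind of integration would look roughly like a streaming voice session: send an utterance, get back a reply along with the expressions the model heard. The sketch below is a minimal illustration only; the WebSocket URL, authentication query parameter and JSON message types are assumptions, not Hume's documented EVI schema.

```python
# Illustrative sketch: the URL, auth parameter, message types and response
# fields below are invented for demonstration, not Hume's published API.
import asyncio
import json
import websockets  # pip install websockets

API_KEY = "YOUR_API_KEY"  # placeholder credential
EVI_URL = f"wss://api.example-evi.com/v0/chat?api_key={API_KEY}"  # hypothetical endpoint

async def chat_turn(text: str) -> None:
    """Send one user utterance and print the assistant's reply and expressions."""
    async with websockets.connect(EVI_URL) as ws:
        await ws.send(json.dumps({"type": "user_message", "text": text}))
        while True:
            msg = json.loads(await ws.recv())
            if msg.get("type") == "assistant_message":
                # Assumed payload: reply text plus expression scores,
                # e.g. {"sadness": 0.71, "sympathy": 0.64}.
                print(msg.get("text"))
                print(msg.get("expressions", {}))
            elif msg.get("type") == "assistant_end":
                break

asyncio.run(chat_turn("My dog died this morning."))
```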
These empathic responses are not entirely new. When Forbes tested OpenAI's ChatGPT Plus with the same prompt ("My dog died this morning"), it produced a nearly identical verbal response to Hume's. But the startup aims to differentiate itself with its ability to identify the underlying expressions.
To that end, Hume's in-house large language and text-to-speech models are trained on data collected from more than 1 million participants in 30 countries, including millions of human interactions captured on video and self-reported data from participants who interacted with one another, Cowen said. The dataset's demographic diversity helps the models learn cultural differences and makes them "explicitly unbiased," he said. "Our data shows that less than 30% of people are white."
"The future of AI interfaces will be voice-based, as voice is four times faster and can convey twice as much information as typing."
Hume uses its in-house models to interpret emotional tone, but for more complex content it relies on external tools such as OpenAI's GPT-3.5, Anthropic's Claude 3 Haiku and Microsoft's Bing Web Search API, generating a response within 700 milliseconds. The 33-year-old CEO said Hume's technology is built to mimic the style and rhythm of human conversation: it can detect when a person interrupts the AI and stop talking, and it can tell when it's its turn to speak. It also sometimes pauses and even chuckles as it speaks, which is a little disconcerting when the audio is coming from a computer.
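The division of labor described here, in which an in-house model reads emotional tone while an external LLM drafts the reply, can be sketched as a simple pipeline. In the sketch below, the detect_expressions helper and the system-prompt wording are assumptions invented for illustration; the OpenAI call is a standard chat-completions request, not anything Hume has published about its own implementation.

```python
# Illustrative pipeline: score expressions with a stand-in local model,
# then pass that emotional context to an external LLM for the reply.
import time
from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def detect_expressions(user_text: str) -> dict[str, float]:
    """Stand-in for an in-house expression model (hypothetical scores)."""
    return {"sadness": 0.72, "distress": 0.41}

def respond(user_text: str) -> str:
    start = time.monotonic()
    expressions = detect_expressions(user_text)
    top = max(expressions, key=expressions.get)

    # Hand the detected emotional context to a general-purpose LLM,
    # analogous to the article's mention of GPT-3.5 for complex content.
    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system",
             "content": f"The user sounds {top}. Reply briefly and with empathy."},
            {"role": "user", "content": user_text},
        ],
    )
    reply = completion.choices[0].message.content

    # The article cites a ~700 ms target; a production system would stream
    # audio rather than wait for the full completion as this sketch does.
    print(f"latency: {(time.monotonic() - start) * 1000:.0f} ms")
    return reply

print(respond("My dog died this morning."))
```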
Although Hume's technology appears more sophisticated than earlier emotion-detection AI that relied on facial expressions, using AI to detect complex, multidimensional emotional expression across voice and text of all kinds remains an imperfect science, and one of the company's biggest challenges, as even Hume's own AI admits. Emotional expression is highly subjective and is shaped by factors such as gender and social and cultural norms. Research shows that even when AI is trained on diverse data, using it to interpret human facial expressions can yield biased results.
When asked about the hurdles AI would have to overcome to hold human-like conversations, Hume's model said it has difficulty responding to "emotions, context, and linguistic nuances." "Accurately interpreting tone, intent, and emotional cues in real time is a complex task."
Hume's AI isn't always accurate either. When Forbes tested it with questions such as "What should I have for lunch?", Hume's AI detected five different expressions, including "boredom," "interest" and "determination."
Cowen, who has published more than 30 research papers on AI and the science of emotion, advised Facebook in 2015 on how to change its recommendation algorithm to prioritize people's wellbeing. That, he said, was when he first realized the need for a tool that could detect and measure human facial expressions.
Hume’s AI is being integrated into applications in industries such as health and wellness, customer service and robotics, Cowen said. For example, the online lawyer directory Lawyer.com uses Hume’s AI to measure the quality of customer service calls and train agents.
In the healthcare and wellness space, use cases are at an earlier stage. Stephen Heissig, a researcher at the Icahn School of Medicine, the medical school of the New York-based Mount Sinai Health System, said he is conducting an experimental study that uses Hume's AI models to track the mental health of patients with conditions such as depression and borderline personality disorder who are undergoing deep brain stimulation, a treatment that involves implanting electrodes in the patient's brain. (The study only accepts patients for whom other treatments have not worked, he said.) Hume's model is used to detect how patients are feeling and whether their treatment is working day to day. Heissig said Hume's AI could give psychiatrists more information about emotions that are difficult to detect.
"Patients participating in the DBS study keep two video diaries a day. They have sessions with psychologists and psychiatrists, which we record, and we use Hume's model to characterize their facial expressions and speech prosody," Heissig told Forbes.
Hume's model is also integrated into Dot, a productivity chatbot that helps people plan and reflect on their day. Samantha Whitmore, co-founder of New Computer, the OpenAI-backed early-stage startup developing the chatbot, said Hume's AI provides "enhanced context" about people's emotions.
"If it detects a level of stress or frustration, it might say, 'Looks like you have a lot to do. Let's think about ways to make this more manageable,'" she said. "It helps us understand what kind of mental state they're in."