In a paper published Thursday in the journal Science, researchers at New York University found that an AI fed a small portion of one child’s fragmented experience could begin to discern order within the pixels: it learned that there was something called a crib, and it correctly matched words such as “stairs” and “puzzle” to their images.
The tool the researchers used is not an AI that learns the way children do. But the study shows that, even without built-in knowledge of grammar or other social skills, an AI can pick up some of the basic elements of language from the sensory input of a single child’s experience. It is part of a larger quest, a holy grail of cognitive science, to eventually build AI that mimics the minds of babies, which could help researchers understand our own development and lead to AI that can learn new skills in a more intuitive, humanlike way.
Chatbots, built on what are known as “large language models,” have demonstrated that AI trained on huge amounts of text can become a competent conversation partner with an impressive command of language. But many cognitive scientists argue that this verbal feat falls short of actual human thinking.
Babies are the opposite of chatbots: rather than rapidly digesting all the text in the world, they learn words by being in the world, through sensory input and play.
“We calculated that it would take a child 100,000 years of listening to spoken language to reach the number of words in a chatbot’s training set,” said Brenden Lake, a computational cognitive scientist at New York University who led the study. “So I was skeptical that these [chatbot] models would shed much light on human learning and development.”
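Lake’s figure is roughly what a back-of-the-envelope estimate gives. The rates below are illustrative assumptions, not numbers from the study: if a child hears on the order of $10^{7}$ words of speech per year and a chatbot’s training set contains on the order of $10^{12}$ words, then

$$\frac{10^{12}\ \text{words}}{10^{7}\ \text{words per year}} = 10^{5}\ \text{years} = 100{,}000\ \text{years}.$$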
How humans learn language has long puzzled linguists, philosophers, cognitive scientists and, increasingly, AI developers.
Scientists have spent years trying to understand how children’s minds take shape through carefully controlled experiments, many of them using toys and dolls, that let researchers probe different cognitive skills as they come online. Such studies have shown that 16-month-olds can use statistical inference to figure out whether a noisemaker is broken, and that babies as young as 5 months old know an object is still there even when they can’t see it, an important developmental milestone called object permanence.
Some babies have also been followed closely over long stretches of time. In 2005, Deb Roy, a scientist at the Massachusetts Institute of Technology, installed overhead cameras in every room of his home to record his son’s language development, generating a vast trove of data documenting how words were acquired and evolved. That study found that what predicted whether Roy’s son learned a word early wasn’t how many times the word was repeated, but whether it tended to be uttered in distinctive places in the house, at distinctive times, or in distinctive linguistic contexts.
The innovative use of head cameras has given researchers an even closer look into early childhood.
Since 2013, several families have contributed to the SAYCam database, a collection of audiovisual recordings from individual infants and toddlers spanning a critical period of cognitive development, from 6 to 32 months of age. The children, identified only by their first names, wear cameras attached to headbands for about two hours a week.
Scientists can apply for access to the data. This data provides a unique window into each child’s world over time and is intended to be a resource for researchers in a variety of fields.
Sam, whose full identity is not disclosed, is now 11 years old. But the record of his early life in Australia provided Lake and his colleagues with 600,000 video frames paired with 37,500 transcribed utterances, the training data for their AI project.
They trained a relatively simple neural network on data collected when Sam was between 6 months and 2 years old. The resulting AI learned to match images to basic nouns about as accurately as an AI trained on 400 million captioned images scraped from around the web.
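The article describes the network only in broad strokes. As a rough illustration of the general approach, an associative model of this kind can be trained contrastively, pulling matching frame-and-word pairs together in a shared embedding space and pushing mismatched pairs apart. The sketch below is a minimal, hypothetical version in PyTorch; the encoder choices, dimensions and training details are assumptions for illustration, not the authors’ actual implementation.

```python
# Minimal sketch of contrastive frame-word matching (not the study's actual code).
# Assumes batches of (video frame, word id) pairs like those drawn from SAYCam.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as tvm


class FrameWordMatcher(nn.Module):
    def __init__(self, vocab_size: int, embed_dim: int = 256):
        super().__init__()
        # Vision encoder: a small off-the-shelf CNN (an illustrative choice).
        backbone = tvm.resnet18(weights=None)
        backbone.fc = nn.Identity()          # keep the 512-d pooled features
        self.vision = backbone
        self.vision_proj = nn.Linear(512, embed_dim)
        # Word encoder: a plain embedding table over the transcript vocabulary.
        self.word_embed = nn.Embedding(vocab_size, embed_dim)
        self.logit_scale = nn.Parameter(torch.tensor(2.0))

    def forward(self, frames: torch.Tensor, word_ids: torch.Tensor) -> torch.Tensor:
        img = F.normalize(self.vision_proj(self.vision(frames)), dim=-1)
        txt = F.normalize(self.word_embed(word_ids), dim=-1)
        return self.logit_scale.exp() * img @ txt.t()   # batch similarity matrix


def contrastive_loss(logits: torch.Tensor) -> torch.Tensor:
    # Matching frame/word pairs sit on the diagonal of the similarity matrix;
    # the loss pulls them together and pushes mismatched pairs apart.
    targets = torch.arange(logits.size(0), device=logits.device)
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```

At test time, a frame would be scored against a set of candidate nouns, with the highest-similarity word taken as the model’s answer, which is roughly the matching task the article describes.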
The results add to, but do not resolve, a long-standing debate in science about the basic cognitive skills that need to be hardwired into the brain for humans to learn language.
There are various theories about how humans learn languages. Renowned linguist Noam Chomsky proposed the idea of innate language ability. Other experts believe that social or inductive reasoning skills are necessary for language to emerge.
The new research suggests that some language learning can occur even in the absence of specialized cognitive mechanisms: relatively simple associative learning (see the ball, hear “ball”) can teach an AI to match simple nouns to images.
“There’s nothing built into the network that gives the model any clues about the language or the structure of the language,” said study co-author Wai Keen Vong, a researcher at New York University.
The researchers said they do not have comparable data on how 2-year-olds perform on the tasks the AI faced, but the AI’s abilities fall short of a young child’s. For example, they were able to track where the AI focused when prompted with different words; while it was accurate for some words, such as “car” and “ball,” it looked in the wrong place when prompted with “cat.”
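The article doesn’t say how the researchers tracked where the AI “looked.” One common way to produce such a map, sketched below as a continuation of the hypothetical FrameWordMatcher above, is to score each spatial cell of the vision backbone’s feature map against the prompt word’s embedding and render the similarities as a heatmap over the frame; this procedure is an assumption for illustration, not the study’s actual analysis.

```python
# Hypothetical word-conditioned heatmap for the sketch model above.
import torch
import torch.nn.functional as F


@torch.no_grad()
def word_attention_map(model: "FrameWordMatcher", frame: torch.Tensor,
                       word_id: int) -> torch.Tensor:
    """Similarity between one word and each spatial cell of the frame's features."""
    x = frame.unsqueeze(0)                        # (1, 3, H, W)
    v = model.vision                              # resnet18 backbone from the sketch
    # Run the backbone up to its last conv block, keeping the spatial grid.
    x = v.maxpool(v.relu(v.bn1(v.conv1(x))))
    x = v.layer4(v.layer3(v.layer2(v.layer1(x))))                    # (1, 512, h, w)
    feats = model.vision_proj(x.flatten(2).transpose(1, 2))          # (1, h*w, d)
    feats = F.normalize(feats, dim=-1)
    word = F.normalize(model.word_embed(torch.tensor([word_id])), dim=-1)  # (1, d)
    sims = (feats @ word.t()).reshape(x.shape[2], x.shape[3])        # (h, w)
    # Upsample to the frame's resolution so the map can be overlaid on the image.
    return F.interpolate(sims[None, None], size=frame.shape[1:],
                         mode="bilinear", align_corners=False)[0, 0]
```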
“We want to find the minimal ingredients needed to build a model that can learn more like children do. This is a step,” Lake said.
The basics of language
The AI acquired its vocabulary of objects by being exposed to the equivalent of 1 percent of Sam’s waking hours: 61 hours of footage accumulated over a year and a half. What intrigued outside scientists about the work was both how far the AI got on that material and how far it still has to go to replicate human learning.
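Those two figures hang together under a rough assumption, not stated in the article, of about 11 waking hours a day:

$$\frac{61\ \text{hours}}{0.01} = 6{,}100\ \text{waking hours} \approx 11\ \tfrac{\text{hours}}{\text{day}} \times 550\ \text{days} \approx 1.5\ \text{years}.$$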
“Applying these techniques to this kind of data source, data from both the visual and auditory experience of a single child, is very important and new work,” said Joshua Tenenbaum, a professor of computational cognitive science at MIT.
“I would add that there are still some things that are difficult to conclude from this paper,” he said. “Exactly what it can tell us about how children actually learn words is not very clear.”
Michael Tomasello, a developmental and comparative psychologist at Duke University, said the AI model may reflect how dogs and parrots learn words. Experiments have shown that some dogs can learn more than 100 words for common objects and stuffed animals.
However, he noted, it remains unclear how such an AI could take in sensory input and pick up verbs, prepositions or social expressions.
“It may be able to learn that a recurring visual pattern is a ‘doll.’ But how does it learn that that same object is also a ‘toy’? How does it learn ‘this,’ ‘that’ and ‘thing’?” Tomasello wrote in an email.
He pointed out that an AI model trained on a child’s experience can identify things the child could see, but that this is only a small part of the language children learn by hearing. He proposed an alternative model in which, rather than simply associating images and sounds, an AI must infer the communicative intent behind an utterance in order to learn language.
Lake is starting to train AI models on video instead of still frames to see whether they can extend their vocabulary to verbs and abstract words. He is also collecting data from his own young daughter, so additional streams of data will be available soon.
But he acknowledged that the way the AI learned even simple words diverged from how children learn. For example, the AI was very good at learning to identify “sand” but struggled with “hand,” which is probably the opposite of how most children come to understand their environment.
“‘Sand’ was too easy, but ‘hand’ was too difficult,” Lake said. “And the models don’t know that milk and pears are delicious.”