Image credits: Yuichiro Chino/Getty Images
Remember the Chinese “spy” balloon from 2023? If not, here’s a refresher. About a year ago, a high-altitude balloon from China flew across U.S. airspace almost undetected. Once the balloon was discovered and shot down by the U.S. Air Force, curious civilians set about tracing its origins. That turned out to be difficult, until an AI company, Synthetaic, used satellite imagery to show it was possible.
Fortunately for Synthetaic, the balloon story provided a powerful product demonstration opportunity and attracted the attention of investors, including defense contractor Booz Allen Hamilton.
This week, Synthetaic raised $15 million in a Series B round co-led by Lupa Systems and TitletownTech, a venture capital firm formed through a partnership between the Green Bay Packers and Microsoft, with participation from IBM Ventures and the aforementioned Booz Allen Hamilton. The round brings Synthetaic’s total raised to $32.5 million. CEO Corey Jaskolski said the new cash will help accelerate the commercialization of the company’s computer vision technology and nearly double Synthetaic’s headcount to 80 people by the end of the year.
“The amount of image data being generated is increasing rapidly, underscoring the growing demand for advanced AI solutions to manage and analyze this vast amount of information,” Jaskolski told TechCrunch in an email interview. “We find that gaining insights from these vast amounts of data remains a major challenge and priority for many industries such as defense, geospatial, video security, and drone-based surveillance. Synthetaic’s AI solutions in unsupervised learning and data analysis position us to strategically navigate the evolving technology landscape.”
Jaskolski, an MIT graduate and former director of technology at National Geographic, is an adventurous type. He has scuba dived among icebergs in Antarctica, descended 12,500 feet below sea level to investigate the wreckage of the Titanic, led a helicopter-based effort to map the Nepalese side of Everest, and ventured deep into flooded caves to inventory Mayan human sacrifice victims and Ice Age bear skeletons.
So why did a death-defying globetrotter like Jaskolski found Synthetaic? It’s quite simple, he says. He observed that AI had the potential to help classify the world’s information but was hampered by the need to manually annotate data.
“Human labeling is the standard for AI training,” Jaskolski says. “As AI models get bigger, their performance improves, but they also require more and more data to train because they have more and more internal parameters to tune. For a long time, the industry’s solution to this problem was to train the AI by having literally millions of people draw boxes on top of things. But what if you didn’t need the human-labeled data?”
Founded in 2019, Synthetaic offers a tool called Rapid Automatic Image Categorization (RAIC for short), designed to automate the analysis of large, unlabeled data sets such as satellite images and videos.
Many AI models are trained by having a group of people (annotators) label data, allowing the model to learn to associate specific annotations (labels) with characteristics of the data. For example, a model fed a large number of pictures of cats annotated with each breed will eventually “learn” to differentiate between a bobtail and a shorthair.
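To make the labeling idea concrete, here is a minimal sketch of supervised training on human-labeled examples. It is an illustration only, not how any system mentioned in this article works: the two hand-made features (tail length and coat length), the sample values, and the function names `train_centroids` and `predict` are all hypothetical, and a simple nearest-centroid classifier stands in for a real image model.

```python
from collections import defaultdict
from math import dist


def train_centroids(examples):
    """Average the feature vectors that share a label, giving one centroid per label."""
    sums = {}
    counts = defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical annotated data: (tail length in cm, coat length in cm) plus a
# breed label supplied by a human annotator.
LABELED_CATS = [
    ((5.0, 2.0), "bobtail"),
    ((6.0, 2.5), "bobtail"),
    ((28.0, 1.5), "shorthair"),
    ((30.0, 1.0), "shorthair"),
]

centroids = train_centroids(LABELED_CATS)
print(predict(centroids, (7.0, 2.0)))
print(predict(centroids, (27.0, 1.2)))
```

Every row in `LABELED_CATS` needs a person to supply the breed, which is the manual step that becomes expensive as data sets grow.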
In contrast, RAIC uses synthetic data (data for which labels are automatically generated) to train the model.
In the case of the Chinese balloon, this allowed Synthetaic’s platform to discover the balloon using only a sketch of what the balloon would look like from space and recent satellite images from the area where the balloon was shot down.
“RAIC means you can process rare and complex datasets, accelerate AI development, and improve predictive modeling without being constrained by data quantity or quality,” Jaskolski said. “This positions RAIC as a strategic asset to drive innovation, operational efficiency, and competitive advantage, especially in use cases where data is the bottleneck to AI adoption and implementation.”
Synthetaic is not the only company considering the use of synthetic data in model training.
Synthesis AI, which raised $17 million in a venture round in April 2022, is developing a platform that generates synthetic data to train various types of AI systems. Two years ago, Scale AI launched a program that allows machine learning engineers to enrich existing real-world datasets with synthetic samples. Other companies, like Parallel Domain, are creating synthetic data for specific use cases such as self-driving cars.
Gartner predicts that 60% of the data used to develop AI and analytics projects will be synthetically generated by 2024. But as the industry moves forward, some experts worry that the shortcomings of synthetic data, and potential dangers, are being ignored.
In a January 2020 study, researchers at Arizona State University showed that an AI system trained on a dataset of images of professors was able to create highly realistic faces. However, most of the faces were white and male. The system amplified the biases in the original dataset, which unsurprisingly included predominantly male and white professors.
Synthetaic’s customers, for their part, seem to think the risk is worth taking.
The startup is collaborating with the U.S. Air Force to test AI-powered object detection in geospatial data, and it claims to have identified, working with the nonprofit environmental organization The Nature Conservancy, a bird species previously thought to be extinct. Synthetaic also has a contract with AFWERX, an Air Force research agency, to develop technology for object labeling, AI modeling, and object detection in satellite images.
Jaskolski believes that RAIC has applications in countless other domains, from AI prototyping to drone-based monitoring and content moderation. Noting that Synthetaic has teamed up with CNN to analyze war images from Gaza and partnered with Planet Labs to sell analytics based on Earth imagery data, he claims that Synthetaic’s business is resilient against downturns in the technology industry and broader macroeconomic headwinds.
“Synthetaic’s technology provides an innovative approach to training and creating AI models, addressing critical needs of technical decision makers,” Jaskolski said. “For executives, Synthetaic’s RAIC means they can process rare and complex datasets, accelerate AI development, and improve predictive modeling without being constrained by data quantity or quality. This positions RAIC as a strategic asset to drive innovation, operational efficiency, and competitive advantage, especially in use cases where data is the bottleneck to AI adoption and implementation.”