According to MinIO CEO and co-founder Anand Babu “AB” Periasamy, generative AI is a fundamental breakthrough with far-reaching implications for computing. But GenAI’s biggest impact is reminding businesses of their most important asset: their data, he said.
There’s no denying that GenAI has caused a stir over the past 14 months. From warnings of human extinction to predictions of a $7 trillion economic impact, GenAI has been in the spotlight, for better or worse.
Although some of the fanfare is clearly unwarranted (GenAI will not replace all workers with digital robots), the technology has also captured the imagination of the world’s leading engineers. Among those very impressed with what GenAI has shown so far is Periasamy, who co-founded the open source object storage company MinIO and before that created the Gluster distributed file system.
“GenAI is actually a real fundamental breakthrough,” Periasamy told Datanami in a recent interview. “I think this is the most important advance in all of computing. It will take two to three years to see a significant impact, but the impact will be huge.”
Many of the startups that have sprung up around GenAI are fueled by hype. But just as the dot-com boom and subsequent flameout created the fertile ground from which advanced web technologies eventually sprouted, today’s GenAI revolution will ultimately lead to a paradigm shift in how technology is used, he said.
“The breakthrough is real,” Periasamy said. “There’s going to be a lot of hype. There’s going to be a lot of startups going out of business within a couple of years. But just like we saw the real benefits of the dot-com era after the bubble burst, I think the same thing is going to happen here too.”
Create new value from data
Currently popular GenAI applications are mainly chatbots and copilots. As ChatGPT has shown, conversations with GenAI can last for hours or even days. GitHub Copilot, a popular GenAI offering that writes boilerplate code, is a welcome relief for developers tired of the same old routines.
But the biggest impact of GenAI will be unlocking the value locked away in data, Periasamy said.
“Every company has their own data, and they’re starting to realize that they can procure a software stack and tweak a data store (the data store on MinIO) to mine that data without having to hire data scientists or engineers,” he said. “All the data that we currently store in the object store will be available very quickly, which was not possible before.”
Only the biggest companies, with names like Anthropic and OpenAI, develop large language models (LLMs). Periasamy said a larger (but still relatively small) group of companies will take the next step and fine-tune existing LLMs on their own data.
But the real sweet spot for GenAI will be found in companies that connect internal data to open source LLMs using less sophisticated techniques such as prompt engineering or retrieval-augmented generation (RAG), he said.
“You can take these base models and run them without any training or fine-tuning or even hiring a single data scientist in your organization,” said Periasamy, who was named a Datanami Person to Watch in 2018. “Because once [your data] is vectorized, the model can now understand that knowledge and incorporate it on top of the underlying data. That becomes your organization’s expert.”
Getting started with GenAI requires very little technical skill. Periasamy says anyone who can write basic Python scripts can figure out how to connect data to an LLM using RAG techniques and prompt engineering. A key step is to vectorize your business data and make it accessible to the LLM. The most difficult part, he said, is creating the vector index.
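To give a sense of how small such a script can be, here is a minimal sketch of the RAG pattern described above: vectorize some documents, build an index, retrieve the closest match for a question, and fold it into a prompt. Everything in it is an illustrative assumption rather than anything Periasamy or MinIO described: the sample documents are made up, the embed() function is a toy word-count vector standing in for a real embedding model, the index is a plain in-memory list rather than a vector database, and the final call to an LLM is left out.

```python
import math
import re
from collections import Counter

# Hypothetical business documents; in practice these would be read
# from an object store rather than hard-coded.
DOCUMENTS = {
    "refund-policy.txt": "Customers may request a refund within 30 days of purchase.",
    "shipping.txt": "Standard shipping takes five business days within the country.",
    "support-hours.txt": "The support desk is open Monday to Friday from 9am to 5pm.",
}


def embed(text):
    """Toy embedding: a sparse word-count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents):
    """Vectorize every document once; this list plays the role of the vector index."""
    return [(name, text, embed(text)) for name, text in documents.items()]


def retrieve(index, question, k=1):
    """Return the k indexed documents most similar to the question."""
    query = embed(question)
    ranked = sorted(index, key=lambda entry: cosine(query, entry[2]), reverse=True)
    return ranked[:k]


def build_prompt(question, hits):
    """Prompt engineering step: place the retrieved text ahead of the question."""
    context = "\n".join(f"[{name}] {text}" for name, text, _ in hits)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


index = build_index(DOCUMENTS)
question = "How many days do I have to request a refund?"
hits = retrieve(index, question)
prompt = build_prompt(question, hits)
print(prompt)  # this string is what would be sent to the LLM
```

In a real deployment the word-count vectors would be replaced by embeddings from a model, and the list by a vector database, but the flow of index, retrieve, and prompt stays the same.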
Easing the bottleneck
Perhaps the biggest hurdle for GenAI over the past year has been getting a GPU. Production GenAI systems are processor-intensive, increasing demand for Nvidia’s high-end GPUs. Some large companies are stockpiling them, and finding them in the cloud can be difficult.
“The advantage of GPUs is that they have huge graphics memory, which is necessary to hold large models,” Periasamy says. “If you have a small model, you can run it on a CPU. But for a large model you need an H100 or A100 GPU.”
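A rough back-of-the-envelope calculation shows why memory is the constraint. The sketch below estimates the memory needed just to hold a model’s weights as parameter count times bytes per parameter. The model sizes (7 billion and 70 billion parameters) and the 80 GB per-GPU figure are assumptions chosen for illustration, and the estimate ignores activations, caches, and other runtime overhead, so real requirements are higher.

```python
import math

# Bytes needed to store one parameter at common numeric precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}

# Assumed memory per GPU, in GB (an 80 GB card; other capacities exist).
GPU_MEMORY_GB = 80


def weight_memory_gb(num_params, dtype):
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def gpus_needed(num_params, dtype, gpu_memory_gb=GPU_MEMORY_GB):
    """Minimum number of GPUs whose combined memory can hold the weights."""
    return math.ceil(weight_memory_gb(num_params, dtype) / gpu_memory_gb)


# Hypothetical model sizes, for illustration only.
for label, params in [("7B", 7e9), ("70B", 70e9)]:
    need = weight_memory_gb(params, "fp16")
    count = gpus_needed(params, "fp16")
    print(f"{label} model at fp16: {need:.0f} GB of weights, at least {count} GPU(s)")
```

Under these assumptions, the smaller model’s weights fit comfortably on one card, while the larger model’s weights alone exceed a single 80 GB GPU.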
The good news is that the GPU bottleneck is starting to ease, Periasamy said. As Intel and AMD succeed in bringing mid-range GPUs to market in volume, he said, Nvidia will come under pressure to lower prices, easing the overall market.
When that finally happens (Periasamy estimates the GPU squeeze will start to ease later this year), the race will be on to see which companies can make the most of all the unstructured data they’ve shoved into object stores over the years.
“The game will be about who has the most valuable data and how to use it. This is where companies will get a big boost,” Periasamy said. “All the data now stored in the object store is available very quickly.”
MinIO is already playing a central role in all of this on several levels. As an S3-compatible object storage system that can store hundreds of petabytes in the cloud or on-premises, MinIO already stores large amounts of unstructured data that will eventually be run through LLMs. It is also used to store vector embeddings for vector databases such as Milvus.
Periasamy is not one to add new features to MinIO, which directly reflects the minimalist approach of the object store. “We are an anti-roadmap company,” he said. “If you ask me to remove a feature, I’m happy to remove it. For me to add a new feature, you have to convince me why MinIO is incomplete without it.”
Nevertheless, new capabilities are being developed to accommodate GenAI. Although the details are still vague, it is likely that MinIO will receive add-ons that allow it to perform functions that facilitate GenAI.
When Periasamy founded MinIO in 2014, he said his intention was to “solve storage” for unstructured data. But solving storage was just the first step in his plan to tackle bigger problems and provide bigger solutions, including enabling deep learning and AI on large amounts of unstructured data. Given the breakthroughs now being seen with GenAI and the adoption of MinIO for unstructured data, things appear to be progressing exactly according to Periasamy’s original plan.
Related items:
Are databases becoming just query engines for big object stores?
MinIO, now worth $1 billion, but still hungry for data
For Minio CEO Periasamy, storage issues are just the beginning