How to leverage advances in AI to query data locally using natural language on your own PC.
Over the past few months, I've written numerous times about advances in AI and how easy it has become to use. With the emergence of really good, fine-tuned local models, more and more becomes possible every day. Here I'd like to walk through a very practical, achievable example: querying your own data (CSV/Excel/SQL) with a 100% local system. Whether you're running Mac, Linux, or Windows, you'll likely get decent performance with 8GB of RAM.
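To make "querying your own data" concrete, here is a minimal sketch of the data side of such a system: loading a CSV into an in-memory SQLite database so plain SQL can run against it. The table name, columns, and sample rows are all hypothetical, and a real app would read the CSV from disk rather than a string.

```python
# Minimal sketch: load CSV text into an in-memory SQLite database so it
# can be queried with SQL. Table and column names here are made up.
import csv
import io
import sqlite3

SAMPLE_CSV = """name,department,salary
Alice,Engineering,95000
Bob,Sales,60000
Carol,Engineering,105000
"""

def csv_to_sqlite(csv_text: str, table: str = "employees") -> sqlite3.Connection:
    """Create an in-memory SQLite table from CSV text (all columns TEXT)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    conn = sqlite3.connect(":memory:")
    cols = ", ".join(f'"{c}"' for c in header)
    conn.execute(f'CREATE TABLE "{table}" ({cols})')
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', data)
    return conn

conn = csv_to_sqlite(SAMPLE_CSV)
result = conn.execute(
    "SELECT name FROM employees WHERE department = 'Engineering'"
).fetchall()
# → [('Alice',), ('Carol',)]
```

In the full system, the SQL string in that last query is exactly what the model will write for you.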
In early February 2024, Defog released SQLCoder-7b-2, a large language model fine-tuned for writing SQL queries. The "7b" stands for seven billion parameters, which makes it small enough to run on consumer hardware. There's no need to invest in extremely expensive GPUs or hundreds of gigabytes of RAM.
On top of that, software like LM Studio makes it easy to run local models and communicate with them on your own computer.
Once you have the model and the ability to run it, the final piece of the puzzle is an interface. Enter Streamlit: an open-source Python framework for machine learning and data science that works very well with generative AI for building interfaces to query data. I'll provide the code so you can play with it yourself.
So the three pieces of this puzzle are:
- SQLCoder-7b-2 (also available in 15b and 70b variants, if you have the hardware)
- LM Studio
- Streamlit
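A quick sketch of how the first two pieces connect: LM Studio can serve a loaded model over a local OpenAI-compatible HTTP API (by default at `http://localhost:1234/v1`), so the app just posts a prompt and reads back the generated SQL. The prompt template below is a simplified version of the style shown on Defog's model card, not the exact official format, and the question/schema are made up. The actual network call is defined but not invoked here.

```python
# Sketch: build a SQLCoder-style prompt and (optionally) send it to
# LM Studio's local OpenAI-compatible server. The template is a
# simplified approximation; see the model card for the exact format.
import json
import urllib.request

def build_prompt(question: str, schema: str) -> str:
    """Assemble a prompt asking the model to write SQL for `question`."""
    return (
        "### Task\n"
        f"Generate a SQL query to answer the question: {question}\n\n"
        "### Database Schema\n"
        f"{schema}\n\n"
        "### SQL\n"
    )

def generate_sql(question: str, schema: str,
                 url: str = "http://localhost:1234/v1/completions") -> str:
    """Call the local LM Studio server and return the generated SQL text."""
    payload = json.dumps({
        "prompt": build_prompt(question, schema),
        "max_tokens": 256,
        "temperature": 0.0,  # deterministic output suits SQL generation
    }).encode()
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["text"].strip()

schema = "CREATE TABLE employees (name TEXT, department TEXT, salary REAL);"
prompt = build_prompt("Which employees work in Engineering?", schema)
# generate_sql(...) would hit the local server; only the prompt is built here.
```

Streamlit then only needs a text box for the question and a call to something like `generate_sql` behind it.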
Important note:
SQLCoder-2 is purpose-built for writing SQL queries; it is not built for back-and-forth conversation. In practice, you ask a question, the model generates a query, and the app runs that query directly against your data and shows you the results.
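That "generate, then execute" loop is the heart of the app. Here is a sketch of the execute half, where a hardcoded query stands in for the model's output (SQLCoder-2 returns SQL text, not answers); the table and data are hypothetical. In a real app you would also want to guard against the model producing anything other than a read-only `SELECT`.

```python
# Sketch of executing model-generated SQL against local data.
# The `generated_sql` string below stands in for the model's output.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 120.0), ("South", 80.0), ("North", 50.0)],
)

# In the real app, this string comes back from SQLCoder via LM Studio.
generated_sql = "SELECT region, SUM(amount) FROM sales GROUP BY region"

rows = conn.execute(generated_sql).fetchall()
```

From here, handing `rows` to Streamlit (e.g. `st.dataframe`) is all that's left to display the answer.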
Start by getting the model up and running in LM Studio.
Loading the model
LM Studio is easy to install and use, but optimization can be a bit tricky, since it depends heavily on your hardware and what it can and cannot handle. The same goes for the alternatives. It takes a little trial and error, but in the end…