Rise of generative AI heightens need for data quality

By 5gantennas.org | January 26, 2024

Data quality has always been critical to analytics.

If the data used to inform decisions isn’t good, neither will be the actions taken based on those decisions. But data quality has assumed even greater importance over the past year as generative AI becomes part of the decision-making process.

Large language models (LLMs) such as ChatGPT and Google Bard frequently answer questions with incorrect responses known as AI hallucinations. To reduce the frequency of hallucinations — which can also include misleading and offensive information — in responses to queries about their business, enterprises have done one of two things.

They’ve either attempted to develop language models trained exclusively with their own data or imported public LLMs into a secure environment where they can add proprietary data to retrain those LLMs.

In both instances, data quality is crucial.
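
To make the stakes concrete, here is a minimal Python sketch of the kind of quality gate an enterprise might place in front of either approach, so that only complete, fresh, non-duplicate records reach the model. The field names, thresholds and functions are illustrative assumptions, not any vendor's actual pipeline.

```python
from datetime import datetime, timedelta

# Hypothetical quality gate applied before proprietary records are used to
# retrain or ground an LLM. Field names and thresholds are illustrative only.
REQUIRED_FIELDS = ["record_id", "region", "amount", "updated_at"]
MAX_AGE = timedelta(days=90)

def passes_quality_gate(record: dict, seen_ids: set) -> bool:
    """Return True if a record is complete, fresh, and not a duplicate."""
    if any(record.get(field) is None for field in REQUIRED_FIELDS):
        return False  # completeness check
    if record["record_id"] in seen_ids:
        return False  # duplicate check
    if datetime.now() - record["updated_at"] > MAX_AGE:
        return False  # freshness check
    return True

def build_corpus(records: list[dict]) -> list[dict]:
    """Keep only records fit to feed a retraining or retrieval step."""
    corpus, seen_ids = [], set()
    for record in records:
        if passes_quality_gate(record, seen_ids):
            corpus.append(record)
            seen_ids.add(record["record_id"])
    return corpus
```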

In response to the rising emphasis on data quality, data observability specialist Monte Carlo on Jan. 23 hosted Data Quality Day, a virtual event featuring panel discussions on how organizations can best guarantee that the data used to inform data products and AI models can be trusted.

“I’ve seen data quality come up as a problem over and over again,” said Chad Sanderson, co-founder and CEO of data community vendor Gable, during the streamed event.

Similarly, Barr Moses, co-founder and CEO of Monte Carlo, said data quality remains a problem for many organizations. She noted that one of the reasons she helped start Monte Carlo was that in her previous roles at customer success management platform vendor Gainsight — as senior director of business operations and technical success, and vice president of customer success operations — she and her teams often had to deal with bad data.

“As the leader of a data team, I had various challenges,” Moses said. “But maybe the main challenge was that the data was often wrong. We had one job, which was to get the data right, but it was inaccurate a lot of the time.”

Ultimately, improving data quality and ensuring that data can be trusted to inform analytics and AI applications comes down to a combination of technology and organizational processes, according to the panelists.

Importance of data quality

One of the promises of generative AI for businesses is to make data exploration and analysis available to more than just a small group of data experts within organizations.

For decades, analytics use within the enterprise has been stuck at around a quarter of all employees. The main reason for the lack of expansion is that analytics platforms are complex. In particular, code is required to carry out most queries and analyses.

In recent years, many vendors have developed natural language processing (NLP) and low-code/no-code tools in an attempt to reduce the complexity of their platforms, but those tools have largely failed to markedly expand analytics use.

The NLP tools had small vocabularies and thus required users to know and use highly specific phrasing, while low-code/no-code tools enabled only cursory data exploration and analysis. To do in-depth analysis, significant training in the use of the platforms and data literacy were still required.

LLMs, however, have the vocabularies of the most expansive dictionaries and are trained to understand intent. Therefore, they substantially reduce some of the barriers that have held back expanded use of analytics. But to enable business users to ask questions of their data with LLMs, the LLMs need to be trained with an enterprise’s proprietary data.

Ask an LLM such as ChatGPT to write a song, and it can do that. Ask it to summarize a book, and it can do that. Ask it to generate code to build a data pipeline, and it can even do that.

But ask it what sales figures were in Nebraska during the past five winters, and either it will be unable to come up with a response or it could make up an answer. Ask it to then write a report based on those sales figures, and it will fail at that too.

Because it doesn’t have the company’s proprietary data.

Organizations need to train LLMs with their own data to answer questions relevant to their business. And that data needs to be good data or else the LLMs will either be unresponsive or deliver incorrect responses that might seem plausible enough to fool a user and lead to a bad decision.
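
As a rough illustration of why that matters, the sketch below grounds a business question in rows retrieved from the company's own store before the model sees it. Here `call_llm` is a hypothetical placeholder for whatever model endpoint an organization actually uses, and the schema is invented for the example.

```python
# Hypothetical retrieval-grounded prompt: the model only sees figures pulled
# from the company's own store. call_llm stands in for any model endpoint.
def retrieve_sales(store: list[dict], region: str, seasons: list[str]) -> list[dict]:
    """Pull the proprietary rows relevant to the question."""
    return [row for row in store if row["region"] == region and row["season"] in seasons]

def answer_with_context(question: str, rows: list[dict], call_llm) -> str:
    """Ground the model's answer in retrieved enterprise data."""
    context = "\n".join(
        f"{row['season']} {row['region']}: {row['amount']}" for row in rows
    )
    prompt = (
        "Answer using only the figures below. If they are insufficient, say so.\n"
        f"Figures:\n{context}\n\n"
        f"Question: {question}"
    )
    return call_llm(prompt)
```

If the retrieved rows are wrong, the grounded answer will be confidently wrong — which is the point Moses makes next.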

“The key to making generative AI great is by introducing proprietary enterprise data into generative AI pipelines,” Moses said. “If that is not accurate or not reliable, then all of [an organization’s] generative AI efforts will be moot.”

Those generative AI pipelines, meanwhile, need huge amounts of data pulled into models from databases, data warehouses, data lakes and data lakehouses. They need to be able to wrangle relevant structured and unstructured data and combine those disparate data types to give organizations as complete a view of their operations as possible. And they need to be able to do so quickly so that decisions can be made in real time when needed.

Monitoring potentially billions of data points for quality is far more than even a team of humans can manage. Technology, therefore, is now an integral part of ensuring data quality.

[Image: Panelists Shane Murray, Barr Moses and Chad Sanderson discuss data quality during Data Quality Day, a virtual event hosted by data observability specialist Monte Carlo.]

Technology

One means of addressing data quality is with technology that automatically monitors data for accuracy and alerts data stewards about anomalies.

In particular, data observability platforms are designed to give data engineers and other stewards a view of their data as it moves from ingestion through the data pipeline to the point when it can be operationalized to inform decisions.
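
A stripped-down sketch of the kind of automated check such a platform might run on each table is shown below; the metadata fields and thresholds are assumptions made for illustration, not a description of any specific product.

```python
from datetime import datetime, timedelta

# Illustrative checks an observability tool might run as data moves through
# the pipeline; metadata fields and thresholds are hypothetical.
def check_table(metadata: dict, alerts: list[str]) -> None:
    """Flag stale tables and sudden volume drops for the data steward."""
    name = metadata["table"]
    if datetime.now() - metadata["last_loaded_at"] > timedelta(hours=6):
        alerts.append(f"{name}: no new data loaded in the last 6 hours")
    expected, actual = metadata["expected_row_count"], metadata["row_count"]
    if expected and actual < 0.5 * expected:
        alerts.append(f"{name}: row count {actual} is under half the expected {expected}")
```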

Data observability was a simple process when all data was kept on premises and stored in localized databases. The cloud, however, changed that. Most organizations now store at least some of their data in cloud-based warehouses, lakes and lakehouses. Most also store that cloud-based data in more than one repository, often managed by more than one vendor.

That makes data complicated to track and monitor. So does the explosion in data types with text, video, audio, IoT sensors and others all producing data that can now be captured and stored.

In response to the growing complexity of data, observability specialists including Monte Carlo, Acceldata and Datadog emerged with platforms that automatically test and monitor data throughout its lifetime.

Moses noted that humans can address known problems and develop tests to solve those specific problems. But they can’t address what they aren’t aware of.

“That approach has poor coverage,” she said.

Data quality monitoring tools, meanwhile, have the opposite problem, Moses continued. They automate data monitoring so that all of an organization’s data can be overseen, but they lack intuition and tend to flood data stewards with push notifications.

Data observability blends testing and automated monitoring to allow for full coverage without overburdening data teams with alerts every time they sense the slightest anomaly.

“Data observability uses machine learning to solve for being too specific and too broad,” Moses said. “It also introduces context so users can learn things like lineage, root cause analysis and impact analysis — things that make data engineers more effective in their thinking about data quality.”
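
A crude stand-in for that idea is sketched below: rather than a fixed threshold for every table, it alerts only when a metric drifts far outside its own history. Real observability platforms learn much richer baselines; this is only a sketch with invented numbers.

```python
from statistics import mean, stdev

def is_anomalous(history: list[float], today: float, z_threshold: float = 3.0) -> bool:
    """Alert only when today's metric deviates sharply from its historical norm."""
    if len(history) < 7:
        return False  # not enough history to judge; stay quiet rather than spam alerts
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return today != mu
    return abs(today - mu) / sigma > z_threshold

# Invented example: daily row counts that hover near 10,000, then suddenly drop.
row_counts = [10_120, 9_980, 10_050, 10_200, 9_940, 10_075, 10_110]
print(is_anomalous(row_counts, today=4_300))   # True: worth an alert
print(is_anomalous(row_counts, today=10_060))  # False: normal variation, stay silent
```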

Another technology that has the potential to improve data quality is generative AI itself, according to Sanderson.

He noted that some of the things generative AI does well are inferring context from code and understanding the intent of software, which enables generative AI to discover problems that might otherwise be overlooked, in the same way data observability does.

“I’ve been an infrastructure person my entire life, so I was very skeptical of generative AI as a weird AI thing that was going to come and go,” Sanderson said. “I think it’s really going to play a big role in data quality and governance over the next five to 10 years.”

One more technology that could play an important role in improving data quality is testing tools from vendors such as dbt Labs, according to Dana Neufeld, data product manager at Fundbox, a small-business loan platform vendor. She noted that such tools enable engineers to run tests on their data during the development process so that data quality issues can be addressed before pipelines and applications are put into production.

“It’s testing within DBT by developers before they release their code,” Neufeld said. “It is way easier to spot data quality issues within the code before it gets released.”
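
The sketch below shows the same idea in plain Python: assertions run against a development build of a model before the code ships. dbt expresses such tests declaratively in its own configuration; this is an analogy, not dbt syntax, and `dev_rows` is an invented example.

```python
# Plain-Python stand-ins for development-time data tests; dev_rows represents
# the output of a model built in a development environment.
def test_not_null(rows: list[dict], column: str) -> None:
    assert all(row[column] is not None for row in rows), f"{column} contains nulls"

def test_unique(rows: list[dict], column: str) -> None:
    values = [row[column] for row in rows]
    assert len(values) == len(set(values)), f"{column} contains duplicates"

def test_accepted_values(rows: list[dict], column: str, allowed: set) -> None:
    assert all(row[column] in allowed for row in rows), f"{column} has unexpected values"

# Run before release so quality issues surface in development, not production.
dev_rows = [{"order_id": 1, "status": "shipped"}, {"order_id": 2, "status": "open"}]
test_not_null(dev_rows, "order_id")
test_unique(dev_rows, "order_id")
test_accepted_values(dev_rows, "status", {"open", "shipped", "cancelled"})
```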

People and processes

Beyond technology, organizational processes developed by people and carried out by people need to be part of addressing data quality, according to Moses.

Buy-in from top executives is needed for any organizational undertaking to have a chance at success. But assuming C-suite executives understand the importance of data quality, a hierarchical structure that lays out who is responsible for data — and accountable for ensuring data quality — below the top level is also important, Moses said.

Some enterprises still have centralized data teams that oversee every aspect of data operations and parse data out only upon request.

Others, however, have adopted decentralized data management strategies such as data mesh and data fabric, and have empowered business users with self-service analytics platforms. Such strategies and tools make their organizations more flexible, and able to act and react faster than those with centralized data management. But they also allow more people to work with and potentially alter data.

Such organizations need definitive data policies and hierarchies to decrease the risk of lowering data quality.

“Five or 10 years ago, there were maybe one or two people responsible for data,” Moses said. “There was a long lag time to make sure data was accurate, and it was used by a small number of people. The world we live in today is vastly different from that.”

Now, a lot more people are involved, she continued. Even within data teams, there are data engineers, data analysts, machine learning engineers, data stewards, governance experts and other specialists.

“It’s important to ask who owns data quality, because when you don’t have a single owner, it’s really hard to determine accountability,” Moses said. “When the data is wrong, everyone starts finger-pointing. That’s not great for culture, which is toxic, and it also doesn’t lead to a solution. It’s actually valuable to identify an owner.”

In a sense, that data ownership is part of a communication process, which is another key element of managing data quality, according to Sanderson.

He noted that data is essentially a supply chain that includes producers, consumers, distributors and brokers, with people at each stage of the pipeline playing different roles. Communication between those people is crucial so that everyone understands how data needs to be treated to maintain its quality.

“Communication is more of a necessity of the system functioning than a nice-to-have,” Sanderson said. “There are a lot of processes that teams can start following to create better communication.”

One is knowing the steps of the supply chain so that data lineage is understood. Another is recognizing which data is most important.

Sanderson said organizations typically use only about 25% of their data. Beyond that, perhaps about 5% of the organization’s data is the most informative and valuable. Communicating what that 5% is, and the importance of keeping its quality, is therefore significant.

“My recommendation has been identifying tier 1 data … where if there is some quality issue, it has a tangible financial impact,” Sanderson said. “If they can solve those issues, then it’s easy to get leadership and upstream teams taking accountability.”

Ultimately, with so much data and so many people handling data, there’s no single way to fully guarantee data quality, according to Moses.

However, applying appropriate technologies with the right organizational processes can greatly improve an enterprise’s chances of maintaining data quality and delivering trustworthy data products.

“There’s no magic wand solution,” Moses said. “I don’t think I’ve seen an organization, large or small, that you can say, ‘This is done perfect.’ But there are things that are important.”

Eric Avidon is a senior news writer for TechTarget Editorial and a journalist with more than 25 years of experience. He covers analytics and data management.


