Call it a conspiracy theory or a prediction of the future: an internet dominated by AI-generated content is already taking shape, and it’s not a good one.
Since the launch of ChatGPT, AI-generated content has been steadily permeating the internet. Artificial intelligence has been around for decades, but the consumer-facing ChatGPT pushed AI into the mainstream, creating unprecedented access to advanced AI models and a demand that businesses are eager to capitalize on.
As a result, businesses and users alike are using generative AI to create massive amounts of content. The immediate concern is a flood of content containing inaccuracies, gibberish, and misinformation; the long-term effect is that web content will deteriorate into completely useless garbage.
Garbage in, garbage out
If you are thinking that there is already a lot of useless garbage on the internet, that’s true, but this is different. “There’s a lot of trash out there… but the variety and diversity is incredible,” said Nader Henein, VP analyst at management consulting firm Gartner. When LLMs feed on each other’s content, the output becomes lower in quality and vaguer, like a copy of a copy of an image.
Think of it this way. The first version of ChatGPT was the last model to be trained entirely on human-generated content. All subsequent models are trained on data that includes AI-generated content, which is difficult to verify or even track. This makes the data unreliable and, frankly, garbage. When that happens, “the quality and accuracy of the content is lost, and you lose diversity. Everything starts to look the same,” said Henein, who studies data protection and artificial intelligence.
“Incestuous learning” is what Henein calls it. “LLMs are just one big family, consuming and cross-pollinating each other’s content, and with each passing generation there will be more trash, until the trash overtakes the quality content and things start to get worse.”
As more AI-generated content is pushed onto the web, and as that content comes from LLMs that were themselves trained on AI-generated content, we are looking at a future web that is completely homogeneous and completely untrustworthy. Also, it’s really boring.
The model collapses, the internet collapses
Most people can already sense that something is wrong.
In the more high-profile examples, art is being replicated by robots. Books are being swallowed whole and reproduced by LLMs without their authors’ permission. Images and videos using celebrities’ voices or likenesses are created without their consent or compensation.
However, existing copyright and intellectual property laws are already in place to protect against such violations. Additionally, some are embracing AI collaboration, like Grimes, who is offering revenue-sharing deals to AI music creators, and record labels pursuing license agreements with AI technology companies. On the policy front, lawmakers have proposed an anti-counterfeiting law to protect public figures from AI replicas. Regulations that would solve all of these problems are not yet in place, but they are at least conceivable.
But the decline in the overall quality of everything online is a more insidious phenomenon, and researchers have demonstrated why it’s about to get worse.
In one study, researchers at Germany’s Johannes Gutenberg University found that “this self-consuming training loop initially improves both quality and diversity,” which is consistent with what is likely to happen next. “However, after a few generations the output inevitably degenerates in diversity. We find that the rate of degeneration depends on the proportion of real and generated data.”
Two other academic papers published in 2023 reached the same conclusion about the degradation of AI models trained on synthetic, that is, AI-generated, data. A study by researchers from Oxford, Cambridge, Imperial College London, the University of Toronto, and the University of Edinburgh found that the “use of model-generated content in training causes irreversible defects in the resulting models, where tails of the original content distribution disappear.” This is called “model collapse.”
Similarly, researchers at Stanford University and Rice University found that “without enough fresh real data in each generation of an autophagous [self-consuming] loop, future generative models are doomed to have their quality (precision) or diversity (recall) progressively decrease.”
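To make the mechanism concrete, here is a toy sketch in Python. It is not the method of any of the studies cited above: it assumes a one-dimensional Gaussian as a stand-in for a generative model, and the function name simulate_self_training and its parameters are hypothetical. Each “generation” fits a mean and standard deviation to its training data, then builds the next training set by sampling from that fit, optionally mixed with a share of fresh real data.

```python
import random
import statistics


def simulate_self_training(generations, sample_size, real_fraction, seed=0):
    """Toy self-consuming loop (illustrative assumption, not a cited method).

    Returns the standard deviation fitted at each generation.
    """
    rng = random.Random(seed)
    n_real = round(sample_size * real_fraction)
    n_synthetic = sample_size - n_real

    # Generation 0 trains only on "human" data: samples from N(0, 1).
    data = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
    history = []
    for _ in range(generations):
        # "Training": estimate the distribution from the current data.
        mu = statistics.fmean(data)
        sigma = statistics.stdev(data)
        history.append(sigma)
        # Next training set: fresh real samples plus the model's own output.
        real = [rng.gauss(0.0, 1.0) for _ in range(n_real)]
        synthetic = [rng.gauss(mu, sigma) for _ in range(n_synthetic)]
        data = real + synthetic
    return history


if __name__ == "__main__":
    synthetic_only = simulate_self_training(2000, 20, real_fraction=0.0)
    with_real = simulate_self_training(2000, 20, real_fraction=0.5)
    print("final spread, synthetic data only:", synthetic_only[-1])
    print("final spread, half fresh real data:", with_real[-1])
```

In this sketch the spread of the synthetic-only loop should shrink toward zero over many generations, a toy analogue of the tails of the distribution disappearing, while the loop that keeps receiving fresh real data should stay close to the original spread.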
The loss of diversity is the fundamental problem, Henein explained, because as AI models try to replace human creativity, they move further and further away from it.
Overview of the AI-generated Internet
While model collapse looms, the AI-generated internet is already here.
Amazon has a new feature that provides AI-generated summaries of product reviews. Tools from Google and Microsoft use AI to help draft emails and documents, and in September Indeed launched a tool that lets recruiters create AI-generated job descriptions. Platforms like DALL-E 3 and Midjourney allow users to create AI-generated images and share them on the web.
Whether companies publish AI-generated content directly, as Amazon does, or offer services that let users publish it themselves, as Google, Microsoft, Indeed, OpenAI, and Midjourney do, it is already out there.
And those are just the tools and features from big tech companies that claim to exercise some sort of oversight. The real culprits are clickbait sites that pump out tons of low-quality, regurgitated content in order to win high search rankings and revenue.
A recent report from 404 Media uncovered a number of sites that use AI “to rapidly churn out content” ripping off other media outlets. For a sample of this type of content, which avoids plagiarism at the expense of coherence, check out the questionable news site Worldtimetodays.com. There, the first line of a 2023 article about Gina Carano’s firing from Star Wars reads, “It’s been some time since Gina Carano launched a fierce attack on Lucasfilm after she was fired star warSo, for better or worse, we were doomed to that fate.”
Obviously, this text was generated by an AI.
Credit: Worldtimetodays.com
Google Scholar users discovered a cache of academic papers containing the phrase “As an AI language model.” This means that parts of those papers (or entire papers, for all anyone knows) were written by a chatbot such as ChatGPT. AI-generated research papers, which are assumed to carry some kind of academic credibility, can then end up cited on news sites or blogs as authoritative references.
Google searches may also show AI-generated portraits of celebrities instead of press photos or movie stills. In a Google search for the late musician Israel Kamakawiwoʻole, known for his ukulele cover of “Somewhere Over the Rainbow,” the top result is an AI prediction of what Kamakawiwoʻole would look like if he were alive today.
A Google image search for Keira Knightley returns, alongside actual photos of the actress, distorted renderings that users have uploaded to OpenArt, Playground AI, and Dopamine Girl.
Keira doesn’t have that qualification.
Credit: Mashable
That is not to mention the recent pornographic deepfakes of Taylor Swift, an Instagram ad using Tom Hanks’ likeness to sell a dental plan, a photo editing app using Scarlett Johansson’s face and voice without her consent, and that fire song by Drake and The Weeknd that was actually an unauthorized audio deepfake that sounded exactly like them.
If our search engine results are already unreliable and our models are almost certainly feeding on this junk, then we have crossed the threshold into the AI garbage age of the web. For now, the web we once knew is still somewhat recognizable, but the warnings are no longer abstract.
The internet hasn’t completely disappeared
Assuming that products like ChatGPT don’t pull off a miracle and start reliably producing vibrant, exciting content that humans actually find enjoyable or useful, what happens next?
Expect communities and organizations to fight back by protecting their content from AI models that attempt to scrape it. The open, ad-supported, search-based web may be on its way out, but the internet will evolve. Expect more reputable media sites to keep their content behind paywalls, and reliable information to come from subscriber newsletters.
Expect more disputes over copyright and licensing, like The New York Times’ lawsuit against Microsoft and OpenAI. Expect more tools like Nightshade, an invisible tool that protects copyrighted images by trying to corrupt the models trained on them. Expect the development of sophisticated new watermarking and verification tools that prevent AI scraping.
Conversely, you can also expect other news publications, such as the Associated Press, and probably CNN, Fox, and Time, to embrace generative AI and sign licensing agreements with companies like OpenAI.
As tools like ChatGPT and Google SGE replace traditional search, expect the revenue model built on SEO to change.
But the silver lining of model collapse is the loss of demand. Currently, the adoption of generative AI is driven by hype, and demand will dry up once models trained on low-quality content become useless. All that would remain (hopefully) is us weak humans with an irresistible urge to rant, overshare, inform, and otherwise express ourselves online.