Google’s VLOGGER AI model can generate video avatars from images. What could go wrong?

By 5gantennas.org | March 23, 2024

VLOGGER takes a single photo of a person and creates high-fidelity clips of varying lengths, with precise facial expressions and body movements down to the blink of an eye, going beyond previous types of “talking head” software.

Google

The artificial intelligence (AI) community has become very good at creating fake videos; take a look at OpenAI’s Sora, introduced last month, with its slick imaginary fly-throughs. That raises a question that is both intellectual and practical: what should we do with all these videos?

This week, Google researcher Enric Corona and his colleagues offered an answer: control them with the VLOGGER tool. VLOGGER can generate high-resolution videos of people talking based on a single photo. More importantly, VLOGGER animates the video to match a speech sample, which means the technology can drive the video as a controllable likeness of a person, or high-fidelity “avatar.”

All kinds of creations are possible with this tool. At the simplest level, Corona’s team suggests that VLOGGER could have a big impact on help desk avatars, because synthetic talking humans that look more realistic can “develop empathy.” They suggest that the technology could “enable entirely new use cases, including enhanced online communications, education, and personalized virtual assistants.”

VLOGGER could also lead to a new frontier of deepfakes: realistic likenesses that appear to say and do things the real person never said or did. Corona’s team says it will address the societal impact of VLOGGER in supplementary supporting materials. However, that material is not available on the project’s GitHub page. ZDNET contacted Corona for the supporting documentation, but had not received a response at the time of publication.

As explained in the official paper “VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis”, Corona’s team aims to overcome the inaccuracies of state-of-the-art avatars. “Creating realistic videos of humans is still complex and ripe with artifacts,” Corona’s team wrote.

The research team pointed out that existing video avatars often cut off the body and hands, leaving only the face visible. VLOGGER can show the entire torso along with hand movements. Other tools typically offer only basic lip syncing, with limited variation in facial expressions and poses. VLOGGER can produce “high-resolution videos of head and upper body movements” with a wide variety of facial expressions and gestures, and the team describes it as “the first approach to generating speaking and moving humans given audio input.”

As the research team explained, automation and behavioral realism are precisely what they aim for in this research: VLOGGER is a multimodal interface to an embodied conversational agent, with audio and an animated visual representation that features complex facial expressions and an increasing level of body movement, designed to support natural conversations with human users.

Based on a single photo (left), the VLOGGER software uses a process known as “diffusion” to predict which video frames (right) should accompany each moment of an audio file of someone speaking, and then generates those video frames in high-resolution quality.

Google

VLOGGER brings together several of the latest trends in deep learning.

Multimodality combines the many modes that AI tools can absorb and synthesize, including text, audio, images, and video.

Large language models such as OpenAI’s GPT-4 allow you to use natural language as input to perform various types of actions, such as creating paragraphs of text, songs, or images.

Researchers have also recently discovered numerous ways to create realistic-looking images and videos by refining “diffusion.” The term originates from molecular physics and refers to the way particles of matter move from being highly concentrated in a particular area to becoming more spread out as temperature increases. By analogy, bits of digital information can be thought of as “diffuse” to the extent that digital noise makes them incoherent.

In AI diffusion, noise is introduced into an image and a neural network is trained to reconstruct the original, learning in the process the rules by which the image was constructed. Diffusion is at the heart of the impressive image generation processes in Stability AI’s Stable Diffusion and OpenAI’s DALL-E. This is also how OpenAI creates smooth videos with Sora.
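To make the noising idea concrete, here is a minimal sketch of the forward half of diffusion on a toy one-row “image.” It assumes the widely used DDPM-style formulation with a linear noise schedule; the article does not say which formulation VLOGGER uses, and the function names, schedule values, and pixel data are all illustrative assumptions.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances, rising linearly (illustrative values)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar[t] = product of (1 - beta) up to and including step t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(pixels, t, alpha_bars, rng):
    """Forward diffusion: blend the clean signal with Gaussian noise at step t.

    Returns the noisy signal and the noise that was added. A denoising
    network would be trained to predict that noise from the noisy signal.
    """
    keep = math.sqrt(alpha_bars[t])
    spread = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [keep * p + spread * n for p, n in zip(pixels, noise)]
    return noisy, noise


def recover_clean(noisy, predicted_noise, t, alpha_bars):
    """Invert add_noise, given a noise prediction (here, a perfect one)."""
    keep = math.sqrt(alpha_bars[t])
    spread = math.sqrt(1.0 - alpha_bars[t])
    return [(x - spread * n) / keep for x, n in zip(noisy, predicted_noise)]


rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
image = [0.0, 0.25, 0.5, 0.75, 1.0]  # a toy one-row "image"
noisy, noise = add_noise(image, 500, alpha_bars, rng)
restored = recover_clean(noisy, noise, 500, alpha_bars)
```

In this toy, the “prediction” handed to `recover_clean` is the true noise, so the original pixels come back exactly. A trained network only approximates that noise, which is why real systems typically denoise over many small steps.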

For VLOGGER, Corona’s team trained a neural network to associate a speaker’s audio with individual frames of that speaker’s video. The team combined a diffusion process that reconstructs video frames from audio with yet another recent innovation, the Transformer.

The Transformer uses attention to predict video frames based on frames that came earlier, in combination with the audio. By predicting movements, the neural network learns how to accurately render hand and body movements and facial expressions frame by frame, in sync with the audio.
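The idea of predicting each frame only from what came before can be sketched with a causal attention mask. The following is a minimal, pure-Python illustration of scaled dot-product attention in which time step i may look only at steps 0 through i. The two-number “features” per frame are hypothetical, and the exact masking scheme VLOGGER uses is not described in the article, so treat this as an assumption-laden sketch of the general technique.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def causal_attention(queries, keys, values):
    """Scaled dot-product attention where step i sees only steps 0..i."""
    scale = math.sqrt(len(keys[0]))
    outputs, weights_per_step = [], []
    for i, query in enumerate(queries):
        # The mask: later time steps are simply left out of the softmax.
        visible = range(i + 1)
        weights = softmax([dot(query, keys[j]) / scale for j in visible])
        mixed = [
            sum(w * values[j][d] for w, j in zip(weights, visible))
            for d in range(len(values[0]))
        ]
        outputs.append(mixed)
        weights_per_step.append(weights)
    return outputs, weights_per_step


# Hypothetical per-frame features: [audio energy, previous pose value].
frames = [[0.1, 0.0], [0.9, 0.2], [0.4, 0.7], [0.8, 0.5]]
outputs, weights = causal_attention(frames, frames, frames)
```

Each row of `weights` has one entry per visible step, so the first frame attends only to itself and the last frame attends to all four.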

In the final step, the predictions from that first neural network drive a second neural network, which also uses diffusion, to generate the high-resolution frames of the video. This second step is also a high-water mark in terms of data.

To create the high-resolution images, Corona’s team compiled MENTOR, a dataset featuring 800,000 “identities” of videos of people talking. MENTOR consists of 2,200 hours of video, which the team says makes it “the largest dataset ever used in terms of identity and length,” and, the team claims, 10 times larger than previous comparable datasets.

The authors found that the process could be enhanced with a subsequent step called “fine-tuning.” After VLOGGER has been “pre-trained” on MENTOR, feeding it a full-length video of a person allows it to capture more realistically the idiosyncrasies of that person’s head movements, such as blinking. According to the team, fine-tuning the diffusion model with more data, on a monocular video of the subject, lets VLOGGER capture the person’s identity better, for example when the reference image shows the eyes closed. The team calls this process “personalization.”

VLOGGER’s neural net is a combination of two different neural nets. The first uses “masked attention” through transformers to predict what poses should occur within a frame of the video based on the sound from the recorded audio signal of the speaker. The second neural network uses diffusion to generate a coherent sequence of video frames using body movement and facial cues from the first neural network.

Google

The larger point of this approach, which couples the predictions of one neural network with high-resolution image generation, and what makes VLOGGER so provocative, is that the program does not simply generate a video the way Sora does. VLOGGER links that video to actions and expressions that can be controlled. Its lifelike videos can be manipulated as they unfold, like a puppet.
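The two-stage structure, and why it makes the output controllable, can be shown with a toy data-flow sketch. Everything here is a hypothetical stand-in: the class names, the control fields, the loudness-to-mouth rule, and the file name are assumptions made for illustration, and neither function reflects Google’s actual networks. The point is only that an explicit intermediate representation sits between audio and pixels, and that it can be edited before rendering.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MotionControls:
    """Hypothetical per-frame controls produced by the first stage."""
    mouth_open: float
    head_tilt: float


@dataclass
class Frame:
    """Hypothetical stand-in for one rendered video frame."""
    identity: str
    controls: MotionControls


def predict_motion(audio: List[float]) -> List[MotionControls]:
    """Stage 1 placeholder: one set of controls per audio sample.

    A real system would use a trained network; this toy rule just maps
    loudness to mouth opening so the data flow is visible.
    """
    return [MotionControls(mouth_open=min(1.0, abs(a)), head_tilt=0.0) for a in audio]


def render_frames(reference_photo: str, motion: List[MotionControls]) -> List[Frame]:
    """Stage 2 placeholder: pair the photo's identity with each control set."""
    return [Frame(identity=reference_photo, controls=m) for m in motion]


def edit_motion(motion: List[MotionControls], mouth_scale: float) -> List[MotionControls]:
    """Adjust the controls before rendering, e.g. to soften lip movement."""
    return [MotionControls(m.mouth_open * mouth_scale, m.head_tilt) for m in motion]


audio = [0.0, 0.5, -1.2, 0.3]  # toy audio samples
motion = predict_motion(audio)
video = render_frames("person.jpg", edit_motion(motion, 0.5))
```

Because the edit happens on the controls rather than on pixels, the same reference photo can be re-rendered with different mouth or head behavior without touching the audio.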

“Our aim is to bridge the gap between recent video synthesis efforts that can generate dynamic videos without controlling identity or pose, and controllable image generation methods,” Corona’s team writes.

VLOGGER can serve not only as a voice-driven avatar but could also lead to editing features, such as changing the speaking subject’s mouth or eyes. For example, a virtual person in a video who blinks frequently can be changed to blink only a little or not at all. A wide-mouthed manner of speaking can likewise be narrowed to subtler lip movements.

By enabling a way to control high-definition video via audio cues, VLOGGER paves the way for manipulations such as changing the speaker’s lip movements in each stretch of the video to differ from the original source video.

Having achieved a new state of the art in simulating humans, Corona’s team leaves one question unaddressed: what should the world expect from misuse of the technology? It’s easy to imagine a likeness of a politician saying something absolutely catastrophic about, say, impending nuclear war.

Perhaps the next step in this avatar game will be neural networks that, like the Voight-Kampff test in Blade Runner, help society tell which speakers are real and which are just shockingly lifelike deepfakes.




