The AI Today

Meet MetaPortrait: An Identity-Preserving Talking Head Generation Framework

January 12, 2023 | Updated: January 12, 2023 | 5 Mins Read


Computer science has recently entered a new era in which Artificial Intelligence (AI) technology can be used to create detailed and lifelike images. Multimedia generation (for instance, text-to-text, text-to-image, image-to-image, and image-to-text generation) has improved considerably. Thanks to the successful release of recent generative models such as Stable Diffusion (text-to-image) and OpenAI's DALL-E (text-to-image) and ChatGPT (text-to-text), these technologies are rapidly improving and capturing people's interest. Besides the generation tasks mentioned above, these models have been developed for many other goals. Another important application is the so-called talking head generation.

For those unfamiliar with it, talking head generation is the task of generating a talking face from a set of images of a person.

Virtual reality, face-to-face live chat, and virtual avatars in games and media are just a few places where talking heads have found significant use. Recent advances in neural rendering approaches have surpassed the results achieved with expensive driving sensors and complex 3D human modeling. Despite the increasing realism and higher rendering resolution of these works, identity preservation is still hard to achieve because the human visual system is sensitive to even the slightest change in a person's face shape. The work presented in this article attempts to create a talking face that looks genuine and moves according to the driver's motion using only a single source image (one-shot).

The idea is to develop an ID-preserving talking head generation framework that advances previous methods in two aspects. First, as opposed to interpolating from sparse flow, the authors claim that dense landmarks are crucial to achieving accurate geometry-aware flow fields. Second, inspired by face-swapping methods, they adaptively fuse the source identity during synthesis so that the network better preserves the key characteristics of the image portrait.

The figure below shows the overall framework architecture.

The input to the model is twofold. First, an image of a person is used as the source image. Second, a sequence of driving video frames is required to guide the video generation. The model is asked to generate an output video with the motions derived from the driving video while maintaining the identity of the source image.

The first step is landmark detection. The authors claim that dense landmark prediction is the key to geometry-aware warping field estimation, which is used in later stages to capture and guide the head movement. For this purpose, a prediction model has been trained (on synthetic faces) to ease the landmark acquisition process. A simple approach for processing these landmarks would be to concatenate them channel-wise. However, this operation is computationally demanding, given the many channels involved. Hence, the paper presents a different strategy: the landmark points are connected by lines and differentiated by color.
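The landmark encoding described above can be pictured with a minimal NumPy sketch. It is only an illustration under stated assumptions, not the authors' implementation: the function name `draw_landmark_sketch`, the landmark coordinates, the grouping, and the colors are all hypothetical. The point it shows is that the result stays a three-channel image however many landmarks are drawn, instead of one channel per landmark.

```python
import numpy as np

def draw_landmark_sketch(groups, height, width, samples_per_segment=64):
    """Rasterize groups of 2D landmarks into one RGB image.

    groups: list of (points, color) pairs; points is an (N, 2) sequence of
    (x, y) pixel coordinates and color is an (r, g, b) tuple in 0..255.
    Consecutive points of a group are joined by a straight line.
    """
    canvas = np.zeros((height, width, 3), dtype=np.uint8)
    t = np.linspace(0.0, 1.0, samples_per_segment)[:, None]
    for points, color in groups:
        points = np.asarray(points, dtype=np.float64)
        for start, end in zip(points[:-1], points[1:]):
            # Sample positions along the segment and round to the nearest pixel.
            xy = np.rint(start + t * (end - start)).astype(int)
            xs = np.clip(xy[:, 0], 0, width - 1)
            ys = np.clip(xy[:, 1], 0, height - 1)
            canvas[ys, xs] = color
    return canvas

# Hypothetical landmark groups (coordinates and colors invented for illustration).
groups = [
    ([(10, 20), (30, 20), (50, 25)], (255, 0, 0)),  # e.g. an eyebrow
    ([(20, 50), (32, 55), (44, 50)], (0, 255, 0)),  # e.g. a lip contour
]
sketch = draw_landmark_sketch(groups, height=64, width=64)
print(sketch.shape)  # three channels regardless of the number of landmarks
```

Such a sketch image can then be stacked with the source image along the channel axis at a fixed cost.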

The second step is the warping field generation. For this task, the landmarks of the source and driving images are concatenated with the source image. Furthermore, the warping field prediction is conditioned on a latent vector produced from the concatenated images.
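To make the notion of a warping field concrete, here is a minimal sketch of how a dense flow field can be applied to an image. It assumes a backward-warping convention with bilinear sampling and border clamping; the paper's network that predicts the field is not reproduced, and the function name `warp_with_flow` and the test data are hypothetical.

```python
import numpy as np

def warp_with_flow(image, flow):
    """Backward-warp an (H, W, C) image with a dense (H, W, 2) flow field.

    flow[y, x] = (dx, dy) means output pixel (x, y) is sampled from the
    source at (x + dx, y + dy), using bilinear interpolation with the
    sampling positions clamped to the image border.
    """
    h, w = image.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_x = np.clip(xs + flow[..., 0], 0, w - 1)
    src_y = np.clip(ys + flow[..., 1], 0, h - 1)

    # Integer corners around each sampling position.
    x0 = np.floor(src_x).astype(int)
    y0 = np.floor(src_y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    # Fractional offsets, with a trailing axis to broadcast over channels.
    wx = (src_x - x0)[..., None]
    wy = (src_y - y0)[..., None]

    top = (1 - wx) * image[y0, x0] + wx * image[y0, x1]
    bottom = (1 - wx) * image[y1, x0] + wx * image[y1, x1]
    return (1 - wy) * top + wy * bottom

# Illustrative data: a random image, an identity flow, and a one-pixel shift.
image = np.random.default_rng(0).random((8, 8, 3))
zero_flow = np.zeros((8, 8, 2))
shift_flow = np.zeros((8, 8, 2))
shift_flow[..., 0] = 1.0  # every output pixel samples its right-hand neighbor
shifted = warp_with_flow(image, shift_flow)
```

A zero flow returns the image unchanged, while the constant flow shifts the content one pixel to the left.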

The third step involves identity-preserving refinement. If the source image were warped directly with the predicted flow field, artifacts would inevitably arise, and the identity would likely not be preserved. For this reason, the authors introduce an identity-preserving refinement network that takes the warping field prediction, the source image, and an identity embedding of the image (extracted via a pre-trained face recognition model) to generate the semantically-preserved driven frame.
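The refinement network itself is beyond a short example, but the role of an identity embedding can be illustrated. The sketch below assumes that a face recognition model maps a face image to a fixed-length vector, and compares two such vectors with cosine similarity, a common way to judge whether two faces share an identity. The function name `identity_similarity` and the random stand-in embeddings are hypothetical; the paper's actual losses and fusion mechanism may differ.

```python
import numpy as np

def identity_similarity(embedding_a, embedding_b, eps=1e-12):
    """Cosine similarity between two identity embeddings (1D vectors)."""
    a = np.asarray(embedding_a, dtype=np.float64)
    b = np.asarray(embedding_b, dtype=np.float64)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

# Stand-in embeddings: in practice these would come from a pre-trained
# face recognition model applied to the source and generated frames.
rng = np.random.default_rng(0)
source_id = rng.normal(size=512)
generated_id = source_id + 0.1 * rng.normal(size=512)  # slightly perturbed
other_id = rng.normal(size=512)                        # unrelated vector

print(identity_similarity(source_id, generated_id))
print(identity_similarity(source_id, other_id))
```

A value close to 1 suggests the generated frame kept the source identity, while unrelated embeddings score much lower.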

The last step involves upsampling the frames. Doing this naively, without considering the temporal consistency between frames, would produce artifacts in the output video. Therefore, the proposed solution includes a temporal super-resolution network to account for temporal relationships across adjacent frames. Specifically, it leverages a pretrained StyleGAN model and 3D convolution (in the spatio-temporal domain), implemented in a U-Net module. After super-resolution, the output video has a resolution of 512×512.
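As a rough illustration of why a 3D convolution helps with temporal consistency, the sketch below applies a single spatio-temporal kernel to a small grayscale video in NumPy. It is a toy example under stated assumptions: the function name `conv3d_valid`, the fixed averaging kernel, and the flickering test clip are hypothetical, whereas the network described above uses learned kernels inside a U-Net.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv3d_valid(video, kernel):
    """Apply one 3D kernel to a (T, H, W) video without padding.

    As in deep learning frameworks, this is a cross-correlation: the kernel
    is not flipped. The output has shape (T-kt+1, H-kh+1, W-kw+1).
    """
    windows = sliding_window_view(video, kernel.shape)
    return np.einsum("thwijk,ijk->thw", windows, kernel)

# Illustrative clip: five 4x4 frames whose brightness flickers 0, 1, 0, 1, 0.
flicker = np.array([0.0, 1.0, 0.0, 1.0, 0.0])
video = np.broadcast_to(flicker[:, None, None], (5, 4, 4)).copy()

# A kernel that averages three adjacent frames and leaves space untouched.
temporal_kernel = np.full((3, 1, 1), 1.0 / 3.0)
smoothed = conv3d_valid(video, temporal_kernel)
print(smoothed[:, 0, 0])  # the frame-to-frame jump is smaller than in the input
```

Because each output value mixes information from neighboring frames, frame-to-frame flicker is damped, which is the intuition behind processing frames jointly rather than one at a time.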

The figure below shows a comparison between the proposed architecture and state-of-the-art approaches.

This was the summary of MetaPortrait, a novel framework to address the talking head generation problem. If you are interested, you can find more information in the links below.


Check out the Paper, GitHub, and Project. All credit for this research goes to the researchers on this project. Also, don't forget to join our Reddit page, Discord channel, and email newsletter, where we share the latest AI research news, cool AI projects, and more.



Daniele Lorenzi received his M.Sc. in ICT for Internet and Multimedia Engineering in 2021 from the University of Padua, Italy. He is a Ph.D. candidate at the Institute of Information Technology (ITEC) at the Alpen-Adria-Universität (AAU) Klagenfurt. He is currently working in the Christian Doppler Laboratory ATHENA, and his research interests include adaptive video streaming, immersive media, machine learning, and QoS/QoE evaluation.


Copyright © MetaMedia™ Capital Inc, All rights reserved
