• Home
  • AI News
  • AI Startups
  • Deep Learning
  • Interviews
  • Machine-Learning
  • Robotics

Subscribe to Updates

Get the latest creative news from FooBar about art, design and business.

What's Hot

UCSD Researchers Open-Supply Graphologue: A Distinctive AI Approach That Transforms Giant Language Fashions Such As GPT-4 Responses Into Interactive Diagrams In Actual-Time

September 24, 2023

Analysis at Stanford Introduces PointOdyssey: A Massive-Scale Artificial Dataset for Lengthy-Time period Level Monitoring

September 23, 2023

Google DeepMind Introduces a New AI Software that Classifies the Results of 71 Million ‘Missense’ Mutations 

September 23, 2023
Facebook Twitter Instagram
The AI Today
Facebook Twitter Instagram Pinterest YouTube LinkedIn TikTok
SUBSCRIBE
  • Home
  • AI News
  • AI Startups
  • Deep Learning
  • Interviews
  • Machine-Learning
  • Robotics
The AI Today
Home»Machine-Learning»Baidu AI Researchers Introduce VideoGen: A New Textual content-to-Video Era Strategy That Can Generate Excessive-Definition Video With Excessive Body Constancy
Machine-Learning

Baidu AI Researchers Introduce VideoGen: A New Textual content-to-Video Era Strategy That Can Generate Excessive-Definition Video With Excessive Body Constancy

By September 12, 2023Updated:September 12, 2023No Comments3 Mins Read
Facebook Twitter Pinterest LinkedIn Tumblr Reddit WhatsApp Email
Share
Facebook Twitter LinkedIn Pinterest WhatsApp Email


Textual content-to-image (T2I) technology programs like DALL-E2, Imagen, Cogview, Latent Diffusion, and others have come a great distance lately. However, text-to-video (T2V) technology stays a tough challenge as a result of want for high-quality visible content material and temporally easy, life like movement equivalent to the textual content. As well as, large-scale databases of text-video combos are very exhausting to return throughout. 

A current analysis by Baidu Inc. introduces VideoGen, a way for making a high-quality, seamless film from textual descriptions. To assist direct the creation of T2V, the researchers first constructed a high-quality picture utilizing a T2I mannequin. Then, they use a cascaded latent video diffusion module that generates a collection of high-resolution easy latent representations primarily based on the reference picture and the textual content description. When needed, additionally they make use of a flow-based method to upsample the latent illustration sequence in time. In the long run, the crew skilled a video decoder to transform the sequence of latent representations into an precise video.

Making a reference picture with the assistance of a T2I mannequin has two distinct benefits. 

  1. The ensuing video’s visible high quality has improved. The proposed technique takes benefit of the T2I mannequin to attract from the a lot bigger dataset of image-text pairs, which is extra various and information-rich than the dataset of video-text pairs. In comparison with Imagen Video, which makes use of image-text pairings for joint coaching, this technique is extra environment friendly throughout the coaching part. 
  2. A cascaded latent video diffusion mannequin might be guided by a reference picture, permitting it to be taught video dynamics fairly than visible content material. The crew believes that is an additional advantage above strategies that solely use the T2I mannequin parameters.

The crew additionally mentions that textual description is pointless for his or her video decoder to supply a film from the latent illustration sequence. By doing so, they practice the video decoder on a much bigger information pool, together with video-text pairs and unlabeled (unpaired) movies. In consequence, this technique improves the smoothness and realism of the created video’s movement because of the high-quality video information we use.

As findings counsel, VideoGen represents a major enchancment over earlier strategies of text-to-video technology when it comes to each qualitative and quantitative analysis.


Take a look at the Paper and Venture. All Credit score For This Analysis Goes To the Researchers on This Venture. Additionally, don’t neglect to hitch our 30k+ ML SubReddit, 40k+ Fb Neighborhood, Discord Channel, and E-mail Publication, the place we share the most recent AI analysis information, cool AI tasks, and extra.

In case you like our work, you’ll love our publication..



Dhanshree Shenwai is a Laptop Science Engineer and has an excellent expertise in FinTech corporations masking Monetary, Playing cards & Funds and Banking area with eager curiosity in purposes of AI. She is smitten by exploring new applied sciences and developments in at the moment’s evolving world making everybody’s life straightforward.


🚀 The tip of venture administration by people (Sponsored)

Related Posts

UCSD Researchers Open-Supply Graphologue: A Distinctive AI Approach That Transforms Giant Language Fashions Such As GPT-4 Responses Into Interactive Diagrams In Actual-Time

September 24, 2023

Researchers from Seoul Nationwide College Introduces Locomotion-Motion-Manipulation (LAMA): A Breakthrough AI Methodology for Environment friendly and Adaptable Robotic Management

September 23, 2023

Unlocking Battery Optimization: How Machine Studying and Nanoscale X-Ray Microscopy May Revolutionize Lithium Batteries

September 23, 2023

Leave A Reply Cancel Reply

Misa
Trending
Machine-Learning

UCSD Researchers Open-Supply Graphologue: A Distinctive AI Approach That Transforms Giant Language Fashions Such As GPT-4 Responses Into Interactive Diagrams In Actual-Time

By September 24, 20230

Giant Language Fashions (LLMs) have not too long ago gained immense recognition as a consequence…

Analysis at Stanford Introduces PointOdyssey: A Massive-Scale Artificial Dataset for Lengthy-Time period Level Monitoring

September 23, 2023

Google DeepMind Introduces a New AI Software that Classifies the Results of 71 Million ‘Missense’ Mutations 

September 23, 2023

Researchers from Seoul Nationwide College Introduces Locomotion-Motion-Manipulation (LAMA): A Breakthrough AI Methodology for Environment friendly and Adaptable Robotic Management

September 23, 2023
Stay In Touch
  • Facebook
  • Twitter
  • Pinterest
  • Instagram
  • YouTube
  • Vimeo
Our Picks

UCSD Researchers Open-Supply Graphologue: A Distinctive AI Approach That Transforms Giant Language Fashions Such As GPT-4 Responses Into Interactive Diagrams In Actual-Time

September 24, 2023

Analysis at Stanford Introduces PointOdyssey: A Massive-Scale Artificial Dataset for Lengthy-Time period Level Monitoring

September 23, 2023

Google DeepMind Introduces a New AI Software that Classifies the Results of 71 Million ‘Missense’ Mutations 

September 23, 2023

Researchers from Seoul Nationwide College Introduces Locomotion-Motion-Manipulation (LAMA): A Breakthrough AI Methodology for Environment friendly and Adaptable Robotic Management

September 23, 2023

Subscribe to Updates

Get the latest creative news from SmartMag about art & design.

The Ai Today™ Magazine is the first in the middle east that gives the latest developments and innovations in the field of AI. We provide in-depth articles and analysis on the latest research and technologies in AI, as well as interviews with experts and thought leaders in the field. In addition, The Ai Today™ Magazine provides a platform for researchers and practitioners to share their work and ideas with a wider audience, help readers stay informed and engaged with the latest developments in the field, and provide valuable insights and perspectives on the future of AI.

Our Picks

UCSD Researchers Open-Supply Graphologue: A Distinctive AI Approach That Transforms Giant Language Fashions Such As GPT-4 Responses Into Interactive Diagrams In Actual-Time

September 24, 2023

Analysis at Stanford Introduces PointOdyssey: A Massive-Scale Artificial Dataset for Lengthy-Time period Level Monitoring

September 23, 2023

Google DeepMind Introduces a New AI Software that Classifies the Results of 71 Million ‘Missense’ Mutations 

September 23, 2023
Trending

Researchers from Seoul Nationwide College Introduces Locomotion-Motion-Manipulation (LAMA): A Breakthrough AI Methodology for Environment friendly and Adaptable Robotic Management

September 23, 2023

Unlocking Battery Optimization: How Machine Studying and Nanoscale X-Ray Microscopy May Revolutionize Lithium Batteries

September 23, 2023

This AI Analysis by Microsoft and Tsinghua College Introduces EvoPrompt: A Novel AI Framework for Automated Discrete Immediate Optimization Connecting LLMs and Evolutionary Algorithms

September 23, 2023
Facebook Twitter Instagram YouTube LinkedIn TikTok
  • About Us
  • Contact Us
  • Privacy Policy
  • Terms
  • Advertise
  • Shop
Copyright © MetaMedia™ Capital Inc, All right reserved

Type above and press Enter to search. Press Esc to cancel.