Meta announces Make-A-Video, which generates video from text


Still image from an AI-generated video of a teddy bear painting a portrait.

Today, Meta announced Make-A-Video, an AI-powered video generator that can create novel video content from text prompts or images, similar to existing image synthesis tools like DALL-E and Stable Diffusion. It can also make variations of existing videos, though the tool is not yet available for public use.

On the Make-A-Video announcement page, Meta shows example videos generated from text, including “a young couple walking in heavy rain” and “a teddy bear painting a portrait.” It also shows Make-A-Video’s ability to take a static source image and animate it. For example, a still photo of a sea turtle, once processed through the AI model, can appear to be swimming.

The key technology behind Make-A-Video, and the reason it has arrived sooner than some experts anticipated, is that it builds on existing work in text-to-image synthesis used by image generators such as OpenAI’s DALL-E. In July, Meta announced its own text-to-image AI model called Make-A-Scene.

Instead of training the Make-A-Video model on labeled video data (for example, captioned descriptions of the actions depicted), Meta took image synthesis data (still images paired with captions) and applied unlabeled video training data, so the model learns a sense of where a text or image prompt might exist in time and space. The model can then predict what comes after the image and show the scene in motion for a short period.

“Using function-preserving transformations, we extend the spatial layers at the model initialization stage to include temporal information,” Meta wrote in a white paper. “The extended spatial-temporal network includes new attention modules that learn temporal world dynamics from a collection of videos.”
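To make the idea of extending spatial layers with temporal attention more concrete, here is a minimal NumPy sketch of the general factorized spatial-temporal attention pattern: attend over pixels within each frame, then attend over frames at each pixel location. This is an illustration of the concept only, not Meta’s actual architecture; all function names and tensor shapes here are invented for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product self-attention over the second-to-last axis.
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def spatial_then_temporal(x):
    """x: (frames, pixels, channels) video features.

    Spatial attention mixes pixels within each frame; the temporal
    module then mixes the same pixel position across frames, which is
    where motion dynamics would be learned.
    """
    spatial = attention(x, x, x)       # per-frame, over the pixel axis
    t = spatial.swapaxes(0, 1)         # -> (pixels, frames, channels)
    temporal = attention(t, t, t)      # per-pixel, over the frame axis
    return temporal.swapaxes(0, 1)     # back to (frames, pixels, channels)

x = np.random.randn(4, 16, 8)          # 4 frames, 16 "pixels", 8 channels
y = spatial_then_temporal(x)
print(y.shape)                         # -> (4, 16, 8)
```

The appeal of this factorization, as the quote suggests, is that the spatial part can be initialized from a pretrained text-to-image model, while only the added temporal modules need to learn from video.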

Meta has made no announcement about how or when Make-A-Video might become available to the public, or who would have access to it. Meta offers a sign-up form that people can fill out if they are interested in trying it in the future.

Meta acknowledges that the ability to create photorealistic videos on demand presents certain social risks. At the bottom of the announcement page, Meta says that all of Make-A-Video’s AI-generated video content contains a watermark to “help ensure viewers know the video was generated with AI and is not a captured video.”

If history is any guide, competitive open source text-to-video models may follow (some, like CogVideo, already exist), which could make Meta’s watermark safeguard irrelevant.
