Until now, the term “AI art” has meant static images. Not anymore. Meta is showcasing Make-A-Video, in which the company combines artificial intelligence and interpolation to create short, looping, GIF-like videos.
Make-A-Video.studio is not yet available to the general public. Instead, it is being shown as a demonstration of what Meta itself can do with the technology. And yes, while it’s technically video, in the sense that there are more than a few frames of AI art strung together, it’s still probably closer to a traditional GIF than anything else.
Regardless, Make-A-Video does three things, judging by the demo on Meta’s site. First, the technology can take two related images (whether they show a water droplet in flight or a horse at full gallop) and create the frames in between. More impressively, Make-A-Video seems to be able to take a single still image and apply motion to it in an intelligent way, for example by taking a still image of a boat and creating a short video of it moving through the waves.
Finally, Make-A-Video can put it all together. From the prompt “Teddy bear draws a portrait,” Meta showed a small GIF of an animated teddy bear drawing itself. This shows not only the ability to create AI art, but also the ability to infer actions from it, as outlined in the company’s research paper.
“The Make-A-Video research builds on recent advances in text-to-image technology designed to enable text-to-video conversion,” Meta explains. “The system uses images with descriptions to learn what the world looks like and how it is often described. It also uses unlabeled videos to see how the world moves. With this data, Make-A-Video allows you to bring your imagination to life, creating whimsical, one-of-a-kind videos using just a few words or lines of text.”
This probably means that Meta is training the algorithm on real video that it has captured. What is not clear is where that video comes from. Facebook’s research paper on the subject does not specify how video might be sourced in the future, and one wonders whether anonymized video taken from Facebook could be used as the basis for future art.
Meta claims to be able to interpolate video from two related images.
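To make the idea of in-between frames concrete, here is a minimal sketch of the simplest possible form of interpolation: a linear cross-fade between two images. This is not Meta’s method. Make-A-Video relies on a trained model, whereas this only blends pixel values. The function name `crossfade` and the representation of a frame as nested lists of grayscale values are assumptions made purely for illustration.

```python
def crossfade(frame_a, frame_b, steps):
    """Return `steps` in-between frames by linearly blending two frames.

    Each frame is a list of rows, and each row is a list of grayscale
    pixel values. Both frames are assumed to have the same dimensions.
    This is a naive pixel blend, not a learned interpolation model.
    """
    frames = []
    for i in range(1, steps + 1):
        # t runs strictly between 0 and 1, so the endpoints are not repeated.
        t = i / (steps + 1)
        frames.append([
            [(1 - t) * a + t * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)
        ])
    return frames


if __name__ == "__main__":
    start = [[0.0, 0.0]]
    end = [[1.0, 2.0]]
    for frame in crossfade(start, end, 3):
        print(frame)
```

A cross-fade like this simply dissolves one image into the other, which is why it produces ghosting rather than motion. Generating a plausible intermediate pose for a galloping horse requires a model that has learned how things move.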
This isn’t entirely new, at least not conceptually. Tools like VQGAN+CLIP Turbo can turn a text prompt into an animated video, but Meta’s work seems to be more sophisticated. Still, it’s hard to say until the model is released to the general public.
Even so, it takes AI art into another dimension: motion. How long will it be before Midjourney and Stable Diffusion do the same on your PC?