How to generate video from a text prompt
Gen-2 from Runway. ModelScope. GitHub Copilot X. Text to video. Image to video.
We have already seen huge progress in image generation. Video generation is a more challenging task: a video is a sequence of frames, each of which needs to be realistic (a challenge in itself), and all the frames have to stay consistent with one another.
I already covered video generation in one of my previous posts. Those methods work well, but they require a reference video to generate a new one. Now it is time for a couple of new models that can generate videos from a text prompt or an image.
ModelScope: text to video
The recently released ModelScope is a new diffusion model for text-to-video generation. It works much like generative AI for images: it turns a text prompt into a video.
Here is an example of how this model works:
Source: https://t.me/monkeyinlaw/1119
The model has a couple of limitations:
It only works at a resolution of 256×256. But that is fine; we are just at the beginning of the video generation boom.
The word “Shutterstock” is visible on almost every video it generates 🤦♂️. I wonder where they got the training data.
Website | Demo | Weights | Google Colab
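If you want to try ModelScope locally rather than through the hosted demo, the weights can be run with the Hugging Face diffusers library. The sketch below is a minimal example under a few assumptions: the checkpoint id damo-vilab/text-to-video-ms-1.7b, the pipeline calls, and the parameter values come from the diffusers documentation, not from this post, so check the linked Colab for the exact setup.

```python
# Minimal sketch: running the ModelScope text-to-video weights via Hugging Face diffusers.
# Assumption: the "damo-vilab/text-to-video-ms-1.7b" checkpoint id and the API below are
# taken from the diffusers docs; output handling may differ between diffusers versions.
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video

pipe = DiffusionPipeline.from_pretrained(
    "damo-vilab/text-to-video-ms-1.7b",  # assumed Hugging Face mirror of the ModelScope weights
    torch_dtype=torch.float16,
    variant="fp16",
)
pipe.enable_model_cpu_offload()  # offload submodules to CPU so it fits on a consumer GPU

prompt = "An astronaut riding a horse on Mars"
video_frames = pipe(prompt, num_inference_steps=25).frames  # sequence of 256x256 frames
video_path = export_to_video(video_frames)  # writes an .mp4 file and returns its path
print(video_path)
```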
Gen-2: text to video, image to video, text+image to video
Gen-1 is a fantastic generative video model made by Runway. It offers Stylization, Storyboard, Mask, Render, and Customization modes, which I described in this post. However, each of these modes requires a reference video to modify. In other words, Gen-1 cannot generate a video without another existing video.
The new Gen-2 model makes this possible. It adds three new modes: text to video, text + image to video, and image to video. Let us look at them.
Text to video
Input text prompt: The late afternoon sun peeking through the window of a New York City loft.
Generated video. Source: https://research.runwayml.com/gen2
Text + image to video
Input text prompt: A low angle shot of a man walking down a street, illuminated by the neon signs of the bars around him.
Input image. Source: https://research.runwayml.com/gen2
Generated video. Source: https://research.runwayml.com/gen2
Image to video
Input image. Source: https://research.runwayml.com/gen2
Generated video. Source: https://research.runwayml.com/gen2
And here is a video made with Gen-2:
“An Uncanny Hall Of Mirrors” A completely synthesized reality made with @runwayml beta of #gen2 … and wow… it kinda rattled me to my core. All you need is an image or a text prompt - still a very early version but another paradigm shift in the world of AI filmmaking. #aiart #ai
— Paul Trillo (@paultrillo)
11:25 PM • Mar 22, 2023
AI News of the week: GitHub Copilot X
GitHub has announced GitHub Copilot X, an AI-powered development tool that represents a significant milestone in the future of software development. Copilot X expands on the capabilities of the original GitHub Copilot, a great auto-completion plugin that I use every day.
GitHub Copilot X introduces chat and voice functionality, as well as integration with pull requests, the command line, and documentation.
GitHub Copilot Chat is integrated into the IDE and helps you understand code, analyze errors, and get suggested fixes.
GitHub Copilot for Pull Requests automatically generates pull request descriptions based on code changes.
GitHub Copilot for Documentation will help you understand code documentation and will be able to answer questions about the docs.
GitHub Copilot for the Command Line Interface (CLI) will help you write commands faster.
The way we write code is changing dramatically. I am not afraid of being replaced by AI. Instead, I am happy I can use such productivity tools to write code much faster.
AI Tweet of the week
This is just the beginning. #Gen2
— Anastasis Germanidis (@agermanidis)
1:22 AM • Mar 21, 2023