DeepFloyd IF: generate images with text
New model by Stability AI + DeepFloyd lab
If you've ever used generative AI models for images, you know they are bad at rendering text on objects in an image. If you want to generate a restaurant with a sign, good luck. Models like Stable Diffusion, Midjourney, and DALL·E 2 tend to garble the text you ask for. The new model called DeepFloyd IF, from Stability AI and the DeepFloyd research lab, can finally generate images with text. Correct text. Let’s dive into it.
Let us imagine that I decided to open a restaurant called Syntha AI. What should it look like? I asked two models: Midjourney (left) and DeepFloyd IF (right).
While Midjourney generated a really cool image, the text on the sign is not what I expected. DeepFloyd IF did the job really well.
DeepFloyd IF examples
Here are some examples of generated images.
How DeepFloyd IF works
IF consists of three diffusion stages that are applied sequentially to the input text:
Text → 64x64 image
Text + 64x64 image → 256x256 image
Text + 256x256 image → 1024x1024 image
Basically, the model is very similar to Google’s Imagen, which is … not available even as a demo or API.
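The three-step cascade above can be sketched with the Hugging Face diffusers library. This is a minimal sketch under stated assumptions, not the authors' reference code: it assumes `torch`, `diffusers`, and `accelerate` are installed, that you have accepted the model license on the Hugging Face Hub, and that a GPU with enough memory is available. It also follows the diffusers documentation in using the Stable Diffusion x4 upscaler for the last step. The function name `generate` and the `seed` parameter are my own.

```python
# Model IDs as published on the Hugging Face Hub (assumption: access has been granted).
STAGE_MODEL_IDS = {
    1: "DeepFloyd/IF-I-XL-v1.0",                    # text -> 64x64
    2: "DeepFloyd/IF-II-L-v1.0",                    # text + 64x64 -> 256x256
    3: "stabilityai/stable-diffusion-x4-upscaler",  # text + 256x256 -> 1024x1024
}


def generate(prompt: str, seed: int = 0):
    """Run the three stages in sequence and return a list of PIL images."""
    # Heavy imports stay local so this file can be imported without a GPU stack.
    import torch
    from diffusers import DiffusionPipeline

    dtype = torch.float16
    stage_1 = DiffusionPipeline.from_pretrained(
        STAGE_MODEL_IDS[1], variant="fp16", torch_dtype=dtype
    )
    # Stage 2 reuses the text embeddings from stage 1, so its text encoder is skipped.
    stage_2 = DiffusionPipeline.from_pretrained(
        STAGE_MODEL_IDS[2], text_encoder=None, variant="fp16", torch_dtype=dtype
    )
    stage_3 = DiffusionPipeline.from_pretrained(STAGE_MODEL_IDS[3], torch_dtype=dtype)
    for stage in (stage_1, stage_2, stage_3):
        stage.enable_model_cpu_offload()  # moves each sub-model to the GPU only when needed

    generator = torch.manual_seed(seed)
    prompt_embeds, negative_embeds = stage_1.encode_prompt(prompt)

    # Text -> 64x64 (kept as a tensor so it can be fed to the next stage)
    image = stage_1(
        prompt_embeds=prompt_embeds,
        negative_prompt_embeds=negative_embeds,
        generator=generator,
        output_type="pt",
    ).images
    # Text + 64x64 -> 256x256
    image = stage_2(
        image=image,
        prompt_embeds=prompt_embeds,
        negative_prompt_embeds=negative_embeds,
        generator=generator,
        output_type="pt",
    ).images
    # Text + 256x256 -> 1024x1024
    return stage_3(
        prompt=prompt, image=image, generator=generator, noise_level=100
    ).images
```

With the models downloaded, something like `generate('a restaurant with a sign that says "Syntha AI"')[0].save("restaurant.png")` should write the final image to disk.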
What DeepFloyd IF can do
Besides text-to-image generation, IF is capable of the following tasks.
Image-to-image translation with DeepFloyd IF
This is helpful when you want to change the style of your image. It works with any image, not just generated ones. For example, you can ask the model to restyle an image as an oil painting, origami, or an abstract drawing.
Image super-resolution with DeepFloyd IF
If you want to increase the size of your image, you may consider using super-resolution mode. This is possible thanks to the second and third parts of the neural network.
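To make the idea concrete, here is a small illustration of which steps of the cascade would run for a given input size. It is only a planning sketch built from the resolutions listed earlier, it does not call the model, and the names `Stage`, `CASCADE`, and `upscaling_stages` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Stage:
    name: str
    in_size: Optional[int]  # side length of the input image; None for the text-only stage
    out_size: int           # side length of the output image


# Resolutions taken from the stage list in the article; the names are illustrative.
CASCADE = (
    Stage("stage 1", None, 64),
    Stage("stage 2", 64, 256),
    Stage("stage 3", 256, 1024),
)


def upscaling_stages(input_size: int) -> List[Stage]:
    """Return the stages needed to bring a square image up to 1024x1024."""
    for index, stage in enumerate(CASCADE):
        if stage.in_size == input_size:
            return list(CASCADE[index:])
    raise ValueError(
        f"resize the image to 64 or 256 pixels per side first, got {input_size}"
    )


for size in (64, 256):
    plan = upscaling_stages(size)
    steps = " -> ".join(f"{s.out_size}x{s.out_size}" for s in plan)
    print(f"{size}x{size} input: {steps}")
```

An existing 256x256 image would only need the last step, while a 64x64 image would pass through both upscaling steps.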
Image inpainting with DeepFloyd IF
Inpainting is when you have a missing part in the image and want to draw something there. For example, you may want to remove people from your landscape photo and inpaint the background.
How to use DeepFloyd IF
While DeepFloyd IF is currently available for research purposes only, the authors promise to release the model for commercial use later. So it is a good moment to learn how to use it. Here are some use cases:
Create a web service that can generate logos with AI. Now it is possible to add a company name to a generated logo.
Create a service that automatically generates videos based on song lyrics. Here is an example:
Create an API or web service for image inpainting and super-resolution. This functionality can also be available as a plugin for existing tools like Canva or Photoshop.
Well, the text is not always 100% correct.
Thank you for reading,