June 10, 2022
DALL·E mini, an open source alternative to DALL·E
By Sofía Sánchez González
Surely since DALL·E 2 came out you’ve been wanting to try it out, but since it’s not available to everyone you have been left wanting. In this post we offer you a way to try it: DALL·E mini, an open source alternative to DALL·E.
What are image-generating models?
But first things first. DALL·E, developed by OpenAI, is one of the first creative artificial intelligence models capable of generating images from text. In other words, it combines an understanding of natural language with the generation of realistic images.
But DALL·E’s artistic talents don’t end with a simple snail drawing. If you wanted, it could generate multiple harp-shaped snail illustrations, or something more bizarre than you could even imagine. Just provide a description and it will generate a number of alternatives.
It can generate images of anything you can think of. And when we say anything, we mean absolutely anything! This model can:
- Transform existing images
- Create anthropomorphic animals and objects
- Combine concepts which are seemingly unrelated
- Represent text
Here are some figures on the DALL·E model:
- 12 billion parameters
- 1,280 tokens per sequence (256 for text and 1,024 for images)
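To make the token figures a little more tangible, here is a minimal sketch of how such a sequence could be laid out. The two counts come from the list above; the square grid, the row-major ordering, and the helper names `image_grid_side` and `token_position` are our own assumptions for illustration, not a description of OpenAI's code.

```python
import math

# Token counts quoted in the article.
TEXT_TOKENS = 256
IMAGE_TOKENS = 1024
TOTAL_TOKENS = TEXT_TOKENS + IMAGE_TOKENS


def image_grid_side(image_tokens: int) -> int:
    """Side length of a square grid that holds exactly `image_tokens` cells."""
    side = math.isqrt(image_tokens)
    if side * side != image_tokens:
        raise ValueError("image token count is not a perfect square")
    return side


def token_position(row: int, col: int, side: int,
                   text_tokens: int = TEXT_TOKENS) -> int:
    """Index of the image token at (row, col) in the combined sequence.

    Assumption for illustration: text tokens come first, followed by the
    image tokens in row-major order.
    """
    if not (0 <= row < side and 0 <= col < side):
        raise IndexError("grid position out of range")
    return text_tokens + row * side + col


side = image_grid_side(IMAGE_TOKENS)
print(f"{TOTAL_TOKENS} tokens in total, image grid of {side} x {side}")
```

Under these assumptions, the first image token sits right after the last text token, and the bottom-right cell of the grid is the final token of the sequence.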
In short, image-generating models of this kind pair two components: a large language model that interprets the text and a generative model, typically a diffusion model, that synthesizes the image.
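For readers curious about what "diffusion" means in practice, the sketch below shows only the noising half of the idea: an image is gradually blended with Gaussian noise, and a real model is then trained to reverse that process step by step. Everything here is a toy under our own assumptions: the linear schedule, its start and end values, and the function names are illustrative choices and are not taken from DALL·E or any other specific model.

```python
import math
import random


def linear_beta_schedule(steps, beta_start=1e-4, beta_end=0.02):
    """Noise amounts per step, rising linearly (illustrative values)."""
    if steps < 2:
        raise ValueError("need at least two steps")
    delta = (beta_end - beta_start) / (steps - 1)
    return [beta_start + i * delta for i in range(steps)]


def cumulative_alphas(betas):
    """Fraction of the original signal that survives after each step."""
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(pixels, alpha_bar, rng):
    """Blend pixel values with Gaussian noise for a given signal fraction."""
    signal = math.sqrt(alpha_bar)
    noise = math.sqrt(1.0 - alpha_bar)
    return [signal * p + noise * rng.gauss(0.0, 1.0) for p in pixels]


rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
image = [0.5, -0.5, 0.25, -0.25]              # a tiny stand-in "image"
slightly_noisy = add_noise(image, alpha_bars[10], rng)
almost_pure_noise = add_noise(image, alpha_bars[-1], rng)
```

Early in the schedule the image is barely altered, while by the last step almost none of the original signal is left. Generating an image amounts to running this process in reverse, starting from pure noise and using the text to guide each denoising step.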
What’s new with DALL·E 2?
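Well, from the user's point of view, not much. Under the hood, the original DALL·E generated image tokens with an autoregressive transformer, while DALL·E 2 relies on diffusion models; beyond that change, the approach has mostly been scaled up with larger datasets and more parameters to increase its capacity.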
The scaling-up resembles the step from GPT-2 to GPT-3: structurally, GPT-3 is essentially the same model as GPT-2, but it is much larger and was trained on far more text. DALL·E 2 works on a 3.5 billion parameter model while using another 1.5 billion parameter model to improve the resolution of the generated images.
Impressive numbers, but the problem is that the model is only available to very few. There’s even a waiting list to try it out. That is why we offer you an alternative.
DALL·E mini, an open source alternative to DALL·E
This image generator is available on Hugging Face as an open source alternative to DALL·E. As the name suggests, it is a ‘mini’ version of DALL·E, yet it still produces some incredible results.
This is what its creator, Boris Dayma, explains to us on the project blog:
The model is trained by looking at millions of images from the internet with their associated captions. Over time, it learns how to draw an image from a text prompt.
Some of the concepts are learned from memory as it may have seen similar images. However, it can also learn how to create unique images that don’t exist such as “the Eiffel tower is landing on the moon” by combining multiple concepts together.
Fortunately, the goal of Hugging Face is to democratize artificial intelligence. So all Internet users can try it!
Coming soon, DALL·E Mega
And we have good news: the creator of DALL·E mini is training DALL·E Mega, an even more powerful image generator. Even the training can be viewed for free via this link! Anyone can follow the learning curve and see how the parameters change.
Dayma promises that it will have higher quality. It will also be available on the Hugging Face platform and will have a demo like the one for DALL·E mini.
Imagen from Google
Another image-generating model is Google's Imagen. According to various market analyses, it is the generator that offers the best results. The downside is that it is not available to the public and cannot be tested at the moment. For now, we have to settle for a report describing its virtues. On a technical level, it relies on a huge language model to understand the text (a frozen T5-XXL text encoder, according to that report).
About Narrativa
Narrativa is an internationally recognized content services company that uses its proprietary artificial intelligence and machine learning platforms to build and deploy digital content solutions for enterprises. Its technology suite consists of data extraction, data analysis, natural language processing (NLP) and natural language generation (NLG) tools that work together seamlessly to power a lineup of smart content creation, automated business intelligence reporting and process optimization products for a variety of industries.
Contact us to learn more about our solutions!