AI image generation is improving so rapidly that the quality and features on offer can change completely in a matter of weeks or months. DALL-E 3 is a big step forward, but how does it stack up against MidJourney?
What’s Special About DALL-E 3?
We’ve covered the evolution and capabilities of MidJourney in detail before, and so far it has been the go-to image generator for the best artistic output suitable for actual use. However, getting close to what you actually wanted in the generated image in MidJourney can be an extremely hit-and-miss affair. If you want precise control, you’d have to resort to using Stable Diffusion and one of its many mods, such as ControlNet. However, Stable Diffusion is significantly more difficult to use, and both MidJourney and DALL-E 3 are superior in terms of ease of use.
DALL-E 3 promises to stick much more closely to the wording of your prompt. In other words, if you ask for specific character poses, scene details, or arrangements of objects, DALL-E 3 should, in theory, give you what you asked for. We’ll be comparing DALL-E 3 and MidJourney using several prompts, with the same prompt given to each AI generator.
Prompt 1: Artistic Flair
First, I just want to get a general feel for what each generator will do artistically, so we’ll start with a rather generic prompt:
Here’s the MidJourney image I thought was best.
And here’s the DALL-E 3 image I thought was best.
What’s interesting to note here is that ChatGPT (the front end for DALL-E 3 in this case) does not pass my exact prompt on to the image generator. One of the main selling points of DALL-E 3 is that it uses ChatGPT (i.e. GPT-4) to take your idea and do the “prompt engineering” part of the work for you. So it will create much more detailed prompts to try to get better results. Here’s the prompt that ChatGPT created based on my request:
This presents a unique challenge when trying to compare the two image generators, because GPT is increasing the quality of the prompt. So, to make it fair, I fed the GPT-generated prompt into MidJourney and this is the result.
Now we have something much more comparable. However, which one wins? In this case, my opinion is that the DALL-E 3 image is closer to what I asked for, while the MidJourney image has a more distinct style and more artistic flair. MidJourney’s current V5 model excels at overall artistic flair, but of course this is highly subjective.
For the rest of the comparisons, I will only be using the GPT-generated prompts for both image generators to cancel out my skill (or lack thereof) when it comes to crafting prompts. In other words, I’ll ask ChatGPT for the image first, then copy the prompt behind the best image it generates and paste it into MidJourney.
Prompt 2: Text Elements
You may have noticed that MidJourney tends to come up with gobbledygook whenever there’s text in a generated image. That’s because it’s generating shapes that look like letters but aren’t really letters. So T-shirts with text, or store signs, won’t have any sensible text. DALL-E 3 promises to create whatever text you like and place it correctly in the frame, so let’s test that. Here’s the prompt ChatGPT came up with:
Here’s DALL-E 3’s result.
And here’s MidJourney’s result.
While MidJourney’s output is very pleasing to the eye, it’s not at all what we asked for, so DALL-E 3 pips it here. However, there’s still plenty of nonsensical text in the image. In my testing, DALL-E 3 works great when you specify all the text in the image, or when there’s no text other than what you asked for. If the image contains unspecified text, though, it comes out as nonsense, just as with MidJourney.
Prompt 3: Setting a Scene
The last test I want to run is setting a scene, where I specify the position of all the major elements.
And here are all four attempts by MidJourney.
Again, MidJourney excels at artistic flair but completely fails to actually do what I asked in the prompt.
While you can redo the same image in DALL-E 3 in different styles, no amount of cajoling will get MidJourney to consistently reproduce the specific elements and placement you ask for. Here’s the same image, but I asked for a more surreal and dreamlike style from DALL-E 3.
DALL-E 3 Isn’t Perfect
Before you decide to ditch MidJourney for DALL-E 3, there are a few major limitations I ran into when testing DALL-E 3 that you should know about:
My time with the tool was limited, and both DALL-E 3 and MidJourney are constantly getting new tweaks and features, but these were the most apparent limitations that most people might care about.
The Verdict
It’s quite difficult to declare an absolute winner here, but as things stand, MidJourney is the right tool to use if you want expressiveness and artistic flair in what you generate. In contrast, DALL-E 3 is by far the better tool if you want to create consistent artwork to your exact requirements for illustrations or other professional use cases.