If this describes you, read on, because AI image generation has quietly advanced in leaps and bounds in 2025. Agency-grade images can now be created natively in the big models like ChatGPT and Gemini and it’ll be a matter of months before video follows suit.
But you need the right prompts to unlock all this goodness. In this column, we look at the best tools, tips and hacks to generate jaw-dropping images with AI.
First, tools. The big news for image generation this year was OpenAI bundling the best-in-class DALL·E 3 image generator into ChatGPT itself. You can now generate some of the very best images natively without having to mess around with bespoke AIs like Midjourney, Firefly or Canva.
This was an instant hit for OpenAI, as users flocked to, for some reason, make Studio Ghibli-style images of their favourite memes. “Our GPUs are melting,” said CEO Sam Altman on X.com.
To see how good image generation has got, see the above picture. The image on the left was created with the prompt, “a photorealistic image of an astronaut riding a horse” in DALL·E 2 in 2023. The image on the right was created with the same prompt in ChatGPT o4 today.
Google soon followed suit, incorporating its next-generation Imagen 4 image generation engine into Gemini. It also recently added its awesome Veo 3 video generation AI for Pro subscribers in the US. Sadly, it’s only available for the US$250-per-month (B8,000) Ultra tier in Thailand at the moment, though this will surely change.
What about prompts? The advice here is that your prompt should define the style - photorealistic, steampunk, impressionist - and the mood, such as whimsical or hard-hitting. You can specify a colour palette and lighting. If you are creating a complex image, try splitting the prompt into foreground, midground and background, then describe how the objects relate to one another.
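For readers who script their image requests, that checklist can be turned into a small prompt builder. What follows is a minimal sketch in Python, not any vendor's API: the `ImagePrompt` class and its field names are made up for illustration, and the result is simply a text string to paste into ChatGPT or Gemini.

```python
from dataclasses import dataclass


@dataclass
class ImagePrompt:
    """Hypothetical helper holding the prompt ingredients listed above."""

    subject: str
    style: str = ""
    mood: str = ""
    palette: str = ""
    lighting: str = ""
    foreground: str = ""
    midground: str = ""
    background: str = ""
    relationships: str = ""

    def render(self) -> str:
        # Pair each optional ingredient with a label; blanks are skipped below.
        labelled = [
            ("Style", self.style),
            ("Mood", self.mood),
            ("Colour palette", self.palette),
            ("Lighting", self.lighting),
            ("Foreground", self.foreground),
            ("Midground", self.midground),
            ("Background", self.background),
            ("Relationships", self.relationships),
        ]
        parts = [self.subject.strip()]
        parts += [
            f"{label}: {value.strip()}"
            for label, value in labelled
            if value.strip()
        ]
        # Join the pieces into one sentence-per-ingredient prompt.
        return ". ".join(parts) + "."


prompt = ImagePrompt(
    subject="An astronaut riding a horse",
    style="photorealistic",
    mood="whimsical",
    foreground="the horse mid-gallop",
)
print(prompt.render())
```

Keeping each ingredient in its own field makes it easy to change one thing at a time, say the mood or the lighting, and compare the results.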
Most usefully - as this column always bangs on about - your results will improve radically if you make an initial stab at your image, then iterate, refine and repeat.
You can sharpen your prompt by adding missing details - or by simply scolding the AI into doing better. Wharton’s Ethan Mollick proved the point in a hilarious X.com thread where he asked ChatGPT to create “the perfect butternut squash.” Unsatisfied with the first draft, he prompted five more times: “Make the squash more perfect… perfect the squash further. The squash must be perfect… more perfect… yes, now we are getting there. Perfect it further… once more, my friend.” And, incredibly, the final image was indeed a perfect squash.
What about video generation? Expect huge changes by the end of the year. We already mentioned Gemini’s Veo 3. OpenAI’s Sora is in private beta and will soon be incorporated into ChatGPT. Other standout tools include Pictory for turning long-form text into social media-friendly clips, Synthesia for avatar-based presentation videos and Runway’s Gen-2, which excels at scene transitions and user-friendly editing. Meta’s CM3-Video and Adobe’s Firefly Video (beta) also deliver professional-grade outputs.
As with images, prompts matter: specify scene length, camera angles, lighting, and mood (“cinematic dusk,” “handheld documentary style”). Start small - 2-3 seconds of motion - then expand through iteration. Use storyboards or reference frames to guide the progression. With video, you still need to be patient: it remains resource-intensive, so plan for a few passes to achieve polished, jaw-dropping results.
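The "start small, then expand" routine can also be written down as a plan of prompts before you spend any rendering time. This is a rough sketch under stated assumptions: `video_prompt` and `iteration_plan` are hypothetical helpers, the 3, 6 and 12 second steps are arbitrary, and nothing here calls a real video tool; it only produces the text you would submit on each pass.

```python
def video_prompt(scene: str, seconds: int, camera: str, lighting: str, mood: str) -> str:
    """Hypothetical helper: assemble one text prompt for a video generator."""
    return (
        f"{scene}. Length: {seconds} seconds. Camera: {camera}. "
        f"Lighting: {lighting}. Mood: {mood}."
    )


def iteration_plan(scene, camera, lighting, mood, lengths=(3, 6, 12)):
    """Return one prompt per pass, starting short and lengthening each time.

    The default lengths are an assumption for illustration only.
    """
    return [video_prompt(scene, s, camera, lighting, mood) for s in lengths]


plan = iteration_plan(
    scene="A tuk-tuk weaving through evening traffic",
    camera="handheld documentary style",
    lighting="cinematic dusk",
    mood="energetic",
)
for number, text in enumerate(plan, start=1):
    print(f"Pass {number}: {text}")
```

Only the clip length changes between passes here; in practice you would also adjust the camera, lighting or mood wording after reviewing each result.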
Joe Smith is Founder of the AI consultancy 2Sigma Consultants. He studied AI at Imperial College Business School and is researching AI’s effects on cognition at Chulalongkorn University. He is author of The Optimized Marketer, a book on how to use AI to promote your business and yourself. Contact joe@2Sigmaconsultants.com.