Over the weekend, Elon Musk’s AI venture xAI unveiled Grok Imagine, a new generative AI tool for creating images and videos. Grok Imagine is currently available to paid xAI subscribers in the Grok iOS and Android apps.
Musk has been advocating for the initiative on X by sharing images and videos from Grok users, including some mildly NSFW content tagged as “Spicy” by the Grok application.
AI-generated video represents an exhilarating — and rather daunting — new frontier for the AI sector. Supporters claim this technology provides artists with a fresh medium for expression while potentially lowering the costs associated with animation and filmmaking. Detractors, however, argue that AI video brings significant risks related to sexual deepfakes and misinformation.
Setting that discussion aside for now, I was curious about how Grok Imagine stacks up against xAI’s main competitors. As I have noted before, Google’s Veo 3 AI video model currently dominates the market with its impressively lifelike video capabilities. Then there’s Sora, developed by ChatGPT’s creator, OpenAI. Additionally, the well-known AI image generator Midjourney has recently launched its own generative AI video feature.
So, how does Grok Imagine measure up against its rivals? To put it bluntly, I am not impressed.
Granted, Grok Imagine is a new release, and Musk recently stated on X that it “should improve every day.” For now, however, it appears to fall significantly short of its competitors.
Let me illustrate my findings.
Evaluating Grok Imagine AI video versus competitors
Mashable recently covered a viral AI video trend: security camera footage of animals bouncing on trampolines and engaging in similar activities. So I used a straightforward prompt to test Grok Imagine, Veo 3, Sora, and Midjourney: “Security camera footage of rabbits jumping on a trampoline at night.” Seems simple enough, right?
First off, it’s essential to point out a significant difference between Veo 3 and Grok Imagine. Google’s Veo 3 model can create videos directly from a text prompt. Simply describe the video you want, and Veo 3 takes care of the rest. By contrast, tools like Midjourney and Grok Imagine offer only text-to-image generation. Once an image is generated or uploaded, users can animate it, turning it into a short video clip. In this regard, Grok Imagine is already at a disadvantage compared to OpenAI and Google.
With those considerations in mind, let’s examine the results, which I’ve also shared on X.
I entered my test prompt into Grok, which produced these unsatisfactory images.
I chose the least bad image and turned it into this brief video:
It’s… acceptable? Somewhat mediocre, or “meh,” as the kids say.
It also falls short when compared to other AI video tools.
As evident in the video, Google Veo 3 and Sora performed significantly better with the identical prompt:
Lastly, Midjourney, which animates images in a manner similar to xAI, succeeded in generating better images and videos, albeit after two attempts. The resulting image and video exhibit the grainy appearance typical of surveillance footage.
Audio is another major weakness for Grok Imagine. While Veo 3 can generate sound effects and coherent dialogue synchronized with the video, the audio in Grok Imagine videos is limited to rough sound effects and nonsensical chatter.
Musk likened Grok Imagine to a contemporary Vine app, declaring on X, “Grok Imagine is optimized for the most entertaining and shareable content.”
From my initial evaluations, Grok Imagine appears optimized for producing two types of images and videos: memes and anime. If your goal is to animate memes — or create sexually suggestive videos of anime characters — then Grok Imagine should suffice, I suppose. Beyond that, I can’t say I’m particularly impressed.
There is, however, one area where Grok Imagine excels: speed. So far, it generates both images and videos considerably faster than its competitors.
Mashable reached out to xAI, and this story will be updated if we receive a reply.
Disclosure: Ziff Davis, the parent company of Mashable, filed a lawsuit against OpenAI in April, alleging copyright infringement related to the training and operation of its AI systems.