The current wave of generative AI animation often feels like a magic trick that only works once. You type in a prompt, a video appears, and if you don’t like the result (maybe the feet are all wonky, which is a regular issue with AI generations), your only real option is to try a different prompt. This “black box” approach is exactly what Cartwheel, a new 3D animation startup, is trying to dismantle.
Andrew Carr and Jonathan Jarvis, two veterans with roots at OpenAI and Google, respectively, founded the company, which is working to build a future where AI handles the technical drudgery of animation while leaving the creative soul to the artist.
I spoke with Carr and Jarvis about launching their company, defining “style” with AI, and the technical and creative difficulties of animation in 2026.
What sets Cartwheel apart
According to the founders, one of the biggest hurdles in this space is that 3D motion data is remarkably scarce compared with the endless oceans of text and images available online that AI models are trained on.
“If you look at all the big tech companies, they’ve built their models on written language, audio, image, [and] video because there’s just so much of it, so finding those patterns is much easier,” Jarvis said. “We knew it was going to be hard, but it turns out to be harder than we thought by probably a factor of 10 or 100 to get that data.”
While other tech giants focus on generating final pixels, Cartwheel has spent years mapping how humans actually move. Its models are built to understand the nuances of a performance so that a simple 2D video of someone dancing in their backyard can be translated into a precise, realistic 3D skeleton.
This shift from flat images to 3D assets is what gives animators the control they’ve been missing in the AI era.
Preventing AI “sameness”
Cartwheel’s executives said they view AI’s “sameness” as a byproduct of a lack of control. If everyone uses the same generator to produce a video, the results may eventually start to look all too similar.
“The output of our system is designed for people to edit. It’s designed for people to touch and manipulate, and we don’t want someone to type something in and then have it shuffle through to a finished animation. That’s not the point of it. That’s boring, who’s going to watch that?” Carr said.
“The fact that it’s very easy for people to get into it and edit it actually completely removes the sameness problem,” he said. “You put it on different characters, you put it in different environments, you change how it looks, you push the performance, you pull the performance, and in that sense [sameness] turns into a nonissue.”
Carr and Jarvis said the solution is to provide a “control layer” where the AI output is just the starting point. By generating 3D data instead of flat video, the creator can change the lighting, move the camera or adjust a character’s pose after the AI has done its initial work, making the technology a sophisticated power tool rather than a replacement for the artist.
Founder Andrew Carr said one of his core scientific hypotheses is that movement and motion are a fundamental data type.
The future of animation with AI
Beyond just making animation faster and lowering the barrier to entry, the company is looking toward a concept it calls “open-ended storytelling” or “open-ended world-building.” In modern gaming and social media, the demand for content has reached a scale that manual animation can’t possibly match.
Cartwheel envisions characters that aren’t just programmed with a few set moves but are powered by motion models that allow them to react and perform in real time. It’s less about choreographing every single frame and more about “rehearsing” with a digital actor that understands the intent of the scene.
Ultimately, the goal is to bridge the gap between 2D vision and 3D execution, the founders said.
“One of the core hypotheses that we hope is true in the next three years for Cartwheel is everyone will work in 3D even if it’s authored in 2D, even if the final output is just 2D video,” Carr said.
By focusing on the “layer beneath the pixels,” Carr and Jarvis said they hope that as animation becomes more automated, it also becomes more personal. The machine handles the biomechanics and the file exports, but the human retains the final say on the style, the timing and the heart of the story.

