If you're going to make an AI music video, you better have a good reason why
Washed Out's "The Hardest Part" is a lesson for all generative artists.
To call generative AI controversial would be putting it mildly. As impressive as the technology can be, there are major questions around how these models were trained, and with whose (uncredited) work, especially this week. But while we contend with the ethics, artists are still finding ways to employ and experiment with these tools to see how they can be used to tell stories never seen before.
Earlier this month, chillwave artist Ernest Greene, who makes music under the name Washed Out, and director Paul Trillo released a music video for the song “The Hardest Part.” It’s being marketed as the first “official” music video made using Sora, OpenAI’s text-to-video generator. The backlash was unsurprisingly swift, but what struck me was how even some of the more diehard, pro-gen AI communities also reacted negatively: not to the tech, mind you, but to the structure and narrative of the video itself.
And that’s why I think “Hardest Part” failed to break away from the controversies around generative AI: Take away the technology angle, and you’re left with a maddeningly generic concept that does nothing to justify its use of the tool. It’s a hard-earned lesson that, even at the cutting edge of innovation, having a good story matters.
Music videos have long been at the forefront of pushing the boundaries of visual expression. It’s no coincidence that some of the most surreal movie directors got their start making some of the wildest music videos of their generation. I vividly remember watching and rewatching DVD sets collecting music videos by the likes of Spike Jonze and Michel Gondry, which seem silly now in the era of YouTube but were so enthralling to me during my grade school years. A much more recent example is Daniels: We may never have gotten the trippy-yet-heartwrenching Everything Everywhere All at Once if it weren’t for their video for DJ Snake and Lil Jon’s “Turn Down for What.”
In all these cases, the directors and the artists gave us something we had never seen before, and they did so in a way that evoked a narrative or an idea. It changes our relationship with the original song and inspires new ways to see the world.
“Hardest Part” does none of that. The video’s use of technology may be novel, but the story it tells, a dizzying and fast-paced look at a couple’s journey from meet-cute to family to end of life, has been done just as well, if not better, using more traditional techniques like physical sets and human actors. Put another way: If given a tool that can transport us to the worlds of our imagination, why stick to grocery stores and grade school hallways?
The imperfections, the distortions, and the uncanny sheen present here have all become hallmarks of generative work. They evoke a sense of unease and give the viewer the feeling that something is very off just under the surface. TV shows like Marvel’s Secret Invasion leaned into that unmistakable discomfort to great effect for the opening credits. Here, however, that discomfort feels less like authorial intent and more like a distracting reminder of the technology being used.
Throughout this, I’m reminded of a Kendrick Lamar music video from several years back. “The Heart Part 5” also uses very new, very controversial technology—in this case celebrity deepfakes—but does so in a way that enhances the message of the song. It’s worth watching in full but if you’re short on time, start at 1:40.
“Part 5” is a song that opens with Kendrick discussing the notion of learning perspective, and as the song and video progress, he takes on the perspectives of others in a very literal sense, wearing the faces of other famous and infamous Black men, including OJ Simpson, Kanye West, Jussie Smollett, Will Smith, Kobe Bryant and Nipsey Hussle. The effect is disconcertingly convincing, even now, but it also serves the message that Kendrick is conveying here.
As Pitchfork’s Marc Hogan wrote at the time, “the groundbreaking visual opens up a new world of creative potential for the problematic AI tool.” Because for all the concerns around new technology, if its use is justified, then audiences tend to forgive or even celebrate the process behind it.
Funny thing is, this isn’t even the first time a music video has adopted this concept. As James pointed out to me, Godley and Creme’s 1985 music video “Cry” employed a similar idea of blending faces to convey perspective, only then it was through the use of dissolve and wipe effects. It’s rougher, to be sure, but you can see the same thought process at work: start with the idea, then figure out the best technology to make it happen.
Speaking to Rolling Stone about the backlash, Greene likened the experiment to Dire Straits’ “Money for Nothing,” which in 1985 was one of the first times CGI was seen on television, and which has long been held up as one of the best music videos of the time. Claire Shaffer wrote a great piece on its history for Vice. I think it’s worth revisiting the intro here:
Gimmicks are catnip to an advertiser, something that awakens the consumer’s mind with a new stimulus (3D glasses, a museum full of ice cream) while enticing eyes towards a branded image or hands towards a wallet. When done creatively, gimmicks can either resemble an art form or become one—the entire basis for the modern music video.
Dire Straits’ use of CGI was a gimmick, but that was the point: A song that lyrically satirized the use of gimmicks and the corporatization of the music industry was paired with a video that checked all those boxes. It gave audiences something they had never seen before, but it was more than just fancy special effects.
I wonder, would the reaction to Greene and Trillo’s “Hardest Part” have been just as negative had the concept been more ambitious and, frankly, more intentionally weird? Trillo is talented at bending generative models like Sora to his will, as reflected in his more surreal experiments, which are very interesting and indicative of the kind of visuals that would be difficult to make without generative tools. But we need to match technological ambition with artistic ambition, and when it comes to music videos, audiences expect more.
AI is not a panacea for creative expression. You can’t just type words and make great art. It’s not going to create empathy. It’s not going to forgive a bad idea on gimmick alone. Great movies with lots of spectacle have failed just as often as movies with limited budgets have become behemoths.
There’s nothing wrong with a gimmick when paired with the right idea, and generative AI will be no exception. But if you’re going to play the card, you better have a good reason for doing so.