When AI training data gets laundered (i.e. stolen): The Week in Review
Also this week: Samsung's GenAI, Emmy Nominations, and jumping in Elden Ring.
This week saw yet another development in the ongoing “move fast and take things” saga when Proof News co-published with WIRED an investigation into the use of YouTube videos to train AI—without the creators’ awareness or permission. From the report: “Our investigation found that subtitles from 173,536 YouTube videos, siphoned from more than 48,000 channels, were used by Silicon Valley heavyweights, including Anthropic, Nvidia, Apple, and Salesforce.”
At this point we think it’ll be helpful to stop with the “swipe,” “scrape,” and “siphoned” euphemisms and use the word that most accurately describes what this is: stolen. As The Verge pointed out in their follow-up article, “YouTube has said using creators’ content to train AI systems would violate its terms of service—so what happens if they did?” What indeed?
Responses to the Proof News/WIRED investigation have been both predictable and fully justified. Video journalist Joss Fong found some of her own work in the database and wrote on Threads, “welp there it is 10 years of our work... into the feedstock.” Marques Brownlee, aka MKBHD, called it out for exactly what it was: “AI is stealing my videos.” As alums of The Verge and Polygon, we found around 250 videos across the two outlets (41 from Polygon and 203 from The Verge) had their subtitles stolen—and if you’re curious about your own work, or that of your favorite creator, there’s a handy tool to find out which videos are in the data.
But, as MKBHD pointed out, Apple and the other beneficiaries of this dataset didn’t technically steal it; that was done by a third-party company that then sold the data to these more front-facing companies. When Figma was recently accused of copying Apple’s Weather app in its Make Design feature, CEO Dylan Field blamed it on the training data provided by a third party. The “we didn’t steal it, honest” defense.
So what does this mean beyond the tech companies doing yet more shitty things when it comes to AI? We’re not sure yet. But we’re watching this develop, and it does make us wonder whether we’re nearing a “Napster moment,” like the one when musicians finally stood their ground and called out the company for stealing their music. If, as the saying often attributed to Mark Twain goes, “history doesn't repeat itself, but it often rhymes,” we’re wondering who’ll be the Lars Ulrich of the creator community and stand up as the biggest voice against AI.
Lars Ulrich, for those who don’t know, is the drummer and co-founder of Metallica, arguably one of the most successful metal bands in the world. In addition to being the drummer, Ulrich is also regarded as the most business-savvy member of the band. In 2000 he filed a lawsuit against Napster for copyright infringement and racketeering. But he wasn’t done.
In July 2000, Ulrich (along with Roger McGuinn of The Byrds) testified before the Senate Judiciary Committee, where he said, "I do not have a problem with any artists voluntarily distributing his or her songs through any means that artist so chooses. But just like a carpenter who crafts a table gets to decide whether he wants to keep it, sell or give it away, shouldn't we have the same options?"
The case was settled out of court and, well, the rest, as they say, is history: Napster went away. (And then returned, kinda sorta, in name only, as a small-time music streaming service.) Will the same happen with OpenAI, Anthropic, etc.? That’s still unclear. But The New York Times has filed a lawsuit that is still in progress, and perhaps there will be others. We’ll have to wait and see.
And while we wait, here are some of the things we’ve read, watched, played and genuinely enjoyed or found inspiring over the past week:
What we’ve read
Samsung’s new image-generating AI tool is a little too good (The Verge). I've always loved the look of Samsung's new Galaxy Z Fold and Flip phones, but for the new 6th generation, it's a software feature I'm most fascinated by. Sketch to Image does exactly what the name suggests, turning your rough drawings into something more convincing with generative AI. The tool seems to work both as a blank canvas (not unlike Apple Intelligence's forthcoming Image Wand) and as a way to augment existing photos. The generative-augmented photos that The Verge’s Allison Johnson shows here are a lot more uncanny than I think we're used to, and certainly more than we’re gonna be ready to handle as a society in the midst of a misinformation crisis. I'll admit, my first reaction was fascination before dread, but it's yet another example that we are very quickly hitting a point where every photo, even the most authentic-looking ones, should be observed from a place of doubt first and foremost. Which has maybe always been the case, but I don’t think we’re ready for how ubiquitous the tools are becoming. —Ross
Experiment finds AI boosts creativity individually — but lowers it collectively (TechCrunch). Researchers at University College London and the University of Exeter in the UK looked into how using AI helped different groups write creative short stories. The findings suggested that while using AI helped people less experienced in creative writing, it made little difference to writers more familiar with it. But the most interesting finding is the impact of using AI to write as a group: their stories became less unique and more similar. Just as generative AI images increasingly look more and more AI-like, this research suggests the same thing happens when using AI to write stories. —James
The making of Eno, the first generative feature film (The Verge). We describe the MBH4H brand as “human-powered AI” but the film Eno by director Gary Hustwit and his partner Brendan Dawes may be a more worthy owner of the tagline. To quote directly from The Verge’s piece: “Eno is crafted through 30 hours of interviews and 500 hours of film—a curated and ethically sourced data set—with certain pieces weighted to be more likely to appear.” As a big Brian Eno fan—I listen to his instrumental music on an almost daily basis—it seems somehow apt that a documentary about the man and his work would have a somewhat technologically avant-garde slant to it. —James
GPT-4o mini (OpenAI). We spend a lot of time talking about the ethical concerns, but at the same time, it’s important to follow the consumer side of this story as well. There are a lot of free generative tools out there, and while ChatGPT is one of the dominant players, like everyone else, it’s doing everything possible to expand its user base. GPT-4o mini will ostensibly replace GPT-3.5 for both free and premium users. Sure, it may not be as powerful as the high-end model, but the more accessible these tools are, the more normalized GenAI will become. —Ross
There's a new Deadpool Xbox controller — and it has butt cheeks (Mashable). The Deadpool & Wolverine marketing campaign has been very over-the-top, which is fitting for both the character itself and not-so-secret marketing genius Ryan Reynolds. None of which should be surprising—after all, the original Deadpool marketing campaign was exhaustive enough to have its own Wikipedia page. We might say more about this next week, but for now, please enjoy the new high-low-point of an Xbox controller “designed by Deadpool,” which adapts the lower half of Reynolds and, because we’re talking about cheeky humor here, features Deadpool’s derriere protruding from the rear. —Ross
What we’ve watched
Shogun. I consider Shogun to be as close to a perfect show as it is possible to get, so I am not surprised it made history with 25 Emmy nominations. All this news needs now is a suitable version of the Mariko meme, something like: “Mariko: The Anjin says to give us all the Emmys.” —James
All Things Apple TV+. I am convinced Apple TV+ is the new HBO. The number of banger shows on Apple’s streaming platform is extraordinary: Silo, Slow Horses, The Morning Show, Drops of God, Lessons in Chemistry, Masters of the Air, Physical, Hijack, The New Look, For All Mankind, Trying, Presumed Innocent and Sunny; the list goes on and on. With the news this week that Apple is in talks to license more old Hollywood movies, perhaps Apple TV+ will become more than the new HBO; maybe it’ll be the new Criterion Collection too. —James
The Handmaid’s Tale (Hulu). Watching this series for a second time in 2024 is a very different experience from watching it in 2017. It is still beautifully made but far more seriously disturbing—and it was a hard enough watch the first time around. —James
Solving Crimes with Charisma. I love the Emmys and I love Emmy Award-winning series. But sometimes, in fact many nights lately, I just want to watch something entertaining and wholly unchallenging. For me, that means going back to procedural crime shows "with a twist". The past week, I've been juggling between two such TV shows: Lucifer and The Mentalist. In the former (streams on Netflix), the devil absconds from Hell to run a nightclub in Los Angeles and help the LAPD solve murders. In the latter (on Max), a former fake psychic uses his superhuman observational skills to investigate crime. Each episode focuses on a largely self-contained story with some passing references to an overarching narrative that they’ll get around to addressing, eventually, but the point is to present audiences with a crime that only very attractive investigators can solve and will do so in about 40-60 minutes each. It's great. Maybe a bit trashy, but great. —Ross
Hot Ones. Last month BuzzFeed put Hot Ones owner First We Feast up for sale for $70 million. The price does seem a tad steep but even so, I consider Hot Ones a YouTube sensation, and if you haven’t watched the episodes with Idris Elba, Matt Damon, Josh Brolin, Conan O'Brien, Jennifer Lawrence or Lewis Hamilton, I seriously don’t know what you’re doing with your life. This is not just a show about watching celebrities sweat and cry in pain; it’s watching host Sean Evans conduct some of the best on-screen interviews I’ve ever seen. —James
“Can You Feel the Gwar-nergy?” It's been a transformative few months for The AV Club. The site, a one-time dominant voice in pop culture writing that was created as a non-satirical offshoot of The Onion but more recently languished under G/O Media, was bought by Paste earlier this year. This week, The AV Club roared back with the return of AV Undercover, a well-loved series where artists cover popular songs, often breaking out of their usual genre: for example, Reggie Watts covering Van Halen and They Might Be Giants covering Chumbawamba. After a 7-year hiatus, AV Undercover is back with a killer debut, metal stalwarts Gwar covering Barbie’s "I'm Just Ken" in the most Gwar way possible. Welcome back, y'all. —Ross
What we’re never going to watch
Alien: Romulus. It looks beautifully made, astonishingly good and absolutely fucking terrifying. I had nightmares after watching the trailer. There’s simply no way I would watch this film at home, let alone in a theater. Nope. Not a chance. Fuck that. —James
... I might. —Ross
What we’ve played
Elden Ring: Shadow of the Erdtree. Last week I wrote that I had beaten one of the hardest bosses ever to appear in a video game. This week I went one better: I learnt how to jump across an impossible gap in the flooded section of the Shadow Keep. It is difficult to overstate how many times I tried to make this jump only to fall to my death. I was losing my mind trying to time the jump to perfection only to fail every-single-time. It literally seemed impossible. That is until I found a video that made it clear I was jumping wrong. Turns out, you have to hold down the O button AND roll your thumb on the X button at the same time if you want to make this jump (and others in the game). It is the video game equivalent of heeling and toeing on a race track, i.e. using your toes on the brake pedal while using the side of your foot (or heel) on the accelerator pedal to keep the revs up. Watch this video of Ayrton Senna, a master of heeling and toeing (while wearing loafers no less), driving a Honda NSX at ungodly speed around Suzuka in Japan to see what I mean. It’s been a while since I’ve done any heeling and toeing but I don’t mind, at least I can make that jump in the back of the Shadow Keep in Elden Ring. —James
Gotta Go Fast. In lieu of playing anything new this week, I spent much of my downtime catching up on Summer Games Done Quick, a twice-a-year event where players and fans gather for a week of speed runs—wildly performative gameplay where players beat games in ludicrously fast time using sometimes game-breaking exploits as if they were an intended feature—to raise money for Doctors Without Borders. This SGDQ raised over $2.5 million, and showcased some wildly good runs like a blindfolded Mario 64, a no-hit run of the ultra-hard Sekiro: Shadows Die Twice in 37 minutes flat, and my favorite: a dog beating Ken Griffey Jr.'s baseball game from the 1990s. Sure, I didn't personally play any of these, but there's a good argument to be made that watching a video game is basically playing it.