- cross-posted to:
- freeebooks@sh.itjust.works
An activist group has claimed to have scraped millions of tracks from Spotify and is preparing to release them online.
Observers said the apparent leak could boost AI companies looking for material to develop their technology.
A group called Anna’s Archive said it had scraped 86m music files from Spotify and 256m rows of metadata such as artist and album names. Spotify, which hosts more than 100m tracks, confirmed that the leak did not represent its entire inventory.
The Stockholm-based company, which has more than 700 million users worldwide, said it had “identified and disabled the nefarious user accounts that engaged in unlawful scraping”.
“An investigation into unauthorised access identified that a third party scraped public metadata and used illicit tactics to circumvent DRM [digital rights management] to access some of the platform’s audio files,” said Spotify.
Spotify does not believe the music taken by Anna’s Archive has been released yet. Anna’s Archive, which is known for providing links to pirated books, said in a blog it wanted to create a “‘preservation archive’ for music”.
The group claimed the audio files represented 99.6% of all music listened to by Spotify users and would be shared via “torrents”, a means of sharing large digital files online.
“Of course Spotify doesn’t have all the music in the world, but it’s a great start,” said Anna’s Archive, which describes its mission as “preserving humanity’s knowledge and culture”.
“With your help, humanity’s musical heritage will be forever protected from destruction by natural disasters, wars, budget cuts and other catastrophes,” said the group.
All that lossy-ass bitrate music that sounds like a telephone underwater, shame
The funny thing is, from what I’ve read, they got a lot of raw audio files too, so the people torrenting are probably getting higher-quality versions than what Spotify transcodes to.
160kbps OPUS is okay. At that point the biases in the ADC and recording equipment matter more.
The same Anna’s Archive that allows free anonymous downloads that are throttled to the speed of a 1990-era modem unless you pay?
Yes, I’m sure preservation and social good is their goal. Definitely not about making money.
Any idea what it costs to reliably store this data, let alone have the bandwidth to upload it to others?
This ain’t a cheap game, no matter what the intentions are. I have no problem with paid content because it costs money to have it there. I pay Spotify, but I’d rather pay Anna’s Archive.
Not really on topic, but I recently swapped from Spotify to Qobuz and it’s been a pretty good experience if you’re interested in paying someone who’s not Spotify
I’m just some random on the internet but I’d rather you didn’t pay Spotify. Here’s one example for why, among many.
In November 2021, Prima Materia, an investment company cofounded by Ek and Spotify investor Shakil Khan, was announced as the lead investor in a €100 million fundraising round for Helsing, a European defense company that develops military strike drones and AI systems. Ek also joined Helsing’s board along with its co-founders Torsten Reil, Gundbert Scherf and Niklas Köhler.[171] In 2025, Ek’s firm increased its investment in Helsing, leading a €600 million funding round in the company that valued it at €12 billion.[172]
Several artists have objected to Ek’s investments in defense technology. The German DJ and techno producer Skee Mask urged his nearly 17,000 Twitter followers not to give their “last penny to such a wealthy business that obviously prefers the development of warfare instead of actual progression in the music business.” According to Sameer Gupta, a percussionist based in Brooklyn, New York, “All that money that’s being taken from artists and musicians is being funneled to this,” referring to Helsing. “I don’t know a single musician who would ever say, ‘That’s the function of music.’”[173] In 2025, the bands Deerhoof, Xiu Xiu, King Gizzard & the Lizard Wizard, Godspeed You! Black Emperor, Hotline TNT, Massive Attack, and Sylvan Esso removed their music from the streaming service over objections to Ek’s investment in Helsing.[174][175][176][177][178]
My thoughts, too. Now, there will be a court case, and Anna’s will be shut down. Because, in court, money almost always wins.
It’s kind of ironic that a preservation focused organization didn’t have any sense of self-preservation. If they quietly scraped and archived songs over time and in the background, there never would have been any attention brought on them.
Clandestine organizations are hard to keep a tight rein on. All it takes is one deviant to sway others and start a foolish snowball effect…
Why should they hide if AI companies are doing it too and even profiting from it without repercussions?
AI companies have rich, connected, evil people backing them; these people don’t.
I’m backing them. I lost the code to my swiss bank accounts, tho.
I know a Nigerian prince who can help you with that
AI companies train their software on copyrighted content. They don’t spread direct copies of that content around.
Seems unlikely that they will be successful in shutting it down. If that were the case, they would have been shut down over the books already.
They have lost a dozen court cases in as many countries. They are still up.
Ya, this is surely the beginning of the end for them. They aren’t an “AI” company, so the full force of the government will come after them now that they have been named in a mainstream publication.
They’re decentralized, though. Hammer them down and a mirror will pop right up. Clearly they are also willing to work with places that are out of reach of Western Copyright law as well, given their prior interactions with Deepseek’s development.
TIL they are decentralized. That does make keeping them offline harder, but it also makes issues like honeypots and malicious mirrors more likely as sites come and go.
… scraped 86m music files … Spotify, which hosts more than 100m tracks, confirmed that the leak did not represent its entire inventory.
“Neener neener, you only got 86%!”
“It could have been worse.”
“They didn’t get everything, so we win!”
LLMs will at least be well trained in Newspeak.
According to the other article I read, Spotify has about 256 million tracks. Not sure why the Guardian is saying over 100 million.
Technically true
In the same way it’s true that Elon Musk is a millionaire, I guess lol.
Technically correct is the best kind of correct
We only need a 300TB iPod and we are done.
Is Anna’s located in the US?
Not located anywhere; the archive is mirrored all over the world.
Theoretically a node could be (since Anna’s is decentralized and not consolidated), but in practice I think it’s reasonable to believe none exists. The website’s just accessible by US internet users and hosted somewhere outside the DMCA’s grip.
That’s what I was wondering about: whether the American government can get to them.
Probably not, but doesn’t stop them from issuing the takedown request, I suppose.
If the servers are in China or Russia, I guess the takedowns are useless.










