Your Music May Be in These AI Training Datasets. Here Is How to Check.
The Atlantic published four searchable databases covering 21 million tracks used to train AI music models. Any artist can search right now to see if their recordings were scraped without permission.
Short answer
On June 21, 2026, The Atlantic released four searchable databases identifying music used to train generative AI music platforms. The databases cover approximately 21.2 million tracks, including music from major artists and tens of thousands of independent musicians. Any artist can search by song title, artist name, or ISRC to see whether their recordings appear. A Nashville musician searching the tool on June 22 found 71 percent of his discography in the dataset. The presence of a recording in the database does not by itself prove infringement, but it gives artists concrete evidence relevant to the ongoing copyright cases against Suno and Udio, and against Google over its Lyria 3 model.
Key takeaways
- On June 21, 2026, The Atlantic published four searchable databases identifying music used to train AI music models. Combined, they cover approximately 21.2 million tracks.
- Any artist can search by song title, artist name, or ISRC to see whether their recordings appear. The tool is free and requires no account.
- Appearing in the database does not prove infringement by itself, but it gives artists concrete evidence that an AI company had access to their work.
- The databases are directly relevant to the active copyright cases against Suno, Udio, and Google over its Lyria 3 model.
What happened?
Alex Reisner, a reporter at The Atlantic, published four searchable databases on June 21, 2026, identifying music used to train generative AI music platforms. Two of the datasets each hold about 100,000 recordings. The other two are far larger: one contains approximately 9 million tracks, the other roughly 12 million.
The databases include recordings from major artists alongside tens of thousands of lesser-known independent musicians. A Nashville-based independent artist searching the tool on June 22 found that 71 percent of his discography, 48 of his songs, was present in the dataset. He told local news he had invested more than 100 hours per song.
Why this matters for independent artists
Before these databases existed, an artist who suspected their music had been scraped had no practical way to check. Suspicion is not evidence. What the Atlantic tool gives you is a specific, searchable record of which AI companies had access to which tracks.
That is a different thing from proof of infringement. The major labels, coordinated by the RIAA, have been building their copyright cases against Suno and Udio (filed June 2024) without this kind of direct lookup tool. The independent artists behind the class action against Google over Lyria 3, filed in March 2026, face the same challenge: showing that a specific work was used, not just that large amounts of copyrighted music were swept in. A database entry is a starting point for that argument, not a concluded case.
What the databases add is searchable sources. Artists and labels can now check whether specific recordings appear in a training dataset instead of relying on general claims about the scale of scraping.
For most independent artists, the practical value right now is knowing whether your work appeared. That knowledge informs what steps you can take if enforcement paths become clearer.
How to check if your music is in the databases
Search The Atlantic tool
The Atlantic published its searchable databases alongside the investigation. You can search by artist name, song title, or ISRC. No account is required. If you find your work, note the dataset name and the specific track information so you have a record.
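If you are checking more than a handful of tracks, a small log file makes the record easier to keep and to hand to an attorney later. The sketch below is one way to do it, not anything the Atlantic tool provides: the `DatasetHit` class, its field names, and the example values are all hypothetical, and you would fill them in by hand from what the search results show.

```python
import csv
import sys
from dataclasses import dataclass, asdict, fields
from typing import Iterable, TextIO


@dataclass
class DatasetHit:
    """One search result worth preserving. All field names are hypothetical."""
    checked_on: str    # date you ran the search, e.g. "2026-06-22"
    dataset_name: str  # name of the dataset as shown by the tool
    artist: str
    title: str
    isrc: str          # leave empty if you do not know it yet


def write_hits(hits: Iterable[DatasetHit], out: TextIO) -> None:
    """Write the hits as CSV rows, starting with a header row."""
    writer = csv.DictWriter(out, fieldnames=[f.name for f in fields(DatasetHit)])
    writer.writeheader()
    for hit in hits:
        writer.writerow(asdict(hit))


if __name__ == "__main__":
    # Placeholder values only; replace them with what the tool actually shows.
    example = DatasetHit("2026-06-22", "example-dataset",
                         "Example Artist", "Example Song", "")
    write_hits([example], sys.stdout)
```

Passing an open file instead of `sys.stdout` (opened with `newline=""`, as the `csv` module recommends) saves the same rows to disk. A dated screenshot of each result is a useful companion to the log.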
What to do with a result
Finding your recording in the database does not mean you should immediately take legal action. It means you have evidence worth preserving. If you are already in contact with a music rights attorney or a collective action related to AI training, share what you found. If not, keep a record of what appeared and check back as the legal landscape develops.
Your ISRC is the most precise search term because it identifies one specific recording, while titles and artist names can collide with other releases. If you do not know the ISRC for a recording, your distributor or publishing administrator should have it on file. It may also appear in the track details on some streaming platforms.
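ISRCs get written in several ways (with or without hyphens, in mixed case), so it can help to clean yours up before searching. The sketch below checks a code against the standard ISRC layout from ISO 3901: a two-letter prefix, a three-character registrant code, a two-digit year, and a five-digit designation code. The function names are hypothetical, the sample code is made up purely to show the format, and whether the Atlantic tool expects hyphens is an assumption you would need to test by trying both forms.

```python
import re
from typing import Optional

# ISRC layout per ISO 3901: 2 letters, 3 letters or digits,
# 2-digit year, 5-digit designation code (12 characters in total).
_ISRC = re.compile(r"[A-Z]{2}[A-Z0-9]{3}[0-9]{2}[0-9]{5}")


def normalize_isrc(raw: str) -> Optional[str]:
    """Return the compact 12-character ISRC, or None if the layout is wrong."""
    compact = re.sub(r"[\s\-]", "", raw).upper()  # drop spaces and hyphens
    return compact if _ISRC.fullmatch(compact) else None


def display_isrc(isrc: str) -> str:
    """Format a compact ISRC in the hyphenated form often used in print."""
    return f"{isrc[:2]}-{isrc[2:5]}-{isrc[5:7]}-{isrc[7:]}"


if __name__ == "__main__":
    # "zz-abc-26-00001" is a made-up code used only to show the format.
    cleaned = normalize_isrc("zz-abc-26-00001")
    print(cleaned)
    if cleaned is not None:
        print(display_isrc(cleaned))
```

This only checks the shape of the code. It cannot tell you whether the ISRC was ever assigned or that it belongs to your recording, so confirm it against your distributor's records.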
What is still unclear?
Open questions
The databases show what was available and shared among AI developers, not necessarily what was ingested and used for training. AI companies can argue their model was not trained on a specific track even if that track appears in a shared dataset. How courts weigh that distinction is still being worked out in the active cases against Suno, Udio, and Google. There is also the question of what legal options are available to independent artists who are not part of the existing class actions and whose music appears in the databases. That is not settled. What is clear is that the tool gives you a factual starting point that did not exist a week ago.