More on Copyright and AI by Kelley Way
Let’s welcome back monthly columnist Kelley Way as she shares with us “More on Copyright and AI.” Enjoy!
***
How Your Content Might Find Itself in a Training Database
I recently attended a writers’ conference, and a hot topic among attendees was copyrights and AI.
However, they weren’t asking about the copyrightability of AI-generated content or whether AI-generated content might contain someone else’s copyrighted material.
Their big concern was whether their own content was already part of an AI training database, or could end up in one.
It’s an excellent question, so let’s cover how this might happen.
1. Your Content Was Collected into a Data Aggregator
Some companies collect data from several different sources and pool all this data into a big database.
They then sell this data to other companies.
Some AI companies (not all, mind you) purchased licenses to copy all this data into their training databases.
So if your content made it into one of these databases, it would have been incorporated into the AI program.
There are places online that will tell you which databases were copied by which AI companies; you may even be able to tell if your content is in a specific database.
2. Your Content Was Added to an AI Training Database
This is most likely to occur when you are hired to write something (for example, you’re paid to write a blog post or magazine article), but it could also happen with a publisher.
Essentially, the company takes your content and adds it to an AI training database to further their own company goals.
To combat this practice, the Authors Guild encourages all writers to have a provision in their contracts stating that their content cannot be used to train an AI program.
There are some model provisions on the Authors Guild website that can be used for this purpose. If you’re concerned about asking for this provision, keep in mind that many companies now include a provision stating that the author cannot use AI to write content for them, so asking for this in return would be a fair exchange.
3. You Give Your Content to an AI Training Database
Some AI companies add all user inputs into their training database, so every time you ask the program a question or give it some content to critique, you are giving it more content to train on.
This is not true of all AI programs, and a little bit of due diligence will tell you whether a specific program does this or not.
Keep in mind that different AI companies have different policies.
Some AI companies built their databases using only their own content, and others have shown varying levels of care in what content they use for training their programs.
With the legal landscape shifting against the use of copyrighted content to train AI, more companies are adding ways to “opt out” of having your content in their program.
Change is coming quickly, so stay tuned for further updates.
To Learn More about Copyright and AI
In the meantime, if you want to learn more about copyright and AI, you are welcome to email me at kaway@kawaylaw.com.
***
Want to read more articles like this one on Writer’s Fun Zone? Subscribe here.
***
ABOUT THE AUTHOR
Kelley Way was born and raised in Walnut Creek, California. She graduated from UC Davis with a B.A. in English, followed by a Juris Doctorate. Kelley is a member of the California Bar, and an aspiring writer of young adult fantasy novels. More information at kawaylaw.com.




