Licensing in the Age of AI

2 min read

Web crawlers have been around since the beginning of Google's revolutionary search feature. The internet became searchable and widely more usable. This is a good thing. The bad is the enshittification with ads on those results and prioritizing only to those who participate in the made-up industry of search engine optimization.

Now, we have AI that is endlessly hungry for data. Crawlers, scrapers, whatever you may call it, they are coming for our content. Content is consumed during the training process of some of these models. Due to the closed-source nature of many of these commercial models, we will never know what really went into them. Certainly many have consumed code licensed under GNU AGPL v3 which is a notorious "poison-pill" license which forces derivative works to be permissively licensed. On the creative side, certainly works under Creative Commons' No Commercial, Share Alike, and No Derivatives licenses have also been consumed. Surely the AGPL v3 and the Share Alike version should have forced something out of these closed models right?

This all comes down to "what is fair use?" at least in the US. We don't seem to where to draw the line and we probably won't until the many courts of the world come to some kind of consensus. Or maybe we can go the patchwork route and let some countries just be the AI-haven of the world, much like Ireland has been the corporate tax haven thanks to the "Double Irish".

I haven't even talked about actual copyrighted material which is apparently OK to use. Had those cases swung the other way, Meta and Anthropic would have owed monetary damages.

I am no lawyer, and I am no true "artist" in the sense, but we ought to be protecting the creatives who create works that inspire us and motivate us to do something more than just make money. AI has a place in our world, but in this big shift, we should be equitable about it. We should to put it simply, play nice with one another.

As I have been thinking about licensing, I have added the CC-BY-NC-SA 4.0 license to the content of my blog. The contents are always written by me except where noted. The code itself for my blog was "vibe-coded" as a weekend project to get myself off Hugo and onto something I "owned." That said, I recognize that the LLM used to do the coding has ~probably~ definitely consumed AGPL-licensed code, so it is only right that my source code also be under an open license. My source was already on the MIT license, but I have clarified that in the footer and README of the source repo. I have also clarified that my posts which are written in Markdown in my source are also under CC-BY-NC-SA 4.0.

While licensing can sound boring, I encourage everyone to read about open source licensing. GitHub has a pretty good guide. TLDRLegal has pretty simple summaries of rights of licenses and various other legal agreements.

Webmentions

Loading...