How Generative AI's intellectual property issues could threaten freedom of information
Are companies panicking over AI-generated content? Bans, lawsuits & copyright claims are rising, but is the fear of unregulated content justified? Explore how intellectual property claims against generative AI could limit the internet's freedom of information
What's the first song you downloaded online? For me, as a teenager in the early 2000s, it was Pennywise's "Bro Hymn", off of Napster. We kept getting told it was illegal to download on Napster. We didn't care. My local record store only carried Céline Dion's discs. I couldn't have gotten my hands on a Pennywise record even if I had wanted to.
I believe this is the state of mind data scientists were in when they built generative AI tools. Intellectual property laws? Privacy laws? Who cares? Move fast and break things, right?
Turns out that lots of people do care, and the consequences will be awful. I believe the backlash against generative AI will worsen individuals' access to information, driven by risk aversion.
To build their large models, data scientists scraped the public internet recklessly. Now the train has left the station. That won't stop anybody who feels their pockets are about to get lighter from reacting. We, the consumers, will be worse off for it.
This week, I look at some disturbing trends that you should be aware of, and at a potential solution:
- the risk aversion conundrum
- the "Great API Closure"
- art and copyright breaking off from each other
- the EU's AI Act's potential
Asking forgiveness instead of permission feels like the only way to go
I admit I jumped the gun on Google's fate after ChatGPT went viral. While I still stand by many claims in my article, I was harsh on Google's risk aversion. I realize I may be part of the problem I want to discuss today.
I'm sure Google's assessment teams foresaw the EU authorities' bans, the intellectual property class action lawsuits, and the jailbreaks of its language models. And there is a way to do it right! But it takes time, effort, negotiations, and giving away pieces of the pie.
Remember Spotify? This is the company that succeeded in doing what Napster should have done. Spotify got the desperate record companies on board by giving them part ownership of the company.
Then there is the other way.
You do it fast, and you manage to get too big to fail before the pendulum swings back with lawsuits and regulators. Call it the Facebook way.
Here's the conundrum: a small company like OpenAI could not afford the hard way, especially since the US Court of Appeals confirmed web scraping of publicly available information is legal.
So what happens if you are a company that feels slighted by ChatGPT? Well...
We're going to witness an internet closure
The most worrying trend for me has now begun. First, it was Twitter. Elon Musk has been goofing with Twitter for the past 6+ months, but paywalling its API may become the web's next big thing. Reddit recently made headlines, claiming "commercial usage" of its content will become a paid service. Developer forum Stack Overflow is following suit. Any add-on, client, extension, and of course any AI model, will need to share revenue.
Suddenly, any publicly available information has become a million times more valuable. In the near future, you will need a login to access any website.
The user experience will be worse than cookie banners. As a security specialist, I'm already worried about password fatigue and password reuse. The arms race will likely keep escalating: websites that don't require logins will implement weird CAPTCHAs to block AI scrapers instead, adding friction to users' experience.
Companies that are threatened the most by generative AI are subject to the psychological concept of "loss aversion" (a.k.a. a loss hurts more than an equivalent gain feels good). They will react disproportionately, and users will be caught in the crossfire.
Speaking of overreacting...
Copyright absolutism will have its day
Art transcends copyright. As a matter of fact, copyright is a very recent invention dating from the industrial era. Before that, artists earned their keep through religious affiliations or patronage. Travelling troupes would rehearse known stories passed down through oral tradition. Nobody owned the rights to the Knights of the Round Table. Homer didn't claim jack on Odysseus. Even Shakespeare borrowed heavily from history, mythology, and other playwrights.
Copyright is a business problem, not an art problem. This matters. Generative AI images, music and films will unleash an unprecedented wave of creativity on humanity. People like me, with average dexterity, can finally give life to their imagination. The toothpaste is out of the tube, and however the copyright lawsuits end up, AI art will keep existing.
Let me illustrate. An AI-generated Drake song blew up on TikTok and record companies lost their minds. They didn't learn anything from Napster. Read Universal's notice about its creator and you'd believe they're the scum of the earth. Meanwhile, a teeming hive of AI artists is growing on Discord, and Canadian songwriter Grimes is offering a 50% royalty split to whoever wants to license her voice.
Still, having creators earn their fair share for their labour remains a relevant question, at least in the short term. Getty Images, whose business generative AI threatens to make obsolete, is suing Stability AI for its life. Photographers are filing a class action lawsuit against MidJourney as well, and the US Copyright Office ruled that people cannot copyright an AI image they generated.
I dread a witch hunt of AI artists. Spotify was able to shepherd the record companies because they formed an oligopoly that owned the most popular music, so perhaps a new "industry-owned" AI generator could emerge. No such oligopoly exists in images. Could we suffer copyright absolutists shutting down AI image generators one by one? It's hard to blame data scientists for "asking forgiveness" when permission is associated with dinosaurs who see AI art as some extension of 19th-century industrial capitalism.
An AI Act could rescue a free internet
The EU seems to have struck a potential balance with its AI Act, which would classify models by risk level and impose transparency requirements instead of outright bans.
I am cautiously optimistic. However, the proposal comes from the same institutions that gave us cookie banners and classified IP addresses as personal information, so lawmakers still have time to misunderstand the technology enough to fumble this.
Based on the transparency principle, I think a convenient way to fend off scrapers would simply be to add a mandatory "opt-out" of training in a website's metadata, for example in a robots.txt file. It would be easy for a website to audit an AI company's compliance with the opt-out, given the transparency requirement. This would work better than logins, CAPTCHAs and other patterns to catch AI robots. API monetization is a path to walled gardens, which is not in the interest of most websites: they need to be found after all!
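A minimal sketch of how such an opt-out could piggyback on today's robots.txt machinery, using Python's standard library. The "AI-Training" user-agent token is my own invention for illustration; no standardized name for training crawlers exists:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a site that opts out of AI training
# while staying open to regular crawlers. The "AI-Training" token
# is an assumption, not an established standard.
ROBOTS_TXT = """\
User-agent: AI-Training
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A compliant training crawler would check the directive before
# scraping; an ordinary search crawler is unaffected.
print(parser.can_fetch("AI-Training", "https://example.com/article"))  # False
print(parser.can_fetch("SearchBot", "https://example.com/article"))    # True
```

Under the AI Act's transparency requirement, auditing compliance would then amount to comparing an AI company's disclosed training sources against these directives.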
Art carries the same potential. The next generation of image generators should offer artists who opt in a share of subscription revenue whenever a prompt explicitly invokes their style.
In the end, while the profit-sharing model of the AI giants does matter a lot, I prefer to think about generative AI from the perspective of a user, a builder, or a creator. I wish the incumbents would keep us in mind, because we can do great things when given freedom of information.
🥊 Latest In Tech
Privacy and Cybersecurity
- Google introducing a passwordless solution. I explained how phishing and password attacks account for up to 90% of security breaches. Passwords are an unreliable authentication method. Google replaces them with a secure "key" stored on a biometrics-protected phone, hence the term "passwordless". The technology is limited to Google services for now. The danger is losing your phone, of course! I prefer Microsoft's passwordless solution, which is embedded in the Microsoft Authenticator app, backed up in Microsoft's cloud, and therefore not strongly tied to a single device. Story
Business of Tech
- Apple's coaching service is the virtual assistant you've dreamed of. You know what you should not start these days? A journaling app. Apple will add journaling to its Health app, with the possibility of tying your entries to health recommendations. Apple gathers so much data about our daily habits that it is uniquely positioned to deliver on this "iLife" assistant. Story
- Siri is a failure... according to Apple employees themselves. The funny thing: I had to scrap my original description of the previous story, in which I was ecstatic about Apple's commitment to its products and about seeing it not get swept up in ChatGPT's virality. Nope. Turns out Apple is not so different: Siri seems to have received an internal kiss of death at Apple due to its inability to process human language well enough. Story
Artificial Intelligence
- Palantir demos AI-powered war planners. When we say AI will make everybody more productive, we mean it! Drone strikes were already an ethical concern; now imagine letting an AI command a deployment strategy! How do you train an AI like that? My only guess is turning it into a video game and having actual generals play fake missions. Story
- Google believes Open Source AI will triumph. In a leaked document that went viral, Google researchers claim that further iterations of large language models will need less expensive hardware. Google is questioning how it will keep a competitive advantage. Story
❓ Question of the Week
Do you believe a prompt can be classified as art?
If you like my content, subscribe to the newsletter with the form below.
Cheers,
PP