Can Google Push for AI Scraping To Be Deemed ‘Fair Use’?

Google wants copyright law changed so that AI systems can freely scrape and use online content without permission. The proposal is controversial: it faces significant legal barriers and, if adopted, risks undermining creative industries.

What Google wants

In a recent statement to the Australian government, Google advocated that copyright systems “enable appropriate and fair use of copyrighted content to enable the training of AI models.” This would let Google ingest books, news articles, images and more to train AI bots like Bard without asking for creators’ consent.

Google likely wants the same rules adopted globally. The proposal says it would still allow “workable opt-outs” for entities that prefer their data not be used. But turning scraping into an automatic right that creators must manually block after the fact would be a major change.

Is scraping already legal?

Some legal experts argue that AI training already constitutes “fair use,” meaning no change in the law is needed. Fair use doctrines allow limited use of copyrighted material without permission for purposes such as research or commentary.

However, multiple lawsuits against Google and OpenAI contend that scraping for AI training infringes creators’ rights. Whether a use is fair depends on factors like:

– Whether the use is transformative or competes with the original

– The amount of the work copied

– The impact on the market for the original

Key areas of dispute include:

Direct copying

Google’s Search Generative Experience (SGE) copies sentences verbatim from websites, closely replicating the original expression.

Market harm

Promoting AI-generated content over original sources threatens creators’ traffic and revenue.

Legal hazards

Scraping factual data is more likely to qualify as fair use than copying creative works. Copying writing styles and character details carries a high risk of infringement.

For now, whether systematic AI scraping constitutes fair use remains legally ambiguous. Codifying it as such, as the proposal would, preempts years of pending litigation.

What would need to change?

Copyright law gives creators control over reproductions and derivatives of their work. Making AI training legal by default would upend this opt-in system.

Under an opt-in approach, the onus is not on creators to actively prevent unauthorized uses; anyone who wants to use a work needs permission in advance. This respects creator consent and gives creators leverage in negotiations.

An opt-out regime transfers that leverage to deep-pocketed tech firms. With no requirement for affirmative consent, they can claim broad rights while creators scramble to block individual uses.

Does fair use just mean citing sources?

Google suggests that clearly identifying training data sources makes scraping fair. But the legal and moral obligations around copyright are not the same.

Citing the source of a copied passage doesn’t make the reproduction lawful. Nor do detailed citations necessarily amount to adequate attribution: quotes require both inline attribution and specific links.

Even with citations, AI-generated text that competes for reader attention alongside its “source” material likely still harms creators commercially.

What’s driving this?

Google has clear financial incentives to secure free access to training data. But it also faces growing legal threats.

Word-for-word copying in products like SGE leaves Google vulnerable to copyright lawsuits. Generating stories with copyrighted characters also risks infringement claims.

Securing explicit legal rights to scrape data would shield Google’s AI products from much of this litigation. But that protection would come at the expense of creators’ rights.

Before making such changes, governments should carefully weigh the impact on both innovation and creative industries. As AI capabilities grow, a balanced way forward is needed that respects all stakeholders.

Source: https://www.cryptopolitan.com/google-ai-scraping-to-be-deemed-fair-use/