OpenAI claims The New York Times tricked ChatGPT into copying its articles

GlitzyArmrest@lemmy.world · edit-2 1 year ago

SheeEttin@programming.dev · 1 year ago

The problem is not that it’s regurgitating. The problem is that it was trained on NYT articles and other data in violation of copyright law. Regurgitation is just evidence of that.

blargerer@kbin.social · 1 year ago

It’s not clear that training on copyrighted material is in breach of copyright. It is clear that regurgitating copyrighted material is in breach of copyright.

000@fuck.markets · edit-2 1 year ago

There hasn’t been a court ruling in the US that makes training a model on copyrighted data any sort of violation. Regurgitating exact content is a clear copyright violation, but simply using the original content/media in a model has not been ruled a breach of copyright (yet).

SheeEttin@programming.dev · 1 year ago

True. I fully expect that the court will rule against OpenAI here, because it very obviously does not meet any fair use exemption.

520@kbin.social · edit-2 1 year ago

For that to work, NYT has to prove OpenAI is copying their words verbatim, not just their style.

If the AI isn’t outputting a string of words that can be found in an NYT article, they don’t stand a chance.

kromem@lemmy.world · 1 year ago

Tell me you haven’t actually read legal opinions on the subject without telling me…

SheeEttin@programming.dev · 1 year ago

I’m not aware of any federal case law on copyright and AI. Happy to read some if you have a suggestion.

kromem@lemmy.world · 1 year ago

Case law hasn’t been established yet, but lawyers who have litigated copyright or worked at the Copyright Office have written on the topic:

https://www.eff.org/deeplinks/2023/04/how-we-think-about-copyright-and-ai-art-0

https://docs.house.gov/meetings/JU/JU03/20230517/115951/HHRG-118-JU03-Wstate-DamleS-20230517.pdf

regbin_@lemmy.world · 1 year ago

Training on copyrighted data should be allowed as long as it’s something publicly posted.

assassin_aragorn@lemmy.world · 1 year ago

Only if the end result of that training is also something public. OpenAI shouldn’t be making money on anything except ads if they’re using copyrighted material without paying for it.

themusicman@lemmy.world · 1 year ago

I was trained on copyrighted material… I guess I should work for free

ricecake@sh.itjust.works · 1 year ago

Why an exception for ads if you’re going that route? Wouldn’t advertisers deserve the same protections as other creatives?

Personally, since they’re not making copies of the input (beyond what’s transiently required for processing), and they’re not distributing copies, I’m not sure why copyright would come into play.
