cross-posted from: https://programming.dev/post/8121669
Japan determines copyright doesn’t apply to LLM/ML training data.
On a global scale, Japan’s move adds a twist to the regulation debate. Current discussions have focused on a “rogue nation” scenario where a less developed country might disregard a global framework to gain an advantage. But with Japan, we see a different dynamic. The world’s third-largest economy is saying it won’t hinder AI research and development. Plus, it’s prepared to leverage this new technology to compete directly with the West.
I am going to live in the sea.
www.biia.com/japan-goes-all-in-copyright-doesnt-apply-to-ai-training/
I think the “learning” process could be similar, but the issue is the scale.
No human artist could integrate that amount of material at the speed these systems can. The systems are also, by definition, nothing but derivative. I think the process is similar, but there is important nuance that supports a different conclusion.
I don’t think the scale matters. Do we treat human artists who read only 10 books or watched only 10 movies before creating something differently from those who consumed 1,000 or 10,000? For me the issue of who controls it is much more important. If something like ChatGPT were truly open-source and people could use it any way they want, I would have zero issues with the models being trained on everything that is available. We desperately need less copyright, not more. Right now I think going after the big AI models with copyright is a double-edged sword. It’s good to bring them down, but not at the cost of strengthening copyright.