@glyph Generally I look suspiciously at strongly-worded hot takes like this, but I do agree with the underlying point.
@diazona But also it's not a particularly hot take. Copilot reproduces its training data verbatim on a somewhat regular basis. https://hackernoon.com/legal-issues-surrounding-copilots-use-of-training-data
@diazona In some sense it is not even a novel policy. If you were caught stealing code from another open source project, or from your employer's proprietary codebase, you'd be banned in the same way. Using an extremely slow probabilistic token generator to file off the copyright notices is not meaningfully different than using a text editor to do the same thing.
@glyph I speak as a former technical expert witness before the Copyright Board of Canada during the Internet copyright wars. My advice to clients is stay far away from this technology until litigation has settled its copyright status. You can't afford to be the test case. Or the loser of the test case.
@glyph Well true, I suppose I didn't really mean "hot take" in the sense of controversial. I was thinking more of a strongly worded opinion and couldn't come up with the right word in the moment.
@diazona Given that the whole industry is sleepwalking into an open sewer with the way that both LLM output and input are being treated both legally and ethically, a certain level of stridency is required to emphasize the seriousness of the point.