@feoh @glyph I seriously doubt that it only trains on the code already in the repo… these LLM style networks take an absolute ton of training data.
In fact, their privacy page explicitly says that it *does not* use your code for training.
It seems identical to Copilot in terms of copyright ramifications.
@mort @feoh @glyph The way it's usually done is: the model is trained on a bunch of publicly available and/or private data, then fine-tuned on your own data to yield more relevant results for your use case. I doubt anyone's own code output is enough to train a good large language model from scratch.