mort

@feoh @glyph I seriously doubt that it only trains on the code already in the repo… these LLM-style networks take an absolute ton of training data.

In fact, their privacy page explicitly says that it *does not* use your code for training.

It seems exactly identical to Copilot in terms of copyright ramifications.

Stephan

@mort @feoh @glyph The way it is usually done: the model is trained on a bunch of publicly available and/or private data, and later fine-tuned on your own data to yield more relevant results for your own use case. I doubt that anyone's own code output is enough to train a good large language model.

Gerbrand van Dieyen

@durchaus @mort @feoh @glyph According to their website: "Trained exclusively on permissive open-source repositories".
It can optionally be adapted to your own code base, and they promise the code won't be exposed.

I must say it does seem useful and legit: tabnine.com/

mort

@gerbrand @durchaus @feoh @glyph As was pointed out already (mastodon.gamedev.place/@Doomed), “permissively licensed” doesn’t mean public domain. Permissive licenses still have terms, such as the requirement to include a copyright notice.
