Simon Willison

Blogged a few thoughts on the OSI's latest draft of a definition for "Open Source AI", which notably doesn't require that the training data itself be released under an open source license: simonwillison.net/2024/Aug/27/

5 comments
Jan Lehnardt :couchdb:

@simon pragmatism or dangerous precedent, we’ll find out :)

Loren Kohnfelder

@simon Even if the training data cannot be shared it can be named or described: for "open" to have any meaning I'd like to see a declaration, even if it's 100% "dark" training data.

Simon Willison

@lmk OSI call that “data information” - they call for: “Sufficiently detailed information about the data used to train the system, so that a skilled person can recreate a substantially equivalent system using the same or similar data.”

Loren Kohnfelder

@simon Thanks for those details, that definition is more specific than I imagined. Curious if "can" implies actual access to the data. To me, saying "you could recreate it if you had access to the data which you don't" would be against the "open" spirit.

Simon Willison

@lmk sadly I am pretty sure that is what they mean - the version where you have no guaranteed access to the training data at all, other than an optimistic hope that you could assemble it yourself given enough of a description
