Paul Cantrell

@james It’s an approach called “supervised learning”:

en.wikipedia.org/wiki/Supervis

It can be totally valid. The trick (well, the first one, and after that I’m out of my depth) is that you can’t evaluate the results against the training data, so you train the system on only X% of your tagged data, then check how well it matches the desired output for the remaining Y% it hasn’t “seen” before.
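
To make the held-out idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the toy data, the 80/20 split, and the names `holdout_split`, `train_threshold`, and `accuracy` are hypothetical and not from any particular library. The "model" is just a threshold, standing in for whatever system you would really train.

```python
import random


def holdout_split(examples, train_fraction=0.8, seed=0):
    """Shuffle labeled examples and split them into (train, test) lists."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def train_threshold(train):
    """Toy 'model': the midpoint between the mean feature value of each class."""
    pos = [x for x, label in train if label == 1]
    neg = [x for x, label in train if label == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def accuracy(threshold, examples):
    """Fraction of examples where 'feature > threshold' agrees with the tag."""
    correct = sum((x > threshold) == bool(label) for x, label in examples)
    return correct / len(examples)


# Hypothetical tagged data: (feature, label) pairs, made up for this sketch.
rng = random.Random(42)
data = [(rng.gauss(2.0, 1.0), 1) for _ in range(100)]
data += [(rng.gauss(-2.0, 1.0), 0) for _ in range(100)]

# Train on 80% of the tagged data, then score only on the 20% never "seen".
train, test = holdout_split(data, train_fraction=0.8)
model = train_threshold(train)
print(f"held-out accuracy: {accuracy(model, test):.2f}")
```

The point is only that `accuracy` is computed on `test`, which `train_threshold` never touched; scoring on `train` instead would tell you how well the system memorized, not how well it generalizes.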

4 comments
Aaron

@james @inthehands I have seen some efforts to identify, quantify, and mitigate bias in the human-generated labels, if that's what you're getting at. I would say, yes, there will *always* be bias in manually tagged data. The question is, do the biases present in that data affect the job you want the model to do? Often the only source of truth for whether a task has been performed correctly is human judgment. In those cases, we can identify secondary biases (like gender or race in hiring decisions) that we want to specifically mitigate, but what we are training the model to learn is literally a bias itself, e.g. the bias towards candidates that hiring managers think will do well in the position.


Dawn Ahukanna

@inthehands @james
Observations:
1. There are not enough (disposable) developers to churn out code, so use “1-shot-imprint statistical engine”[1SISE] to generate all the code we want, how hard can it be?
2. There are not enough (disposable) data scientists & ML engineers to supervise “imprinting”, so use the entire internet for “1SISE”. Job done, right?
3. There are not enough (disposable) “natural resources” to power the “1SISE”. Oops!
4. “1SISE” only has 1 “biased” perspective, quel surprise!

Jex

@dahukanna @inthehands @james

With all those cases, it seems like 1SISE fits all.
