@mekkaokereke
Thanks for that long response. You brought up a good concern that I wasn't thinking about, though I've read about it in past work in the field.
Obviously, any implementation we build needs to keep this in mind. I will note that current systems which flag slurs already end up flagging posts by black and queer people using the N word and the F word respectively, which is not the kind of thing we're looking to catch with this. I'm well aware of the issue of these AI systems seeing different styles of communication and deciding differently based on that.
That said, this feels like a stronger argument against letting it run unsupervised than against using it as a flagging system in general: if we all know it's an automated system, that it's fallible, and that its point is to surface things for us to look over, not to tell us they're bad, then in theory we should still be providing fair moderation.
(Continued)
@Raccoon@techhub.social @mekkaokereke@hachyderm.io
In practice, requiring human oversight of automated decision making doesn't correct for bias or errors -- people tend to defer to the automated system. Ben Green's excellent paper on this focuses on government use of automated systems, but the dynamic applies more generally. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3921216
And sure, as you point out, mistakes are made today by human moderators... but those mistakes contaminate any training set. And algorithms typically magnify biases in the underlying data.