When someone tells me they're going to use ML for moderation, or for flagging toxic posts, I ask which model they're going to use, and what inputs the model is going to run inference on.
If the input doesn't include the relationship between the two people, and the community it is being said in, then you will inevitably get many false positives. A short text sample alone doesn't carry enough context for reliable inference.
https://hachyderm.io/@mekkaokereke/109989027419424661
3/N
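A minimal sketch of what that missing context could look like as model input, assuming a hypothetical stub classifier; every field, threshold, and function name here is illustrative, not taken from any real moderation system:

```python
# Sketch: the input most deployed toxicity models actually see (text only)
# versus the context-enriched input the argument above calls for.
# All names and values below are illustrative assumptions.

from dataclasses import dataclass
from typing import Optional


@dataclass
class ModerationInput:
    text: str                    # what text-only models get
    author_id: str               # who said it
    target_id: Optional[str]     # who it was said to, if anyone
    relationship: str            # e.g. "mutuals", "stranger", "blocked-by-target"
    community: str               # the server/community where it was posted
    prior_interactions: int      # how often these two have interacted before


def flag_without_context(text: str) -> bool:
    """Text-only scoring: banter between friends and harassment by a
    stranger look identical, so one of them is always misclassified."""
    return any(w in text.lower() for w in ("idiot", "trash", "shut up"))


def flag_with_context(item: ModerationInput) -> bool:
    """Same text signal, gated on relationship and community context,
    which is what cuts false positives on in-group speech."""
    if not flag_without_context(item.text):
        return False
    if item.relationship == "mutuals" and item.prior_interactions > 20:
        return False   # long-running friendly exchange: don't auto-flag
    return True        # stranger or hostile context: escalate to a human


if __name__ == "__main__":
    post = ModerationInput(
        text="you absolute idiot lol",
        author_id="a", target_id="b",
        relationship="mutuals", community="hachyderm.io",
        prior_interactions=150,
    )
    print(flag_without_context(post.text))  # True: text alone reads as toxic
    print(flag_with_context(post))          # False: context reads as banter
```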
@Raccoon
So no, I don't like ML for moderation.
I could like it in theory, but in practice I rarely see implementations that:
a) include enough context
b) do not amplify the very problem that the most vulnerable users already experience
4/4