mekka okereke :verified:

@Raccoon

This is the kind of thing that sounds like a good idea to people that don't talk to enough Black people in tech. 🤷🏿‍♂️

The paradox of almost every ML based moderation system in existence:

* Black women receive the most abuse online
* ML systems disproportionately false positive statements by Black women, and disproportionately false negative abuse against Black women

Similarly, facial recognition systems, most used against Black folk, get the most false positives on Black folk. 🤷🏿‍♂️

1/N

mekka okereke :verified:

@Raccoon

I posted this after the Perspective toxicity API was first released.

Other gems from the initial launch:

"Police don't kill too many Black kids."
Score: Not toxic. ๐Ÿคฆ๐Ÿฟโ€โ™‚๏ธ

"Police kill too many Black kids.
Score: 80.28% toxic. ๐Ÿ˜ฎ

"I'll never vote for Bernie Sanders until he apologizes to Black women."
Score: 71.43% toxic. ๐Ÿคฆ๐Ÿฟโ€โ™‚๏ธ

"South Carolina voters are low information people."
Score: Not toxic ๐Ÿ˜ฎ

"Elizabeth Warren is a snake."
Score: Not toxic ๐Ÿ˜ฎ
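
A minimal sketch of the kind of call that produces scores like these, assuming the publicly documented commentanalyzer endpoint; the API key and exact response fields here are placeholders worth verifying against current docs:

```python
# Sketch only: score a few sentences with a Perspective-style toxicity
# endpoint. Endpoint URL and field names follow the publicly documented
# commentanalyzer API, but treat them as assumptions to verify.
import requests

API_KEY = "YOUR_API_KEY"  # placeholder
URL = ("https://commentanalyzer.googleapis.com/v1alpha1/"
       f"comments:analyze?key={API_KEY}")

def toxicity_score(text: str) -> float:
    """Return the model's toxicity probability (0.0-1.0) for a bare sentence."""
    body = {
        "comment": {"text": text},
        "languages": ["en"],
        "requestedAttributes": {"TOXICITY": {}},
    }
    resp = requests.post(URL, json=body, timeout=10)
    resp.raise_for_status()
    return resp.json()["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

for sentence in [
    "Police don't kill too many Black kids.",
    "Police kill too many Black kids.",
]:
    print(f"{toxicity_score(sentence):.2%}  {sentence}")
```

Note that the only input the model sees is the sentence itself: nothing about who is speaking, to whom, or where.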

2/N

mekka okereke :verified:

@Raccoon

When someone tells me they're going to use ML for moderation, or for flagging toxic posts, I ask which model they're going to use, and what info the model is going to do inference on.

If the input doesn't include the relationship between the two people, and the community that it is being said in, then it is impossible not to get many false positives. There is not enough context to do reliable inference based on just a short text sample.
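
To make that concrete, here's a hypothetical sketch (every field name below is invented for illustration) of the inputs a context-aware model would need, next to the bare text most deployed systems actually get:

```python
# Illustration only: the context the argument above says is missing.
# All names are hypothetical, not any real system's schema.
from dataclasses import dataclass
from typing import Optional

@dataclass
class ModerationInput:
    text: str                   # the post itself (often all the model gets)
    author_id: str
    target_id: Optional[str]    # who the post replies to / is about, if anyone
    relationship: str           # e.g. "mutuals", "stranger", "previously blocked"
    community: str              # the server/community whose norms apply
    prior_interactions: int     # history between the two accounts

def has_enough_context(sample: ModerationInput) -> bool:
    """Bare text with no known target or relationship is exactly the case
    the post above argues can't be classified reliably."""
    return sample.target_id is not None and sample.relationship != "unknown"
```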

hachyderm.io/@mekkaokereke/109

3/N

mekka okereke :verified:

@Raccoon

So no, I don't like ML for moderation.

I could like it in theory, but in practice I rarely see implementations that:
a) include enough context
b) do not amplify the very problem that is experienced by the most vulnerable users

4/4

Raccoon at TechHub for Harris

@mekkaokereke
Thanks for that long response; you really brought up a good concern that I wasn't thinking about, but that I've read about from past work in the field.

Obviously, any implementation we have needs to keep this in mind. I will note that current systems which flag slurs already end up flagging posts by Black and queer people using the N word and the F word respectively, which is not the kind of thing we are looking to catch with this. I'm well aware of the issue of these AI systems seeing different styles of communication and deciding differently based on that.

That said, this feels like a stronger argument against letting it run unsupervised than against using it as a flagging system in general: if we all know it's an automated system, that it's fallible, and that its point is to make sure we see things so we can look them over, not to tell us they're bad, then in theory we should be providing fair moderation.
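
A rough sketch of that flag-for-review idea (the threshold and names are made up; the point is that the model's output only ever queues a post for a person to look at, it never actions anything):

```python
# Hypothetical sketch of "flag for human review, never decide".
# score_toxicity stands in for whatever model gets plugged in.
from dataclasses import dataclass
from typing import Callable, Optional

REVIEW_THRESHOLD = 0.7  # illustrative cutoff, not a recommendation

@dataclass
class ReviewItem:
    post_id: str
    text: str
    score: float
    note: str = "Automated flag only; a human moderator makes the call."

def triage(post_id: str, text: str,
           score_toxicity: Callable[[str], float]) -> Optional[ReviewItem]:
    """Queue a post for review if the model scores it highly.
    The score never hides, deletes, or labels the post on its own."""
    score = score_toxicity(text)
    if score < REVIEW_THRESHOLD:
        return None  # nothing queued, nothing actioned
    return ReviewItem(post_id=post_id, text=text, score=score)
```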

Jon

In practice, requiring human oversight of automated decision making doesn't correct for bias or errors -- people tend to defer to the automated system. Ben Green's excellent paper on this focuses on government use of automated systems, but the dynamic applies more generally. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3921216

First, evidence suggests that people are unable to perform the desired oversight functions. Second, as a result of the first flaw, human oversight policies legitimize government uses of faulty and controversial algorithms without addressing the fundamental issues with these tools.
And sure, as you point out, mistakes are made today by human moderators ... but those mistakes contaminate any training set. And algorithms typically magnify biases in the underlying data.

Raccoon at TechHub for Harris

@jdp23 @mekkaokereke
Oh no, if I at any point suggested that an AI can be a better moderator than a human, then I wrote it poorly. No machine should ever be responsible for a management decision, because a machine can't be held accountable.

Humans are definitely the better choice for moderation decisions.

This is a good point about the oversight problem though: with a system that just flags certain words or combinations thereof, it's easy for people to understand, internally, that these posts might not be bad. With a system that's doing some complicated thing that we don't understand beneath the surface, it's going to be a bit harder to make that connection.

And once again, this is a case of the system not really justifying itself: how much will it actually catch that isn't caught by simpler systems, and does that outweigh the real potential for poor oversight of a system with bad biases?

Jon

Agreed that simpler tools, whose limits are easier for people to understand, might be less prone to the oversight problems. I talked once with an r/AskHistorians moderator about how tools fit into their intersectional moderation approach, and they told me that they used some very simple pattern-matching tools to improve efficiency ... stuff like that can be quite useful, if everybody understands the limitations and there are processes to make sure there isn't too much reliance on the tools.
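
For illustration, that style of tool can be as small as a list of regexes the whole mod team can read and audit (the patterns below are invented examples, not the actual r/AskHistorians setup):

```python
# Invented example of a "very simple pattern-matching" moderation helper.
import re

# Every rule is a plain regex a moderator can read, audit, and edit.
RULES = {
    "link-spam": re.compile(r"https?://\S+\.(ru|xyz)\b", re.IGNORECASE),
    "all-caps-rant": re.compile(r"^[^a-z]{80,}$"),
}

def flags_for(text: str) -> list[str]:
    """Return the names of every rule the text matches (may be empty)."""
    return [name for name, pattern in RULES.items() if pattern.search(text)]

# A match explains itself: the moderator sees exactly which rule fired.
print(flags_for("BUY FOLLOWERS NOW http://cheap-followers.xyz/deal"))
```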

But that's a strong argument against *AI-based* systems!

Of course, a different way to look at it is that there's an opportunity to start from scratch: build a good training set, and algorithms on top of it, that focus on explainability and on being a tool to help moderators (rather than a magic bullet). There are some great AI researchers and content moderation experts here who really do understand the issues and limitations of today's systems. But it's a research project, not something that's deployable today.

Jon

Also, related to your question of how much AI-based moderation would actually help, there's an important point in the "Moderation: Key Observations" section of the Governance on Fediverse Microblogging Servers report that @darius@friend.camp and @kissane@mas.to just published:

A lot of Fediverse moderation work is relatively trivial for experienced server teams. This includes dealing with spam, obvious rulebreaking (trolls, hate servers), and reports that arenโ€™t by or about people actually on a given server. For some kinds of servers and for certain higher-profile or high-intensity members on other kinds of servers, moderators also receive a high volume of reports about member behaviors (like nudity or frank discussion of heated topics) that their server either explicitly or implicitly allows, and which the moderators therefore close without actioning.

These kinds of reports are the cleanest targets for tooling upgrades and shared/coalitional moderation, but itโ€™s also worth noting that except in special circumstances (like a spam wave or a sudden reduction in available moderators), this is not usually the part of moderation work that produces intense stress for the teams we interviewed. (This is one of the findings that we believe does not necessarily generalize across other small and medium-sized servers.)

