Nature: AI generates covertly racist decisions about people based on their dialect (www.nature.com)

posted 4 months ago

antifuchs@awful.systems

techtakes@awful.systems

6 comments

Got the pointer to this from Allison Parrish, who says it better than I could:

it’s a very compelling paper, with a super clever methodology, and (i’m paraphrasing/extrapolating) shows that “alignment” strategies like RLHF only work to ensure that it never seems like a white person is saying something overtly racist, rather than addressing the actual prejudice baked into the model.
