Red Teaming for GenAI Harms – Revealing the Risks and Rewards for Online Safety – Ofcom

‘As the new regulator for online safety, Ofcom is exploring how online services could employ safety measures to protect their users from harm posed by GenAI. One such safety intervention is red teaming, a type of evaluation method that seeks to find vulnerabilities in AI models. Put simply, this involves ‘attacking’ a model to see if it can generate harmful content. The red team can then seek to fix those vulnerabilities by introducing new and additional safeguards, for example, filters that can block such content. ‘

Link: https://www.ofcom.org.uk/online-safety/illegal-and-harmful-content/red-teaming-for-genai-harms/