Why AI Safety Fails in Multi-Turn Conversations — And What This Means for Governance

In this article, I analyse, from a governance perspective, SafeDialBench, a fine-grained benchmark from Chinese researchers for evaluating LLMs in multi-turn dialogues. The paper was recently discussed in the AI governance reading by BlueDot. We all use LLM chat boxes for almost everything these days, but we just don’t use them the way we used […]
