Cybersecurity researchers are calling attention to a new jailbreaking method called Echo Chamber that could be leveraged to trick popular large language models (LLMs) into generating undesirable responses, irrespective of the safeguards put in place.
“Unlike traditional jailbreaks that rely on adversarial phrasing or character obfuscation, Echo Chamber weaponizes indirect references, semantic
This article has been indexed from The Hacker News