In testing, the technique helped Claude block 95% of jailbreak attempts. But the process still needs more ‘real-world’ red-teaming.
This article has been indexed from Latest stories for ZDNET in Security