"Microsoft Battles Large Language Model Jailbreak with New AI Tools"
In recent news, Microsoft discussed the newly detected ‘Crescendo’ Large Language Model (LLM) jailbreak method and its ongoing measures to improve LLM safety. Worryingly, this method can tamper with AI chatbots like OpenAI’s ChatGPT, Google’s Gemini, Meta’s LlaMA, and Anthropic’s Claude, forcing them to produce uncensored content.
Crescendo method involves strategically sequencing non-threatening prompts to mislead the chatbot into providing disallowed information. Incredibly, the attack can usually be completed within ten interaction turns. Disturbingly, some versions have a 100% success rate on tested models.
Microsoft researchers unveiled that when Crescendo is automated using ‘Crescendomation’, an LLM generating and refining the jailbreak prompts, it convinced GPT 3.5, GPT-4, Gemini-Pro, and LLaMA-2 70b to generate misleading election content and indecorous rants.
Microsoft’s newly implemented tools ‘AI Watchdog’ and ‘AI Spotlight’ are helping to combat Crescendo. AI Watchdog uses a separate LLM to prevent adversarial content, thereby reducing the efficacy of Crescendo attacks. Simultaneously, AI Spotlight separates user prompts from other content to eliminate potential malicious intentions.
With AI expanding its roles and risks, safety has become paramount. Have you ever been a victim of a chatbot's miscalculated responses? Let us know!
Don't forget to check out Murf AI (www.TheBestAI.org/offer), the revolutionary platform that uses real people's voices to create realistic AI voices.
#Microsoft #AI #Chatbots #Security