
AI now joins the ranks of those who don't quite get poetry. Research from Italy's Icaro Lab shows that poetry can be used to jailbreak AI and bypass its safety guardrails. For the study, researchers wrote 20 prompts built around poetic vignettes in Italian and English, each ending with a request to produce harmful content. They then tested the prompts on 25 large language models from companies including Google, OpenAI, and Meta. The poetic prompts frequently succeeded.
“Poetic framing reached an average jailbreak success rate of 62% for custom-crafted poems and approximately 43% for meta-prompt conversions, surpassing non-poetic benchmarks and uncovering a systematic vulnerability across different model families and safety training methodologies,” the study indicates. “These results demonstrate that mere stylistic variation can sidestep existing safety mechanisms, highlighting fundamental deficiencies in alignment methods and evaluation standards.”
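For the curious, a jailbreak success rate like the 62% figure boils down to averaging binary outcomes: each attempt is judged as harmful output or a refusal, then tallied per model. The sketch below is a minimal illustration of that arithmetic only; the record format, model names, and the pre-labeled `harmful` flag are assumptions for demonstration, not the study's actual evaluation pipeline.

```python
from collections import defaultdict

# Hypothetical records: each attempt pairs a model with a judge's verdict
# on whether the poetic prompt elicited harmful output (True = jailbreak).
attempts = [
    {"model": "model-a", "harmful": True},
    {"model": "model-a", "harmful": False},
    {"model": "model-b", "harmful": False},
    {"model": "model-b", "harmful": False},
]

def success_rates(records):
    """Return per-model jailbreak success rate: successes / total attempts."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for r in records:
        totals[r["model"]] += 1
        successes[r["model"]] += r["harmful"]
    return {model: successes[model] / totals[model] for model in totals}

rates = success_rates(attempts)
print(rates)                              # {'model-a': 0.5, 'model-b': 0.0}
print(sum(rates.values()) / len(rates))   # average rate across models: 0.25
```

In practice, the hard part isn't the averaging but the judging: deciding whether a model's response actually counts as harmful, which the study handles with its own evaluation criteria.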
Jailbreaking success varied across LLMs. OpenAI's GPT-5 nano never produced harmful content, while Google's Gemini 2.5 Pro did so consistently. The researchers concluded that these results expose a significant gap in benchmark safety evaluations and in regulatory efforts like the EU AI Act.
“Our findings indicate that even a minor stylistic alteration can significantly decrease refusal rates, suggesting that benchmark-only evidence may exaggerate real-world resilience,” the paper noted.
Great poetry transcends the literal, while LLMs are literal to a fault. The study calls to mind Leonard Cohen's "Alexandra Leaving," which was inspired by C.P. Cavafy's poem "The God Abandons Antony." On its surface, the song is about loss and heartbreak, but a strictly literal reading misses the point entirely, which is exactly what LLMs tend to do.
Disclosure: Ziff Davis, Mashable's parent company, filed a lawsuit against OpenAI in April, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.