Anthropic Prompt Library

Anthropic unveils new framework to block harmful content from AI models

Detecting and blocking jailbreak tactics has long been challenging, making this advancement particularly valuable for ...

ExtremeTech8d

Anthropic: We Dare You to Break Our New AI Chatbot

Anthropic has been running a bug bounty of $15,000 since late last year, tasking hackers and prompt engineers to try to jailbreak the latest version of Claude, and despite all their efforts ...

BGR8d

Anthropic dares you to try to jailbreak Claude AI

These prompts were then used to train the AI to recognize when a prompt was harmful and when it was not. After the first successful experiment, Anthropic used Claude to create a more resource ...

MIT Technology Review8d

Anthropic has a new way to protect large language models against jailbreaks

AI firm Anthropic has developed a new line of defense against a common kind of attack called a jailbreak. A jailbreak tricks ...

Security9d

Anthropic has a new way to protect large language models against jailbreaks

Anthropic has developed a barrier that stops ... while others play with the formatting of a prompt, such as using nonstandard capitalization or replacing certain letters with numbers.

Some results have been hidden because they may be inaccessible to you

Show inaccessible results