Persistent attacks on frontier models make them fail, with the patterns of failure varying by model and developer. Red teaming shows that it’s not the sophisticated, complex attacks that ...
In case you missed it, OpenAI yesterday debuted a powerful new feature for ChatGPT and, with it, a host of new security risks and ramifications. Called the "ChatGPT agent," this new feature is an ...
Last month, at the 33rd annual DEF CON, the world’s largest hacker convention, in Las Vegas, Anthropic researcher Keane Lucas took the stage. A former U.S. Air Force captain with a PhD in electrical ...