
Anthropic launches $15K jailbreak bounty program for its unreleased next-gen AI

News Feed - 2024-08-10 06:08:17

Tristan Greene, 2 hours ago

The program will be open to a limited number of participants initially but will expand at a later date.

Artificial intelligence firm Anthropic announced the launch of an expanded bug bounty program on Aug. 8, with rewards as high as $15,000 for participants who can “jailbreak” the company’s unreleased, “next generation” AI model.


Anthropic’s flagship AI model, Claude-3, is a generative AI system similar to OpenAI’s ChatGPT and Google’s Gemini. As part of the company’s efforts to ensure that Claude and its other models are capable of operating safely, it conducts what’s called “red teaming.”

Red teaming


Red teaming is basically just trying to break something on purpose. In Claude’s case, the point of red teaming is to try and figure out all of the ways that it could be prompted, forced, or otherwise perturbed into generating unwanted outputs.


During red teaming efforts, engineers might rephrase questions or reframe a query in order to trick the AI into outputting information it’s been programmed to avoid.


For example, an AI system trained on data gathered from the internet is likely to contain personally identifiable information on numerous people. As part of its safety policy, Anthropic has put guardrails in place to prevent Claude and its other models from outputting that information.


As AI models become more robust and capable of imitating human communication, the task of trying to figure out every possible unwanted output becomes exponentially more challenging.

Bug bounty


Anthropic has implemented several novel safety interventions in its models, including its “Constitutional AI” paradigm, but it’s always nice to get fresh eyes on a long-standing issue.


According to a company blog post, its latest initiative will expand on existing bug bounty programs to focus on universal jailbreak attacks: “These are exploits that could allow consistent bypassing of AI safety guardrails across a wide range of areas. By targeting universal jailbreaks, we aim to address some of the most significant vulnerabilities in critical, high-risk domains such as CBRN (chemical, biological, radiological, and nuclear) and cybersecurity.”


The company is only accepting a limited number of participants and encourages AI researchers with experience and those who “have demonstrated expertise in identifying jailbreaks in language models” to apply by Friday, Aug. 16.


Not everyone who applies will be selected, but the company plans to “expand this initiative more broadly in the future.”


Those who are selected will receive early access to an unreleased “next generation” AI model for red-teaming purposes.


Related: Tech firms pen letter to EU requesting more time to comply with AI Act
