
Anthropic launches $15K jailbreak bounty program for its unreleased next-gen AI

News Feed - 2024-08-10 06:08:17

Tristan Greene

The program will be open to a limited number of participants initially but will expand at a later date.

Artificial intelligence firm Anthropic announced the launch of an expanded bug bounty program on Aug. 8, with rewards as high as $15,000 for participants who can “jailbreak” the company’s unreleased, “next generation” AI model.


Anthropic’s flagship AI model, Claude-3, is a generative AI system similar to OpenAI’s ChatGPT and Google’s Gemini. As part of the company’s efforts to ensure that Claude and its other models are capable of operating safely, it conducts what’s called “red teaming.”

Red teaming


Red teaming means trying to break something on purpose. In Claude’s case, the point is to figure out all of the ways the model could be prompted, forced, or otherwise perturbed into generating unwanted outputs.


During red teaming efforts, engineers might rephrase questions or reframe a query in order to trick the AI into outputting information it’s been programmed to avoid.


For example, an AI system trained on data gathered from the internet is likely to contain personally identifiable information on numerous people. As part of its safety policy, Anthropic has put guardrails in place to prevent Claude and its other models from outputting that information.


As AI models become more robust and capable of imitating human communication, the task of trying to figure out every possible unwanted output becomes exponentially more challenging.

Bug bounty


Anthropic has implemented several novel safety interventions in its models, including its “Constitutional AI” paradigm, but it’s always nice to get fresh eyes on a long-standing issue.


According to a company blog post, its latest initiative will expand on existing bug bounty programs to focus on universal jailbreak attacks:

“These are exploits that could allow consistent bypassing of AI safety guardrails across a wide range of areas. By targeting universal jailbreaks, we aim to address some of the most significant vulnerabilities in critical, high-risk domains such as CBRN (chemical, biological, radiological, and nuclear) and cybersecurity.”


The company is only accepting a limited number of participants and encourages AI researchers with experience and those who “have demonstrated expertise in identifying jailbreaks in language models” to apply by Friday, Aug. 16.


Not everyone who applies will be selected, but the company plans to “expand this initiative more broadly in the future.”


Those who are selected will receive early access to an unreleased “next generation” AI model for red-teaming purposes.


