
Anthropic launches $15K jailbreak bounty program for its unreleased next-gen AI

News Feed - 2024-08-10 06:08:17

Tristan Greene

The program will be open to a limited number of participants initially but will expand at a later date.

Artificial intelligence firm Anthropic announced the launch of an expanded bug bounty program on Aug. 8, with rewards as high as $15,000 for participants who can “jailbreak” the company’s unreleased, “next generation” AI model.


Anthropic’s flagship AI model, Claude-3, is a generative AI system similar to OpenAI’s ChatGPT and Google’s Gemini. As part of the company’s efforts to ensure that Claude and its other models are capable of operating safely, it conducts what’s called “red teaming.”

Red teaming


Red teaming is basically just trying to break something on purpose. In Claude’s case, the point of red teaming is to figure out all of the ways that it could be prompted, forced, or otherwise perturbed into generating unwanted outputs.


During red teaming efforts, engineers might rephrase questions or reframe a query in order to trick the AI into outputting information it’s been programmed to avoid.


For example, an AI system trained on data gathered from the internet is likely to contain personally identifiable information on numerous people. As part of its safety policy, Anthropic has put guardrails in place to prevent Claude and its other models from outputting that information.


As AI models become more robust and capable of imitating human communication, the task of trying to figure out every possible unwanted output becomes exponentially more challenging.

Bug bounty


Anthropic has implemented several novel safety interventions in its models, including its “Constitutional AI” paradigm, but it’s always nice to get fresh eyes on a long-standing issue.


According to a company blog post, its latest initiative will expand on existing bug bounty programs to focus on universal jailbreak attacks:

“These are exploits that could allow consistent bypassing of AI safety guardrails across a wide range of areas. By targeting universal jailbreaks, we aim to address some of the most significant vulnerabilities in critical, high-risk domains such as CBRN (chemical, biological, radiological, and nuclear) and cybersecurity.”


The company is only accepting a limited number of participants and encourages AI researchers with experience and those who “have demonstrated expertise in identifying jailbreaks in language models” to apply by Friday, Aug. 16.


Not everyone who applies will be selected, but the company plans to “expand this initiative more broadly in the future.”


Those who are selected will receive early access to an unreleased “next generation” AI model for red-teaming purposes.


