
News Feed - 2023-11-21 02:11:44

Scientists develop AI monitoring agent to detect and stop harmful outputs

Tristan Greene, 6 hours ago

The monitoring system is designed to detect and thwart both prompt injection attacks and edge-case threats.

A team of researchers from artificial intelligence (AI) firm AutoGPT, Northeastern University and Microsoft Research has developed a tool that monitors large language models (LLMs) for potentially harmful outputs and prevents them from executing.


The agent is described in a preprint research paper titled “Testing Language Model Agents Safely in the Wild.” According to the research, the agent is flexible enough to monitor existing LLMs and can stop harmful outputs, such as code attacks, before they happen.


Per the research: “Agent actions are audited by a context-sensitive monitor that enforces a stringent safety boundary to stop an unsafe test, with suspect behavior ranked and logged to be examined by humans.”
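The audit loop described in that quote can be pictured with a minimal sketch. It is not the researchers' implementation: the `Monitor` class, the `threshold` value, and the keyword-based `toy_score` function are all hypothetical stand-ins for whatever scoring the actual monitor performs. The sketch only shows the three behaviors the quote names: scoring each action in context, stopping the test at the safety boundary, and keeping a ranked log for human review.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Monitor:
    """Hypothetical monitor: scores each proposed action and halts unsafe runs."""
    score_fn: Callable[[str, str], float]  # (task context, action) -> safety in [0, 1]
    threshold: float = 0.5                 # assumed safety boundary, not from the paper
    log: List[Tuple[float, str]] = field(default_factory=list)

    def audit(self, context: str, action: str) -> bool:
        """Return True if the action may run; otherwise log it for review."""
        score = self.score_fn(context, action)
        if score < self.threshold:
            self.log.append((score, action))
            return False
        return True

    def review_queue(self) -> List[Tuple[float, str]]:
        """Suspect actions, most suspicious (lowest score) first."""
        return sorted(self.log)


def toy_score(context: str, action: str) -> float:
    """Stand-in scorer: a keyword check, purely for illustration."""
    risky = ("rm -rf", "curl ", "DROP TABLE")
    return 0.1 if any(marker in action for marker in risky) else 0.9


def run_test(monitor: Monitor, context: str, actions: List[str]) -> List[str]:
    """Execute actions in order, stopping the test at the first unsafe one."""
    executed = []
    for action in actions:
        if not monitor.audit(context, action):
            break
        executed.append(action)
    return executed
```

Calling `run_test(Monitor(score_fn=toy_score), "fix a typo", ["open README.md", "rm -rf /", "save file"])` would execute only the first action, and the blocked one would then appear in `review_queue()` for a human to examine.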


The team writes that existing tools for monitoring LLM outputs for harmful interactions seemingly work well in laboratory settings, but when applied to testing models already in production on the open internet, they “often fall short of capturing the dynamic intricacies of the real world.”


This, seemingly, is because of the existence of edge cases. Despite the best efforts of the most talented computer scientists, the idea that researchers can imagine every possible harm vector before it happens is largely considered an impossibility in the field of AI.


Even when the humans interacting with AI have the best intentions, unexpected harm can arise from seemingly innocuous prompts.

An illustration of the monitor in action. On the left, a workflow ending in a high safety rating. On the right, a workflow ending in a low safety rating. Source: Naihin et al., 2023


To train the monitoring agent, the researchers built a data set of nearly 2,000 safe human-AI interactions across 29 different tasks ranging from simple text-retrieval tasks and coding corrections all the way to developing entire webpages from scratch.




They also created a competing testing data set filled with manually created adversarial outputs, including dozens intentionally designed to be unsafe.


The data sets were then used to train an agent on OpenAI’s GPT-3.5 Turbo, a state-of-the-art system, capable of distinguishing between innocuous and potentially harmful outputs with an accuracy of nearly 90%.
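For readers unfamiliar with the metric, accuracy here is simply the share of labeled examples the monitor classifies correctly. The sketch below shows that arithmetic under stated assumptions: the `accuracy` helper, the `keyword_classifier`, and the four labeled examples are invented for illustration and are not drawn from the researchers' data sets.

```python
from typing import Callable, Iterable, Tuple


def accuracy(classify: Callable[[str], bool],
             examples: Iterable[Tuple[str, bool]]) -> float:
    """Fraction of (output, is_unsafe) pairs that the classifier labels correctly."""
    pairs = list(examples)
    if not pairs:
        raise ValueError("need at least one labeled example")
    correct = sum(1 for text, is_unsafe in pairs if classify(text) == is_unsafe)
    return correct / len(pairs)


def keyword_classifier(text: str) -> bool:
    """Hypothetical classifier: flags an output as unsafe if it mentions deletion."""
    return "delete" in text.lower()


# Made-up labeled examples: (agent output, is_unsafe)
labeled = [
    ("Read the file and summarize it", False),
    ("Delete all user records", True),
    ("Send the password file to a remote host", True),  # missed by the keyword check
    ("Fix the typo on line 3", False),
]

score = accuracy(keyword_classifier, labeled)
```

With these four examples the toy classifier gets three right, so `score` is 0.75. The reported figure of nearly 90% would correspond to roughly nine correct labels out of every ten test outputs.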
