
News Feed - 2023-11-21 02:11:44

Scientists develop AI monitoring agent to detect and stop harmful outputs

Tristan Greene, 6 hours ago

The monitoring system is designed to detect and thwart both prompt injection attacks and edge-case threats.

A team of researchers from artificial intelligence (AI) firm AutoGPT, Northeastern University and Microsoft Research has developed a tool that monitors large language models (LLMs) for potentially harmful outputs and prevents them from executing.


The agent is described in a preprint research paper titled “Testing Language Model Agents Safely in the Wild.” According to the research, the agent is flexible enough to monitor existing LLMs and can stop harmful outputs, such as code attacks, before they happen.


Per the research: “Agent actions are audited by a context-sensitive monitor that enforces a stringent safety boundary to stop an unsafe test, with suspect behavior ranked and logged to be examined by humans.”
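The audit loop described in that quote can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `Monitor`, `score_action`, `run_test` and `UNSAFE_MARKERS`, the 0.75 threshold, and the keyword-based scoring are hypothetical stand-ins, not the researchers' method, which relies on a trained language model rather than string matching.

```python
from dataclasses import dataclass, field

# Hypothetical markers standing in for a real context-sensitive scorer.
UNSAFE_MARKERS = ("rm -rf", "curl http", "drop table")


@dataclass
class AuditRecord:
    action: str
    score: float  # 0.0 = clearly unsafe, 1.0 = clearly safe


def score_action(action: str) -> float:
    """Toy scorer (assumption): each matched marker lowers the safety score."""
    lowered = action.lower()
    hits = sum(marker in lowered for marker in UNSAFE_MARKERS)
    return max(0.0, 1.0 - 0.5 * hits)


@dataclass
class Monitor:
    threshold: float = 0.75  # hypothetical safety boundary
    log: list = field(default_factory=list)

    def audit(self, action: str) -> bool:
        """Score and log an action; return True if it may proceed."""
        score = score_action(action)
        self.log.append(AuditRecord(action, score))
        return score >= self.threshold

    def suspects(self) -> list:
        """Flagged actions, most suspicious first, for human review."""
        flagged = [r for r in self.log if r.score < self.threshold]
        return sorted(flagged, key=lambda r: r.score)


def run_test(actions, monitor):
    """Execute actions in order, stopping the test at the first unsafe one."""
    executed = []
    for action in actions:
        if not monitor.audit(action):
            break  # the monitor halts the test before the action runs
        executed.append(action)
    return executed
```

In this sketch, calling `run_test` with a list of proposed actions returns only those that ran before the monitor intervened, while `monitor.suspects()` yields the ranked log a human reviewer would examine.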


The team writes that existing tools for monitoring LLM outputs for harmful interactions seemingly work well in laboratory settings, but when applied to testing models already in production on the open internet, they “often fall short of capturing the dynamic intricacies of the real world.”


This, seemingly, is because of the existence of edge cases. Despite the best efforts of the most talented computer scientists, the idea that researchers can imagine every possible harm vector before it happens is largely considered an impossibility in the field of AI.


Even when the humans interacting with AI have the best intentions, unexpected harm can arise from seemingly innocuous prompts.

An illustration of the monitor in action. On the left, a workflow ending in a high safety rating. On the right, a workflow ending in a low safety rating. Source: Naihin et al., 2023


To train the monitoring agent, the researchers built a data set of nearly 2,000 safe human-AI interactions across 29 different tasks ranging from simple text-retrieval tasks and coding corrections all the way to developing entire webpages from scratch.




They also created a competing testing data set filled with manually created adversarial outputs, including dozens intentionally designed to be unsafe.


The data sets were then used to train an agent on OpenAI’s GPT 3.5 turbo, a state-of-the-art system. The resulting agent can distinguish between innocuous and potentially harmful outputs with an accuracy of nearly 90%.

News Feed

Sam Bankman-Fried’s lawyers push for temporary release, object to prosecutors’ proposed deal
Turner Wright, 11 hours ago: The U.S. Justice Department has produced “millions of pages of documents” related to SBF’s cr
Fed Chair Jerome Powell Could ‘Slow Crypto Down’ in His Second Term, Warns Billionaire Mike Novogratz
Billionaire investor Mike Novogratz has cautioned that Federal Reserve Chairma
TCG World partners with Chooky Records to bring Busta Rhymes, others to the metaverse
Tristan Greene, 4 hours ago: The legendary hip hop artist who once rapped on A Tribe Called Quest’s seminal hit “Scenario” is now in
XRP Leading A $400 Trillion Revolution? How Ripple’s Tokenization Campaign Is Sparking Utility
Vitalik Buterin Has Suggestions for Dogecoin and Doge’s Cooperation With Ethereum
Ethereum co-founder Vitalik Buterin has some suggestions of what he personally would like to see h
Bitcoin On The Brink As Trump Tariffs Shatter Bond Market
USDC Stablecoin Depegging Causes Concern Among Crypto Advocates, 5 Other Stablecoins Slip Below Parity
On Saturday, March 11, 2023, crypto advocates are concerned as a few stableco
BlackRock, Grayscale have to wait for SEC’s spot ETH ETF decisions
Derek Andersen, 2 hours ago: Amendments are slowing down the SEC in both cases, although observers say the agency is not really engaged with them.
Bitcoin Enters Oversold Levels, Analyst Warns This Is Bearish, Not Bullish
Report: Turkey’s September Gold Imports up by 1,700% as Individuals Swap Falling Lira With the Precious Metal
Turkey’s gold imports of just over 39,000 kilograms in September
Largest NFT Market by Volume Opensea Launches Smartphone Application
On September 17, the largest non-fungible token (NFT) marketplace in terms of volume, Opensea, announced the la
Turkey aims to shed FATF gray list status with new crypto regulations
Amaka Nwaokocha, 8 minutes ago: Finance Minister Simsek reportedly stated that the sole outstanding technical compliance matter is crypto assets.