
News Feed - 2023-10-25 04:10:00

Humans and AI often prefer sycophantic chatbot answers to the truth: Study

Tristan Greene, 5 hours ago

The team at Anthropic AI found that five “state-of-the-art” language models exhibit sycophancy, indicating the problem could be ubiquitous.

Artificial intelligence (AI) large language models (LLMs) built on one of the most common learning paradigms have a tendency to tell people what they want to hear instead of generating outputs containing the truth, according to a study from Anthropic.


In one of the first studies to delve this deeply into the psychology of LLMs, researchers at Anthropic have determined that both humans and AI prefer so-called sycophantic responses over truthful outputs at least some of the time.


Per the team’s research paper: “Specifically, we demonstrate that these AI assistants frequently wrongly admit mistakes when questioned by the user, give predictably biased feedback, and mimic errors made by the user. The consistency of these empirical findings suggests sycophancy may indeed be a property of the way RLHF models are trained.”


In essence, the paper indicates that even the most robust AI models are somewhat wishy-washy. During the team’s research, time and again, they were able to subtly influence AI outputs by wording prompts with language that seeded sycophancy.

“When presented with responses to misconceptions, we found humans prefer untruthful sycophantic responses to truthful ones a non-negligible fraction of the time. We found similar behavior in preference models, which predict human judgments and are used to train AI assistants.” pic.twitter.com/fdFhidmVLh
Anthropic (@AnthropicAI), October 23, 2023


In the above example, taken from a post on X (formerly Twitter), a leading prompt indicates that the user (incorrectly) believes that the sun is yellow when viewed from space. Perhaps due to the way the prompt was worded, the AI hallucinates an untrue answer in what appears to be a clear case of sycophancy.


Another example from the paper, shown in the image below, demonstrates that a user disagreeing with an output from the AI can cause immediate sycophancy as the model changes its correct answer to an incorrect one with minimal prompting.

Examples of sycophantic answers in response to human feedback. Source: Sharma et al., 2023.
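One simple way to put a number on this kind of behavior is to count how often a model abandons a correct answer once the user pushes back. The Python sketch below is only an illustration of that idea and is not the method used in the Anthropic paper; the `Trial` record, the `flip_rate` function and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One hypothetical question, scored before and after user pushback."""
    correct_before: bool  # was the model's first answer correct?
    correct_after: bool   # was the answer still correct after the user disagreed?


def flip_rate(trials):
    """Fraction of initially correct answers that became incorrect after pushback."""
    initially_correct = [t for t in trials if t.correct_before]
    if not initially_correct:
        return 0.0
    flipped = sum(1 for t in initially_correct if not t.correct_after)
    return flipped / len(initially_correct)


# Made-up records for illustration only; these are not results from the paper.
example_trials = [
    Trial(True, True),
    Trial(True, False),
    Trial(True, False),
    Trial(False, False),
    Trial(True, True),
]

print(f"flip rate: {flip_rate(example_trials):.2f}")
```

A higher flip rate would mean the model gives up correct answers more readily under disagreement. A real evaluation would also need a reliable way to judge whether each answer is correct, which this sketch assumes is already done.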


Ultimately, the Anthropic team concluded that the problem may be due to the way LLMs are trained. Because they use data sets full of information of varying accuracy, e.g., social media and internet forum posts, alignment often comes through a technique called “reinforcement learning from human feedback” (RLHF).


In the RLHF paradigm, humans interact with models in order to tune their preferences. This is useful, for example, when dialing in how a machine responds to prompts that could elicit potentially harmful outputs such as personally identifiable information or dangerous misinformation.


Unfortunately, as Anthropic’s research empirically shows, both humans and AI models built for the purpose of tuning user preferences tend to prefer sycophantic answers over truthful ones, at least a “non-negligible” fraction of the time.


Currently, there doesn’t appear to be an antidote for this problem. Anthropic suggested that this work should motivate “the development of training methods that go beyond using unaided, non-expert human ratings.” 


This poses an open challenge for the AI community as some of the largest models, including OpenAI’s ChatGPT, have been developed by employing large groups of non-expert human workers to provide RLHF.
