
Google’s new Gemini AI model dominates benchmarks, beats GPT-4o and Claude-3

News Feed - 2024-08-02 06:08:03

Tristan Greene

This is the first time Google’s taken the top slot on the Chatbot Arena leaderboard.

There’s a new top dog in the world of generative artificial intelligence benchmarks, and its name is Gemini 1.5 Pro.


The previous champ, OpenAI’s GPT-4o, was finally surpassed on Aug. 1 when Google quietly launched an experimental release of its latest model.


Gemini’s latest update arrived without fanfare and is currently labelled as experimental. But it quickly gained the attention of the AI community across social media as reports began to trickle in that it was surpassing its rivals on benchmark scores.

Artificial intelligence benchmarks


OpenAI’s ChatGPT has been the standard bearer for generative AI since the launch of GPT-3. Its latest model, GPT-4o, and its closest competitor, Anthropic’s Claude-3, have reigned supreme above most other models in most common benchmarks for the past year or so with little in the way of competition.

Source: Large Model Systems Organization.


One of the most popular benchmarks is called the LMSYS Chatbot Arena. It tests models on a variety of tasks and assigns an overall competency score. GPT-4o received a score of 1,286 while Claude-3 earned a respectable 1,271.


A previous version of Gemini 1.5 Pro scored 1,261. But the experimental version (Gemini 1.5 Pro 0801) released on Aug. 1 scored a whopping 1,300.
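To put those score gaps in perspective, here is a minimal sketch that converts the differences into head-to-head preference rates. It assumes the Arena scores behave like standard Elo ratings (base 10, scale 400), which the article does not state. The function name `expected_win_rate` is hypothetical and used only for illustration.

```python
# Sketch only: assumes Arena scores follow a standard Elo-style scale
# (base 10, scale 400). The article itself does not state this.

def expected_win_rate(score_a: float, score_b: float, scale: float = 400.0) -> float:
    """Elo-style probability that model A is preferred over model B."""
    return 1.0 / (1.0 + 10.0 ** ((score_b - score_a) / scale))


# Scores as reported in the article.
scores = {
    "Gemini 1.5 Pro 0801": 1300,
    "GPT-4o": 1286,
    "Claude-3": 1271,
    "Gemini 1.5 Pro (previous)": 1261,
}

leader = "Gemini 1.5 Pro 0801"
for name, score in scores.items():
    if name != leader:
        rate = expected_win_rate(scores[leader], score)
        print(f"{leader} vs {name}: {rate:.1%}")
```

Under that assumption, a 14-point lead over GPT-4o works out to a preference rate of about 52% in a direct matchup, so a first-place score can still reflect a narrow margin.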


The 1,300 score indicates that the experimental model is more capable overall than its competitors, but benchmarks aren’t necessarily an accurate representation of what an AI model can and can’t do.

Community excitement


Deeper comparisons aren’t yet available, but the AI chatbot market has matured enough to offer multiple options. It’s ultimately up to end users to determine which AI model works best for them.


Anecdotally, there’s been a wave of excitement over the latest version of Gemini with users on social media calling it “insanely good.” One Redditor went so far as to write that it “blows 4o out of the water.”


It’s unclear at this time if the experimental version of Gemini 1.5 Pro will end up being the default going forward. While it remains generally available as of the time of this article’s publication, the fact that it’s in what’s considered an early release or testing phase indicates that the model could be rescinded or changed for safety or alignment reasons.


