
Google’s new Gemini AI model dominates benchmarks, beats GPT-4o and Claude-3

News Feed - 2024-08-02 06:08:03

By Tristan Greene

This is the first time Google’s taken the top slot on the Chatbot Arena leaderboard.

There’s a new top dog in the world of generative artificial intelligence benchmarks and its name is Gemini 1.5 Pro.


The previous champ, OpenAI’s GPT-4o, was finally surpassed on Aug. 1 when Google quietly launched an experimental release of its latest model.


Gemini’s latest update arrived without fanfare and is currently labelled as experimental. But it quickly gained the attention of the AI community across social media as reports began to trickle in that it was surpassing its rivals on benchmark scores.

Artificial intelligence benchmarks


OpenAI’s ChatGPT has been the standard bearer for generative AI since the launch of GPT-3. Its latest model, GPT-4o, and its closest competitor, Anthropic’s Claude-3, have reigned supreme above most other models in most common benchmarks for the past year or so with little in the way of competition.

Source: Large Model Systems Organization.


One of the most popular benchmarks is called the LMSYS Chatbot Arena. It tests models on a variety of tasks and assigns an overall competency score. GPT-4o received a score of 1,286 while Claude-3 earned a respectable 1,271.


A previous version of Gemini 1.5 Pro scored 1,261. But the experimental version (Gemini 1.5 Pro 0801) released on Aug. 1 scored a whopping 1,300.


This indicates that it’s overall more capable than its competitors, but benchmarks aren’t necessarily an accurate representation of what an AI model can and can’t do.

Community excitement


Without deeper comparisons available, we’re entering an era where the AI chatbot market has matured enough to offer multiple options. It’s ultimately up to end-users to determine which AI model works best for them.


Anecdotally, there’s been a wave of excitement over the latest version of Gemini with users on social media calling it “insanely good.” One Redditor went so far as to write that it “blows 4o out of the water.”


It’s unclear at this time if the experimental version of Gemini 1.5 Pro will end up being the default going forward. While it remains generally available as of the time of this article’s publication, the fact that it’s in what’s considered an early release or testing phase indicates that it’s possible the model could be rescinded or changed for safety or alignment reasons.


Related: Google announces safety, transparency advancements in AI models
