Fun

Google’s new Gemini AI model dominates benchmarks, beats GPT-4o and Claude-3

News Feed - 2024-08-02 06:08:03

Tristan Greene2 hours agoGoogle’s new Gemini AI model dominates benchmarks, beats GPT-4o and Claude-3This is the first time Google’s taken the top slot on the Chatbot Arena leaderboard.531 Total viewsListen to article 0:00NewsOwn this piece of crypto historyCollect this article as NFTCOINTELEGRAPH IN YOUR SOCIAL FEEDFollow ourSubscribe onThere’s a new top dog in the world of generative artificial intelligence benchmarks and its name is Gemini 1.5 Pro. 


The previous champ, OpenAI’s ChatGPT-4o, was finally surpassed on Aug. 1 when Google quietly launched an experimental release of its latest model.


Gemini’s latest update arrived without fanfare and is currently labelled as experimental. But it quickly gained the attention of the AI community across social media as reports began to trickle in that it was surpassing its rivals on benchmark scores.Artificial intelligence benchmarks


OpenAI’s ChatGPT has been the standard bearer for generative AI since the launch of GPT-3. Its latest model, GPT-4o, and its closest competitor, Anthropic’s Claude-3, have reigned supreme above most other models in most common benchmarks for the past year or so with little in the way of competition.Source:Large Model Systems Organization.


One of the most popular benchmarks is called the LMSYS Chatbot Arena. It tests models on a variety of tasks and assigns an overall competency score. GPT-4o received a score of 1,286 while Claude-3 earned a respectable 1,271.


A previous version of Gemini 1.5 Pro scored 1,261. But the experimental version (Gemini 1.5 Pro 0801) released on Aug 1 scored a whopping 1,300.


This indicates that it’s overall more capable than its competitors, but benchmarks aren’t necessarily an accurate representation of what an AI model can and can’t do.Community excitement


Without deeper comparisons available, we’re entering an era where the AI chatbot market has matured enough to offer multiple options. It’s ultimately up to end-users to determine which AI model works best for them.


Anecdotally, there’s been a wave of excitement over the latest version of Gemini with users on social media calling it “insanely good.” One Redditor went so far as to write that it “blows 4o out of the water.”


It’s unclear at this time if the experimental version of Gemini 1.5 Pro will end up being the default going forward. While it remains generally available as of the time of this article’s publication, the fact that it’s in what"s considered an early release or testing phase indicates that it’s possible the model could be rescinded or changed for safety or alignment reasons.


Related:Google announces safety, transparency advancements in AI models# Google# Technology# AI# ChatGPT# OpenAIAdd reaction

News Feed

The Size of Bitcoin’s Distributed Ledger Nears a Half Terabyte
The Size of Bitcoin’s Distributed Ledger Nears a Half Terabyte Well over a decade ago, on January 3, 2009, the size of the Bitcoin blockchain was 0.285 kilobytes (kB) or around 2
Turner Wright4 hours agoMultiple spot crypto ETF applications go to Federal Register in step toward SEC approvalPublishing the ETF applications in the official journal of the U.S. government gives the SEC up to 240 days
Tom Mitchelhill3 hours agoSui Network launches Google, Twitch and Facebook logins for DAppsSui joins the growing ranks of Web3 firms looking to onboard users who are all too often “irretrievably lost” at the doorstep
PODCAST: Caitlin Long on Bitcoin as Insurance Against Financial Collapse
“To me, it’s an insurance against instability in the mainstream financial industry,” said Caitlin Long, one of the most experienced Wall Street professionals to defect to the crypto space.
Bitcoin large sellers 'exhausted' as $67K price holds
Ciaran Lyons3 hours agoBitcoin large sellers "exhausted" as $67K price holdsBitcoin is seeing a reduction in selling pressure from large investors as its price continues to hold above $67,000.3734 Total views12 Total sha
Saudi Chemicals Producer SABIC Launches Blockchain Pilot Project
Saudi Chemicals Producer SABIC Launches Blockchain Pilot Project A Saudi Arabia-based chemical manufacturer has said its recently launched blockchain pilot is expected to uncover t
Samurai, Get Ready to Slash Yōkai Souls Again in Nioh 2 Open Beta
Critical smash-hit Nioh"s sequel has an open beta coming in November. | Source: PlayStationFresh off releasing a new trailer at the Tokyo Game Show, developer Team Ninja has announc
How Penguin Karts Will Drive The Blockchain Gaming Scene Forward
How Penguin Karts Will Drive The Blockchain Gaming Scene Forward sponsored A lot has happened since the idea of Penguin Karts was first conceived. Who would have thought that a nost
Turner Wright8 hours agoParliamentary committee calls for shutdown of Worldcoin in KenyaThe committee’s recommendations included having the Kenyan government consider implementing a comprehensive framework for digital
Texas Blockchain Council endorses Ted Cruz for US Senate
Turner Wright1 hour agoTexas Blockchain Council endorses Ted Cruz for US SenateThe junior senator from Texas has spoken at several crypto conferences about Bitcoin mining and its potential benefits for the Lone Star Stat
Dogecoin Foundation Says It’s Working With Ethereum’s Vitalik Buterin on a Staking Concept
Dogecoin Foundation Says It"s Working With Ethereum"s Vitalik Buterin on a Staking Concept According to the Dogecoin Foundation, the organization is working with Vitalik Buterin on
Amaka Nwaokocha2 hours agoCathie Wood bullish on Bitcoin and AI convergenceThe ARK Invest CEO shares her views on the intersection of Bitcoin and artificial intelligence, highlighting its economic implications.1882 Total