
Google’s new Gemini AI model dominates benchmarks, beats GPT-4o and Claude-3

News Feed - 2024-08-02 06:08:03

Tristan Greene

This is the first time Google’s taken the top slot on the Chatbot Arena leaderboard.

There’s a new top dog in the world of generative artificial intelligence benchmarks and its name is Gemini 1.5 Pro.


The previous champ, OpenAI’s GPT-4o, was finally surpassed on Aug. 1 when Google quietly launched an experimental release of its latest model.


Gemini’s latest update arrived without fanfare and is currently labelled as experimental. But it quickly gained the attention of the AI community across social media as reports began to trickle in that it was surpassing its rivals on benchmark scores.

Artificial intelligence benchmarks


OpenAI’s ChatGPT has been the standard bearer for generative AI since the launch of GPT-3. Its latest model, GPT-4o, and its closest competitor, Anthropic’s Claude-3, have reigned supreme above most other models in most common benchmarks for the past year or so with little in the way of competition.

Source: Large Model Systems Organization.


One of the most popular benchmarks is called the LMSYS Chatbot Arena. It tests models on a variety of tasks and assigns an overall competency score. GPT-4o received a score of 1,286 while Claude-3 earned a respectable 1,271.


A previous version of Gemini 1.5 Pro scored 1,261. But the experimental version (Gemini 1.5 Pro 0801) released on Aug. 1 scored a whopping 1,300.
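To put those gaps in perspective, here is a rough sketch of what the score differences could mean in head-to-head terms. The article does not describe how the scores are calculated; the sketch assumes they behave like standard Elo ratings (a logistic curve with a 400-point scale), which is an assumption for illustration rather than a description of the leaderboard's exact method. The function and variable names are hypothetical.

```python
# Arena scores as quoted in the article.
scores = {
    "Gemini 1.5 Pro 0801 (experimental)": 1300,
    "GPT-4o": 1286,
    "Claude-3": 1271,
    "Gemini 1.5 Pro (previous)": 1261,
}


def expected_win_rate(rating_a: float, rating_b: float) -> float:
    """Expected share of head-to-head wins for A over B.

    ASSUMPTION: scores follow the standard Elo formula
    (base 10, 400-point scale). The article does not confirm this.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


leader = "Gemini 1.5 Pro 0801 (experimental)"

# Point margin of the leader over each other model.
margins = {
    name: scores[leader] - score
    for name, score in scores.items()
    if name != leader
}

# Implied head-to-head win rate of the leader under the Elo assumption.
win_rates = {
    name: expected_win_rate(scores[leader], score)
    for name, score in scores.items()
    if name != leader
}

for name in margins:
    print(f"{leader} vs {name}: +{margins[name]} points, "
          f"~{win_rates[name]:.1%} expected win rate")
```

Under that assumption, a 14-point lead over GPT-4o works out to roughly a 52% expected win rate, so the experimental model's lead at the top is real but narrow.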


This indicates that it’s overall more capable than its competitors, but benchmarks aren’t necessarily an accurate representation of what an AI model can and can’t do.

Community excitement


Without deeper comparisons available, we’re entering an era where the AI chatbot market has matured enough to offer multiple options. It’s ultimately up to end-users to determine which AI model works best for them.


Anecdotally, there’s been a wave of excitement over the latest version of Gemini with users on social media calling it “insanely good.” One Redditor went so far as to write that it “blows 4o out of the water.”


It’s unclear at this time if the experimental version of Gemini 1.5 Pro will end up being the default going forward. While it remains generally available as of the time of this article’s publication, the fact that it’s in what’s considered an early release or testing phase indicates that it’s possible the model could be rescinded or changed for safety or alignment reasons.

