
Nvidia Unveils New AI Model Outperforming GPT-4o
On October 15, Nvidia quietly introduced a new artificial intelligence model that it claims outperforms other leading AI systems, including GPT-4o and Claude-3. The announcement came in a post on X.com from the Nvidia AI Developer account. The new model, named Llama-3.1-Nemotron-70B-Instruct, is reportedly a leading model on lmarena.AI’s Chatbot Arena.
Introducing Nemotron
Llama-3.1-Nemotron-70B-Instruct is essentially a modified version of Meta's open-source Llama-3.1-70B-Instruct. The "Nemotron" part of the model's name represents Nvidia's contribution to the final product. The Llama "herd" of AI models, as Meta calls it, is intended to serve as an open-source foundation for developers to build upon.
Nvidia accepted the challenge and developed Nemotron, a system designed to be more "helpful" than popular models like OpenAI’s ChatGPT and Anthropic’s Claude-3. Nvidia used specially curated datasets, advanced fine-tuning methods, and its own cutting-edge AI hardware to transform Meta's basic model into what could be the most "helpful" AI model in existence.
Benchmarking AI Models
Determining the "best" AI model is not straightforward. Unlike measuring temperature with a thermometer, there isn't a single "truth" when it comes to AI model performance. Developers and researchers assess how well an AI model performs through comparative testing, similar to how humans are evaluated.
AI benchmarking involves giving different AI models the same queries, tasks, questions, or problems and then comparing the usefulness of the results. Often, due to the subjectivity of what is and isn’t considered useful, human proctors are used to determine a machine’s performance through blind evaluations.
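To make the idea concrete, here is a minimal Python sketch of how blind pairwise votes could be tallied into a simple win rate. The model names and votes are invented for illustration, and the function is a hypothetical helper; it is not the scoring method used by Chatbot Arena or Nvidia, only one basic way to compare models on the same prompts.

```python
from collections import Counter


def win_rates(votes):
    """Return, for each model, the fraction of its comparisons that it won.

    votes: iterable of (model_a, model_b, winner) tuples, where winner is
    model_a, model_b, or None when the evaluator called it a tie.
    """
    wins = Counter()
    appearances = Counter()
    for model_a, model_b, winner in votes:
        if winner not in (model_a, model_b, None):
            raise ValueError(f"winner {winner!r} was not part of this comparison")
        appearances[model_a] += 1
        appearances[model_b] += 1
        if winner is not None:
            wins[winner] += 1
    return {model: wins[model] / appearances[model] for model in appearances}


# Hypothetical blind votes: the evaluator saw two anonymous answers to the
# same prompt and picked the more useful one (or declared a tie).
votes = [
    ("model_x", "model_y", "model_x"),
    ("model_x", "model_y", "model_y"),
    ("model_x", "model_z", "model_x"),
    ("model_y", "model_z", None),
]

rates = win_rates(votes)
for model, rate in sorted(rates.items()):
    print(f"{model}: {rate:.2f}")
```

A raw win rate like this ignores how strong each opponent was, which is one reason public leaderboards tend to rely on more elaborate rating schemes.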
In the case of Nemotron, Nvidia claims that the new model outperforms existing top-tier models such as GPT-4o and Claude-3 by a significant margin.
Chatbot Arena Leaderboards
The image above shows the ratings on the automated "Hard" test on the Chatbot Arena Leaderboards. Although Nvidia's Llama-3.1-Nemotron-70B-Instruct doesn't appear to be listed on the boards, if the developer's claim that it scored an 85 on this test is valid, it would be the top model in this particular section.
What's interesting is that Llama-3.1-70B is Meta's mid-tier open-source AI model. There's a larger version of Llama-3.1, the 405B version (where the number refers to how many billion parameters the model contains). In contrast, GPT-4o is estimated to have over one trillion parameters, although OpenAI has not publicly disclosed the figure.
Bottom Line
Nvidia's new AI model, Nemotron, is making waves in the tech world for its purported superior performance over other leading AI models. As AI technology continues to evolve, it's fascinating to see how companies like Nvidia are pushing the boundaries of what's possible. What are your thoughts on this development? Do you think Nvidia's claims about Nemotron's performance are valid?