Nvidia’s LLM beats DeepSeek R1
Quick, efficient, and already topping major benchmarks.

Welcome back.
Nvidia just dropped a powerful new language model that beats DeepSeek R1 at half the size.
Fully open-source and optimized for efficiency, it's built specifically for high-performance AI agents, code tasks, and multilingual chat.
Let’s break it down.
In today’s release:
1. Nvidia’s new LLM outperforms DeepSeek R1
2. Anthropic releases an Education Report
3. Amazon unveils Nova Sonic

Nvidia’s new LLM outperforms DeepSeek R1 at half the size
Nvidia has released Llama-3.1-Nemotron-Ultra-253B, a dense 253B-parameter model built on Meta’s Llama 3.1 and optimized for fast, efficient reasoning. Despite being less than half the size of DeepSeek R1, the model beats it on several key reasoning and coding benchmarks.
It’s fully open source, licensed for commercial use, and already live on Hugging Face. The release is aimed squarely at developers building advanced AI agents, RAG systems, and chat assistants that need reliable instruction-following and multi-language support.
Performance: Beats DeepSeek R1 on GPQA (76.01% vs. 71.5%) and IFEval (89.45% vs. 83.3%).
Code and logic: Tops LiveCodeBench (66.31% vs. 65.9%) while trailing slightly on AIME25 and MATH500.
Architecture: Dense model optimized via Neural Architecture Search; runs on a single 8x H100 node.
Toggle reasoning: Reasoning can be switched on or off at inference time via the system prompt, so you only pay the latency cost of step-by-step thinking when a task needs it.
The model supports 128k token sequences and includes multilingual support (e.g., English, German, Spanish). It's suitable for everything from chatbots and AI agents to code gen and RAG pipelines.
How can you use this to your advantage?
If you're building with open models but need high performance and tight reasoning, Nemotron Ultra is a cost-effective and powerful alternative to massive MoE models like DeepSeek R1. With commercial use approved and fine-tuning options open, it's ready for deployment.
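To show what the reasoning toggle looks like in practice, here's a minimal sketch of building a chat payload for the model. The control strings (`"detailed thinking on"`/`"detailed thinking off"`) follow Nvidia's published model card, but verify them against the current Hugging Face documentation before relying on them:

```python
def build_messages(user_prompt: str, reasoning: bool = True) -> list[dict]:
    """Build a chat payload for Llama-3.1-Nemotron-Ultra-253B.

    The reasoning toggle is expressed through the system prompt; the
    exact control strings below are taken from the model card and may
    change — check the Hugging Face page for the current wording.
    """
    system = "detailed thinking on" if reasoning else "detailed thinking off"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_prompt},
    ]

# Reasoning on for a hard problem, off for a quick factual lookup:
slow = build_messages("Prove that sqrt(2) is irrational.", reasoning=True)
fast = build_messages("What is the capital of Spain?", reasoning=False)
```

The same message list works with any OpenAI-compatible chat endpoint or the Hugging Face `transformers` chat template, so switching modes is a one-line change rather than a separate deployment.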

How university students really use Claude
Anthropic analyzed over one million anonymized Claude conversations tied to university email accounts and found that students aren’t just using AI for shortcuts; they’re using it as a real academic tool. The data shows high engagement in STEM, especially Computer Science, and a wide mix of use cases that reflect both learning support and potential risks to academic integrity.
Most students use Claude for tasks like creating study guides, solving technical problems, or co-writing essays and research materials. Usage patterns break down into four main types: quick answers, output generation, collaborative problem solving, and joint content creation, with each making up roughly a quarter of all interactions.
Adoption: Computer Science students made up 36.8% of usage; Business and Health students were underrepresented.
Task types: Content creation (39.3%) and technical problem-solving (33.5%) dominated.
Cognitive load: Claude often handled higher-order thinking tasks like analyzing and creating rather than basic recall.
Integrity concerns: 47% of interactions were direct queries, with some likely used for cheating (e.g., answering test questions).
The report highlights both the growing dependency on AI in academic life and the need for clearer guidelines on ethical usage. Claude’s current strengths are best suited for STEM-heavy workloads, but its role is quickly expanding across disciplines.
How can you use this to your advantage?
If you're a student, Claude can be more than just a shortcut: it's a tool to deepen your understanding. Use it to brainstorm ideas, get step-by-step guidance on tough concepts, or co-edit drafts with real-time feedback.

Amazon unveils Nova Sonic: a real-time voice model
Amazon has launched Nova Sonic, a powerful new voice AI model that brings natural, two-way speech interaction to third-party apps via Amazon Bedrock. It merges speech-to-text, language understanding, and text-to-speech into a single, low-latency system, making it ideal for customer service, education, and enterprise tools.
With performance that rivals and, in some cases, beats GPT-4o and Gemini Flash 2.0, Nova Sonic is designed to be fast, expressive, cost-effective, and easy to integrate into real-world workflows.
Performance: Outperforms GPT-4o and Gemini Flash 2.0 in U.S. and British English benchmarks.
Voice interaction: Handles real-time conversation, interruptions, and expressive delivery.
Integration: Works with APIs and tools to trigger actions like bookings or data lookups.
Speed and cost: 1.09s latency (faster than rivals) and ~80% cheaper than GPT-4o real-time.
Nova Sonic supports both masculine and feminine voices, with more languages and accents on the way. Enterprises in education, sports, and customer support are already putting it to use.
How can you use this to your advantage?
If you're building voice-driven tools, like customer-service agents, AI tutors, or voice dashboards, Nova Sonic gives you a ready-to-deploy system that's fast, fluent, and developer-friendly. Its real-time API means you can build responsive voice experiences without gluing together multiple models or paying GPT-level prices.
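The tool-triggering pattern mentioned above (bookings, data lookups) boils down to a registry of callable actions plus a dispatcher that routes the model's requests to your business logic. Here's a minimal sketch; the tool names and schema shape are illustrative, not Nova Sonic's actual API:

```python
# Hypothetical tool registry for a voice agent. Names and schemas are
# illustrative — consult the Amazon Bedrock docs for the real format.
TOOLS = {
    "book_table": {
        "description": "Reserve a restaurant table.",
        "parameters": {"name": "string", "party_size": "integer", "time": "string"},
    },
    "lookup_order": {
        "description": "Fetch order status by ID.",
        "parameters": {"order_id": "string"},
    },
}

def dispatch(tool_name: str, args: dict) -> str:
    """Route a model-requested tool call to local business logic."""
    if tool_name == "book_table":
        return (f"Booked a table for {args['party_size']} "
                f"under {args['name']} at {args['time']}.")
    if tool_name == "lookup_order":
        return f"Order {args['order_id']} is out for delivery."
    raise ValueError(f"Unknown tool: {tool_name}")

print(dispatch("book_table", {"name": "Ana", "party_size": 2, "time": "19:00"}))
```

Whatever string the dispatcher returns gets spoken back to the user by the voice model, so keeping responses short and conversational matters more here than in a text chatbot.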
OTHER AI NEWS
Amazon’s Nova Reel 1.1 can now produce up to two-minute, multi-shot videos with consistent style and a new manual mode for more creative control. The model is available through AWS but requires special access to use.
Runway has released Gen-4 Turbo, a faster and more efficient version of its video AI model that can generate 10-second videos in just 30 seconds. It’s now available on all paid plans and uses fewer credits per second than the standard Gen-4 model.
Rescale landed $115M to boost its AI tools that speed up engineering simulations from days to seconds. Backed by Bezos, Altman, and Nvidia, the startup aims to modernize product design with cloud computing and “AI physics”.
ElevenLabs now offers an MCP server that lets tools like Claude and Cursor access its full AI audio suite via text prompts. Users can generate speech, transcribe audio, design voices, and deploy AI agents for tasks like making phone calls.
Meta released Llama 4 over the weekend, but early users report mixed quality, bugs, and poor performance on key benchmarks. Meta denies training on test sets and blames rushed implementations, promising fixes ahead of its LlamaCon event on April 29.
POPULAR AI TOOLS
UX Pilot → Superfast UX/UI Design with AI.
EZsite AI → Build AI apps that ship revenue in seconds.
Metabase Embedded Analytics → Get in-product analytics quickly without waiting for sprints.
Agno → Build lightning-fast, multi-modal Reasoning Agents.
taatoo → Invisible watermarking to protect your images.
AND THAT’S A WRAP
Thank you for reading!
If you found this email useful, share it with a friend or colleague who also loves AI.
Also, drop me a follow on Twitter/X for more AI and tech updates.
I will talk to you soon!
Mike