NVIDIA and Mistral AI Join Forces! The 12-Billion-Parameter Small Model King Makes a Strong Debut, Crushing Llama 3

GPT-4o mini has barely settled onto its throne, and Mistral AI, in collaboration with NVIDIA, has already launched its latest and strongest small model: Mistral NeMo, with 12 billion parameters, outperforming Gemma 2 9B and Llama 3 8B.

First, Hugging Face released the small model SmolLM; then OpenAI entered the small-model battlefield directly with the launch of GPT-4o mini.

On the same day that GPT-4o mini was released, Europe's strongest AI startup, Mistral, immediately unveiled its latest and most powerful small model—Mistral NeMo.

Mistral NeMo, developed by Mistral AI and NVIDIA, features 12 billion parameters and supports a context length of 128K.

Overall, Mistral NeMo has surpassed Gemma 2 9B and Llama 3 8B in multiple benchmark tests.

Just a few days ago, Mistral released two small models: Mathstral 7B, designed for mathematical reasoning and scientific discovery, and the code model Codestral Mamba, one of the first open-source models to use the Mamba 2 architecture.

The newly released small model Mistral NeMo 12B targets enterprise users.

Developers can easily customize and deploy enterprise applications that support chatbots, multilingual tasks, coding, and summarization.

Mistral NeMo combines Mistral AI's expertise in training data with NVIDIA's optimized hardware and software ecosystem, positioning it well for adoption.

Guillaume Lample, co-founder and chief scientist of Mistral AI, stated, "We are fortunate to collaborate with the NVIDIA team, leveraging their top-notch hardware and software."

Mistral NeMo was trained on the NVIDIA DGX Cloud AI platform, which provides dedicated and scalable access to the latest NVIDIA architectures.

NVIDIA TensorRT-LLM, which accelerates large language model inference performance, as well as the NVIDIA NeMo development platform for building custom generative AI models, were also utilized to advance and optimize the new model's performance.

This collaboration highlights NVIDIA's commitment to supporting the model builder ecosystem.

Enterprise Track, Outstanding Performance

Mistral NeMo supports a context length of 128K, enabling it to process a wide range of complex information more coherently and accurately, ensuring that outputs are relevant to the context.

Compared to models of similar parameter size, it leads in inference, world knowledge, and coding accuracy.

As the table below shows, Gemma 2 9B beats Mistral NeMo only on the MMLU benchmark.

On benchmarks covering multi-turn dialogue, mathematics, common-sense reasoning, world knowledge, and coding, however, Mistral NeMo surpasses both Gemma 2 9B and Llama 3 8B.

Because it uses a standard architecture, Mistral NeMo is highly compatible, easy to adopt, and can serve as a drop-in replacement for any system currently using Mistral 7B.

Mistral NeMo is a model with 12 billion parameters, released under the Apache 2.0 license, allowing anyone to download and use it.

Additionally, the model supports inference in the FP8 data format, which shrinks its memory footprint and speeds up deployment without compromising accuracy.

This means the model can be deployed almost anywhere in minutes and flexibly applied to a wide range of scenarios, free of long waits and device constraints, making it an attractive choice for enterprises.
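To see why FP8 inference matters for single-GPU deployment, here is a rough, illustrative memory estimate (my own back-of-envelope math, not figures from the article; it counts weights only, while the KV cache and activations add more on top):

```python
# Illustrative back-of-envelope math: approximate weight memory for a
# 12B-parameter model at FP16 vs FP8 precision.
PARAMS = 12_000_000_000  # Mistral NeMo's parameter count

def weight_memory_gib(num_params: int, bytes_per_param: int) -> float:
    """Approximate weight memory in GiB at the given precision."""
    return num_params * bytes_per_param / 2**30

fp16_gib = weight_memory_gib(PARAMS, 2)  # FP16: 2 bytes per weight
fp8_gib = weight_memory_gib(PARAMS, 1)   # FP8: 1 byte per weight

print(f"FP16 weights: {fp16_gib:.1f} GiB")  # ~22.4 GiB
print(f"FP8 weights:  {fp8_gib:.1f} GiB")   # ~11.2 GiB
```

At FP8, the 12B weights fit comfortably within the 24 GB of a GeForce RTX 4090, leaving headroom for the KV cache; at FP16, the weights alone nearly exhaust it.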

Mistral NeMo targets enterprise users through enterprise-grade software that is part of NVIDIA AI Enterprise, with dedicated feature branches, rigorous validation processes, and enterprise-level security support.

The open model license also allows enterprises to seamlessly integrate Mistral NeMo into commercial applications.

The Mistral NeMo NIM is designed to fit in the memory of a single NVIDIA L40S, NVIDIA GeForce RTX 4090, or NVIDIA RTX 4500 GPU, delivering high efficiency at low cost while preserving security and privacy.

Development and Customization of Advanced Models

Mistral AI and NVIDIA have combined their respective areas of expertise to optimize the training and inference of Mistral NeMo.

The model leverages Mistral AI's expertise in training, particularly in multilingual, coding, and multi-turn content, benefiting from NVIDIA's full-stack accelerated training.

It is designed for optimal performance, utilizing efficient model parallelism, scalability, and mixed precision with Megatron-LM.

The model was trained with Megatron-LM, part of NVIDIA NeMo, on DGX Cloud using 3,072 H100 80GB Tensor Core GPUs, built on NVIDIA's AI architecture spanning accelerated computing, networking fabric, and software to boost training efficiency.

Multilingual Model for the Masses

The Mistral NeMo model is specifically designed for global multilingual applications.

It is trained for function calling, features a large context window, and performs robustly across many languages, including English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, and Hindi.
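The function-call training mentioned above can be made concrete with a sketch of the kind of request such models consume: an OpenAI-style tool definition attached to a chat message. The model identifier `open-mistral-nemo` and the `get_weather` tool below are illustrative assumptions, not details from the article; the exact wire format depends on the serving stack you use.

```python
import json

# Hedged sketch: an OpenAI-style chat request carrying a tool definition.
# Model id and tool name are illustrative, not confirmed by the article.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"}
            },
            "required": ["city"],
        },
    },
}]

payload = {
    "model": "open-mistral-nemo",  # assumed model identifier
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": tools,
}

# A function-calling model replies with the tool name plus JSON arguments
# (e.g. {"city": "Paris"}) instead of free-form text.
print(json.dumps(payload, indent=2))
```

The application then executes the named tool itself and feeds the result back to the model in a follow-up message.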

This can be seen as a significant step in bringing cutting-edge AI models to users of different languages around the world.

Performance of Mistral NeMo in Multilingual Benchmark Tests.

