Dan Ferguson | VFF - The signal in the noise

NVIDIA Nemotron 3 Ultra Arrives on AWS SageMaker

AWS has made NVIDIA's Nemotron 3 Ultra model available on Amazon SageMaker JumpStart with one-click deployment. The 550-billion-parameter model uses a hybrid Transformer-Mamba architecture that activates only 55 billion parameters (10% of the total) per forward pass, delivering 5x faster inference and up to 30% lower costs for agentic AI workloads. The model supports a context length of up to 1 million tokens and is optimized for the NVFP4 precision format.

by Dan Ferguson · 6 days ago · AWS Machine Learning Blog


Multimodal · News

NVIDIA Nemotron 3 Nano Omni Consolidates Multimodal AI for Agents

NVIDIA and AWS announced day-zero availability of Nemotron 3 Nano Omni on Amazon SageMaker JumpStart. The 30-billion-parameter multimodal model processes video, audio, images, and text in a single inference pass. It combines a language backbone, a vision encoder, and a speech encoder in a unified architecture that supports a 131K-token context length along with enterprise capabilities such as chain-of-thought reasoning and tool calling. This addresses a key pain point in agentic systems, which currently stitch together separate models for different modalities, increasing latency and complexity. The model is available in FP8 precision and is licensed for commercial use under NVIDIA's Open Model Agreement.

by Dan Ferguson · about 1 month ago · AWS Machine Learning Blog
