Enabling Maximum Performance Mode on NVIDIA Jetson AGX Orin 64 GB

This article explains how to configure an NVIDIA Jetson AGX Orin 64 GB Developer Kit running Ubuntu 22.04.5 LTS and JetPack 6.2.2 to operate in maximum performance mode for AI workloads, especially LLM inference.

Why it matters

Enabling maximum performance on the Jetson AGX Orin is crucial for achieving high-throughput inference on large language models and other compute-intensive AI applications.

Key Points

  • Describes how to select the MAXN power mode and lock system clocks at their highest frequencies
  • Verifies the configuration using NVIDIA tools and simple benchmarks
  • Documents the practical impact of enabling MAXN and jetson_clocks, reporting up to a roughly 3x increase in GPU frequency and token generation throughput
  • Covers how to persist these settings using a systemd service for consistent high-performance operation
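
The steps listed above can be sketched as a small Python helper. This is an illustrative sketch, not the original article's script. It assumes that MAXN is power mode ID 0 on this board (check with `nvpmodel -q` first), that the tools live at `/usr/sbin/nvpmodel` and `/usr/bin/jetson_clocks`, and that the systemd unit shown is a hypothetical layout. By default it only prints the commands and does not run them.

```python
import shlex
import subprocess

# Assumption: MAXN is power mode ID 0 on the AGX Orin; confirm with `nvpmodel -q`.
MAXN_MODE_ID = 0


def build_commands(mode_id=MAXN_MODE_ID):
    """Return the command sequence: inspect, select mode, lock clocks, verify."""
    return [
        ["sudo", "nvpmodel", "-q"],               # show the current power mode
        ["sudo", "nvpmodel", "-m", str(mode_id)],  # select the power mode
        ["sudo", "jetson_clocks"],                 # lock clocks at their maximum
        ["sudo", "jetson_clocks", "--show"],       # print the resulting clock state
    ]


def apply(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    rendered = []
    for cmd in commands:
        line = shlex.join(cmd)
        rendered.append(line)
        print(line)
        if not dry_run:
            subprocess.run(cmd, check=True)
    return rendered


def render_unit(mode_id=MAXN_MODE_ID):
    """Render a hypothetical oneshot unit that reapplies the settings at boot."""
    return "\n".join([
        "[Unit]",
        "Description=Select MAXN and lock clocks at boot (illustrative)",
        "",
        "[Service]",
        "Type=oneshot",
        "RemainAfterExit=yes",
        # Assumed install paths; adjust if the tools live elsewhere.
        f"ExecStart=/usr/sbin/nvpmodel -m {mode_id}",
        "ExecStart=/usr/bin/jetson_clocks",
        "",
        "[Install]",
        "WantedBy=multi-user.target",
        "",
    ])


if __name__ == "__main__":
    apply(build_commands(), dry_run=True)  # dry run: nothing is executed
    print(render_unit())
```

Calling `apply(build_commands(), dry_run=False)` on the device would run the commands for real. The rendered unit text would need to be saved under a name of your choosing in `/etc/systemd/system/` and enabled with `systemctl enable` before it takes effect at boot.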

Details

The article targets users who want reproducible, high-throughput inference on a Jetson AGX Orin while remaining aware of thermal and power constraints. It describes the hardware and software environment, including the Jetson AGX Orin 64 GB configuration with up to 275 TOPS in MAXN mode. The guide covers inspecting and selecting the MAXN power mode, then locking CPU, GPU, and memory clocks to their maximum frequencies using NVIDIA's nvpmodel and jetson_clocks tools. According to the article, this configuration can raise LLM inference throughput from 8 tokens per second to 18-25 tokens per second, an improvement of about 2.25x to 3.1x. The article also discusses persisting the high-performance settings with a systemd service so that the device consistently boots into a state suitable for heavy AI workloads.
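
The throughput figures quoted above (8 tokens per second before, 18-25 after) can be checked with a short calculation. The sketch below is only arithmetic on those quoted numbers; the function name is a hypothetical helper, not something from the article.

```python
def speedup_range(baseline, low, high):
    """Return the (min, max) speedup of a throughput range over a baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return low / baseline, high / baseline


# Figures quoted in the summary: 8 tokens/s before, 18-25 tokens/s after.
lo, hi = speedup_range(8.0, 18.0, 25.0)
print(f"{lo:.3f}x to {hi:.3f}x")  # prints "2.250x to 3.125x"
```

The "3x" figure therefore corresponds to the upper end of the reported range; the lower end is closer to 2.25x.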
