AMD Radeon AI PRO R9700 benchmarks with ROCm and Vulkan and llama.cpp
The article presents benchmarks for the AMD Radeon AI PRO R9700 GPU on Arch Linux with ROCm 7.1.1, comparing the ROCm and Vulkan backends in llama.cpp for language models such as gpt-oss 20B and Mistral Small.
Why it matters
These benchmarks provide insights into the performance of AMD's Radeon AI PRO R9700 GPU for large language models, which is relevant for AI researchers and developers working on GPU-accelerated AI applications.
Key Points
- Benchmarks for a novel-summarization task using the gpt-oss 20B and Mistral Small models
- Detailed performance metrics for prompt processing (PP), token generation (TG), and total time (T) under different batch sizes
- Comparison of the ROCm and Vulkan backends, with ROCm showing slightly faster prompt processing and less performance impact from long context
Details
For the novel-summarization task, the gpt-oss 20B model with a batch size of 32 completed in 113 seconds, generating 18,000 output words, while Mistral Small with a batch of 3 took 479 seconds to generate 14,000 words. In the detailed benchmarks, ROCm usually shows slightly faster prompt processing and suffers less of a performance hit from long context, while Vulkan has slightly faster token generation. The author notes that the benchmark scripts were themselves generated by a language model, so some reported values may be hallucinated.
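A ROCm-versus-Vulkan comparison like the one described can be reproduced with llama.cpp's bundled llama-bench tool by building the project once per backend and sweeping prompt and batch sizes. The sketch below is an assumption about the author's setup, not the article's actual scripts; the model path and the specific `-p`/`-n`/`-b` values are placeholders.

```shell
# Build llama.cpp twice, once per backend (flags per llama.cpp's build docs).
cmake -B build-rocm -DGGML_HIP=ON && cmake --build build-rocm -j
cmake -B build-vulkan -DGGML_VULKAN=ON && cmake --build build-vulkan -j

# Run the same benchmark against each build. model.gguf is a placeholder.
# -ngl 99: offload all layers to the GPU
# -p 512,4096: prompt sizes, short vs long context (PP)
# -n 128: tokens to generate (TG)
# -b 512,2048: batch sizes to sweep
for dir in build-rocm build-vulkan; do
  "./$dir/bin/llama-bench" -m model.gguf -ngl 99 -p 512,4096 -n 128 -b 512,2048
done
```

llama-bench reports tokens/second for each pp/tg configuration, which maps directly onto the PP and TG columns the article tabulates.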