Request: Training a Pretrained MoE Version of Mistral Nemo
A student has converted the Mistral Nemo language model into a 16-expert Mixture-of-Experts (MoE) model, but due to budget constraints, cannot afford further fine-tuning. The student hopes someone will take interest in the model and provide a trained version.
Why it matters
This request highlights the challenges faced by students and researchers with limited resources in developing and improving large language models.
Key Points
- Mistral Nemo has been converted from a dense model into a 16-expert MoE model
- The student has budget constraints and cannot afford full-parameter or extended fine-tuning
- The model currently has issues with coherence and ignoring instructions
- If someone releases a trained version, the student can expand the expert pool and release a version with expanded parameter capacity
Details
The student converted Mistral Nemo, previously a dense model, into a 16-expert Mixture-of-Experts (MoE) model in an effort to improve its capabilities. However, because of budget constraints and reliance on a rented GPU, the student could not afford full-parameter or extended fine-tuning. As a result, the model currently has coherence problems and often ignores instructions. The student hopes someone will take an interest in the model and release a trained version, which would let the student expand the expert pool and publish a version with expanded parameter capacity, effectively restoring the capabilities of the original Mistral Nemo model.
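The post does not say how the conversion was performed, but a common dense-to-MoE recipe is "upcycling": copy the dense feed-forward block into each expert and add a freshly initialized router. The PyTorch sketch below illustrates this under stated assumptions; the class names (DenseMLP, MoELayer), the SwiGLU-style FFN, and the top-2 routing are illustrative assumptions, not the student's actual code.

```python
# Minimal sketch of dense-to-MoE "upcycling", assuming a PyTorch-style
# transformer whose FFN is a gated (SwiGLU-style) MLP, as in Mistral models.
import copy

import torch
import torch.nn as nn
import torch.nn.functional as F


class DenseMLP(nn.Module):
    """Stand-in for one dense FFN block (hypothetical, for illustration)."""

    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.gate_proj = nn.Linear(d_model, d_ff, bias=False)
        self.up_proj = nn.Linear(d_model, d_ff, bias=False)
        self.down_proj = nn.Linear(d_ff, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down_proj(F.silu(self.gate_proj(x)) * self.up_proj(x))


class MoELayer(nn.Module):
    """Replaces one dense FFN with num_experts copies plus a learned router."""

    def __init__(self, dense_mlp: DenseMLP, d_model: int,
                 num_experts: int = 16, top_k: int = 2):
        super().__init__()
        # Upcycling: every expert starts as an exact copy of the dense FFN,
        # so the converted model initially computes a similar function.
        self.experts = nn.ModuleList(
            copy.deepcopy(dense_mlp) for _ in range(num_experts)
        )
        # The router is new and randomly initialized. Untrained routing is a
        # plausible source of the coherence issues the student describes.
        self.router = nn.Linear(d_model, num_experts, bias=False)
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, d_model)
        probs = self.router(x).softmax(dim=-1)
        weights, idx = probs.topk(self.top_k, dim=-1)
        weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize
        out = torch.zeros_like(x)
        # Dispatch each token to its top-k experts and mix the outputs.
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k:k + 1] * expert(x[mask])
        return out
```

On the "expand the expert pool" idea: one plausible reading is that, given a trained 16-expert checkpoint, the trained experts could be duplicated and the router's output dimension widened to add capacity, though the post does not specify a method.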