Heretic Abliteration Tool Adds Universal Support for New HF Architectures

The Heretic tool, which automatically removes censorship and refusals in local language models, has been updated to support dynamic auto-registration for new Hugging Face model architectures.

Why it matters

This update to the Heretic tool makes it easier to run and 'delobotomize' the latest language models, reducing censorship and refusals.

Key Points

  • Heretic now automatically parses config.json, imports the necessary classes, and registers new model architectures on the fly
  • This eliminates the need for manual patching when new models like GLM are released
  • Heretic was successfully tested on the GLM-4.6V-Flash multimodal 10B model, reducing the refusal rate from 100/100 to 63/100

Details

Heretic is an automated tool that removes censorship and refusals from local language models while keeping the Kullback-Leibler (KL) divergence from the original model very low. The latest update adds dynamic auto-registration, which lets Heretic handle new or unsupported Hugging Face model architectures. When Transformers raises an 'unrecognized config' error, Heretic now parses the model's config.json, imports the necessary config, auto, and model classes, and registers them on the fly so that the model loads successfully. This eliminates the need for manual patching every time a new model architecture like GLM is released.

The update was tested on the GLM-4.6V-Flash multimodal 10B model. The model loaded without problems on a single Nvidia 4090 GPU, measured a KL of 0.0000 (essentially identical to the original), and saw its refusal rate on 'spicy' prompts drop from 100/100 to 63/100, a significant improvement over previous Heretic versions.
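The auto-registration flow described above can be sketched roughly as follows. This is a minimal illustration, not Heretic's actual code: the helper names (`parse_architecture`, `resolve_class`, `register_from_config`) and the `config_class_name` parameter are hypothetical, and the sketch assumes both classes already ship inside the installed `transformers` package. The only Transformers calls it relies on are `AutoConfig.register` and `AutoModelForCausalLM.register`.

```python
import importlib
import json
from pathlib import Path


def parse_architecture(config_text):
    """Extract (model_type, architecture class names) from the text of a config.json."""
    config = json.loads(config_text)
    return config["model_type"], list(config.get("architectures", []))


def resolve_class(module_name, class_name):
    """Return the named attribute of a module, or None if either is missing."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, class_name, None)


def register_from_config(model_dir, config_class_name):
    """Hypothetical helper, not Heretic's code: register an architecture on the fly."""
    # Imported lazily so the parsing helpers above work without Transformers installed.
    from transformers import AutoConfig, AutoModelForCausalLM

    config_text = (Path(model_dir) / "config.json").read_text(encoding="utf-8")
    model_type, architectures = parse_architecture(config_text)
    if not architectures:
        raise ValueError("config.json lists no architectures")

    # Assumption: both classes are importable from the transformers package.
    config_cls = resolve_class("transformers", config_class_name)
    model_cls = resolve_class("transformers", architectures[0])
    if config_cls is None or model_cls is None:
        raise ValueError(f"cannot import classes for model_type {model_type!r}")

    # Teach the Auto classes about the new model_type and its model class.
    AutoConfig.register(model_type, config_cls)
    AutoModelForCausalLM.register(config_cls, model_cls)
    return model_type, model_cls
```

A caller would presumably invoke `register_from_config` only after a normal load fails with the 'unrecognized config' error, then retry the load. How Heretic itself discovers the right config class name is not stated in the source, which is why the sketch takes it as an explicit argument.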
