Technical Report on Training an Uncensored LLM

Introduction

This document provides a step-by-step guide for training an uncensored large language model, referred to as CAI Proprietary LLM, using fine-tuning techniques. The training pipeline includes dataset curation, supervised fine-tuning (SFT), and direct preference optimization (DPO).

Model Overview

CAI Proprietary LLM is a highly steerable instruct and chat-tuned model developed using a base Llama 3.1 model. The model follows system prompts precisely and can be configured to respond to various requests without censorship. It supports functionalities such as structured reasoning, retrieval-augmented generation (RAG), tool use, and long-context interactions.

Supported model sizes:

  • 8B parameters
  • 70B parameters
  • 405B parameters

Each version retains the core steerability properties and can be fine-tuned based on specific requirements.

Dataset Curation

Training CAI Proprietary LLM requires a high-quality instruction dataset. To enhance model adaptability, the data mixture should cover a broad range of domains, combining general-purpose instructions with domain-specific data.

Filtering Process

To ensure dataset quality:

  • Remove low-quality responses.
  • Filter out empty turns or improperly formatted conversations.
  • Prioritize high-quality synthetic data over noisy real-world data.
  • Apply token length thresholds for balancing sample lengths.
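Concretely, the filtering steps above can be sketched as a single pass over the raw conversations; the record layout, whitespace tokenization, and thresholds below are illustrative assumptions, not the production pipeline:

```python
def filter_conversations(conversations, min_tokens=10, max_tokens=8192):
    """Keep only well-formed conversations within token-length bounds (illustrative)."""
    kept = []
    for conv in conversations:
        turns = conv.get("turns", [])
        # Filter out empty turns or improperly formatted conversations.
        if not turns or any(not t.get("content", "").strip() for t in turns):
            continue
        # Apply token-length thresholds; whitespace splitting stands in for a real tokenizer.
        n_tokens = sum(len(t["content"].split()) for t in turns)
        if min_tokens <= n_tokens <= max_tokens:
            kept.append(conv)
    return kept
```

In practice, dropping low-quality responses would be a separate heuristic- or model-based scoring pass that runs before this structural filter.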

Training Pipeline

Supervised Fine-Tuning (SFT)

SFT is conducted on base models (Llama 3.1 8B, 70B, 405B) using a mixture of instruction-tuned datasets.

Hyperparameters

  • Optimizer: AdamW
  • Weight Decay: 0.01
  • Peak Learning Rate: 7 × 10⁻⁶ (8B/70B), 3.5 × 10⁻⁶ (405B)
  • Warmup Steps: 300
  • Training Epochs: 4
  • Context Length: 8192 tokens
  • Batch Size: 48 (8B/70B), 128 (405B)
  • Training Framework: Axolotl (customized for efficiency)
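The warmup and peak values above imply a learning-rate schedule of the following shape. The total step count and the cosine decay after warmup are assumptions for illustration; the report does not specify the decay curve:

```python
import math

PEAK_LR = 7e-6        # 8B/70B setting; 3.5e-6 for the 405B model
WARMUP_STEPS = 300
TOTAL_STEPS = 10_000  # hypothetical run length

def lr_at(step):
    """Linear warmup to the peak learning rate, then cosine decay to zero (assumed)."""
    if step < WARMUP_STEPS:
        return PEAK_LR * (step + 1) / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return 0.5 * PEAK_LR * (1.0 + math.cos(math.pi * progress))
```

In an Axolotl config these values would map to the learning-rate, warmup, and scheduler fields of the training YAML.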

Training Infrastructure

  • 8B & 70B Models: 6 HGX nodes (48 GPUs) with PyTorch FSDP.
  • 405B Model: Minimum 16 HGX nodes (128 GPUs) with CPU offloading.

Note: Sample packing with Flash Attention 2 maximizes training throughput by filling each sequence with multiple samples instead of padding.
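The packing idea in the note above can be illustrated with a greedy first-fit scheme over sample token lengths; Axolotl's actual multipack implementation and the Flash Attention 2 masking are more involved:

```python
def pack_samples(lengths, context_len=8192):
    """Greedily pack per-sample token counts into bins of at most context_len tokens."""
    bins = []  # each bin is a list of sample lengths that will share one sequence
    for n in sorted(lengths, reverse=True):
        for b in bins:
            if sum(b) + n <= context_len:
                b.append(n)
                break
        else:
            bins.append([n])  # no existing bin fits; open a new sequence
    return bins
```

Each bin then becomes one 8192-token training sequence, with attention masked so packed samples cannot attend to each other.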

Direct Preference Optimization (DPO)

DPO is applied after SFT using a LoRA-based approach, aligning the model with user preferences while reducing computational overhead.
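The underlying objective is compact: DPO raises the policy's log-probability margin for the preferred response over the rejected one, measured relative to a frozen reference model (the post-SFT checkpoint). A minimal per-pair sketch, with β as an illustrative strength parameter:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair: -log(sigmoid(beta * margin))."""
    chosen_ratio = policy_chosen_logp - ref_chosen_logp        # policy vs. reference on chosen
    rejected_ratio = policy_rejected_logp - ref_rejected_logp  # policy vs. reference on rejected
    logits = beta * (chosen_ratio - rejected_ratio)
    return -math.log(1.0 / (1.0 + math.exp(-logits)))
```

When the policy matches the reference, the loss is log 2; it falls as the policy learns to favor the chosen response.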

Hyperparameters

  • Adapter Type: LoRA (Low-Rank Adaptation)
  • LoRA Rank: r = 32
  • Scaling Factor (α): 16
  • Dropout: 0.05
  • Optimizer: RMSProp
  • Peak Learning Rate: 3 × 10⁻⁶
  • Warmup Steps: 9
  • Training Technique: NEFTune (noisy embedding fine-tuning)
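Given the adapter settings above, the LoRA update itself is just a scaled low-rank correction to each frozen weight matrix, W' = W + (α/r)·BA. A plain-Python sketch with toy dimensions; real adapters act on the model's attention and MLP projection matrices:

```python
def matmul(X, Y):
    """Naive matrix multiply for small illustrative matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def lora_weight(W, A, B, alpha=16, r=32):
    """Merge a LoRA adapter into a frozen weight: W + (alpha / r) * B @ A."""
    scale = alpha / r
    delta = matmul(B, A)  # d_out x d_in low-rank update
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]
```

At r = 32 and α = 16, the effective scale is 0.5, keeping the adapter's contribution modest relative to the frozen weights.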

DPO fine-tuning results in moderate performance improvements in user-aligned responses while maintaining uncensored behavior.

Model Deployment Considerations

Quantization & Inference Optimization

For efficient inference, models are quantized to FP8 with the llm-compressor library and served with vLLM.
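Conceptually, per-tensor FP8 quantization divides weights by a scale chosen so the largest magnitude maps to the E4M3 limit of ±448. The toy sketch below shows only the scaling step; the real llm-compressor flow adds calibration, per-channel scales, and the lossy FP8 mantissa rounding this sketch omits:

```python
E4M3_MAX = 448.0  # largest finite magnitude representable in FP8 E4M3

def quantize_fp8_like(weights):
    """Per-tensor symmetric scaling into the E4M3 range (toy; no mantissa rounding)."""
    scale = max(abs(w) for w in weights) / E4M3_MAX
    q = [max(-E4M3_MAX, min(E4M3_MAX, w / scale)) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate original values from quantized entries and the scale."""
    return [v * scale for v in q]
```

vLLM then consumes the quantized checkpoint directly at serving time.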

Safety & Policy Controls

  • Model Alignment: Default behavior is neutral, without restrictive safety layers.
  • System-Level Controls: Guardrails should be implemented at the application layer rather than modifying the model weights.
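An application-layer guardrail of the kind described above can be as simple as a policy check wrapped around the model call. The blocklist and generation function here are placeholders; production systems typically use a dedicated moderation model rather than string matching:

```python
BLOCKED_TOPICS = {"example_banned_topic"}  # placeholder application policy

def guarded_generate(prompt, generate_fn):
    """Enforce policy at the application layer, before and after generation."""
    if any(topic in prompt.lower() for topic in BLOCKED_TOPICS):
        return "Request declined by application policy."
    response = generate_fn(prompt)
    if any(topic in response.lower() for topic in BLOCKED_TOPICS):
        return "Response withheld by application policy."
    return response
```

Keeping this logic outside the weights lets each deployment set its own policy without retraining the model.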

Evaluation Benchmarks

Final model evaluations are conducted on standard public benchmarks that validate the model's performance on general reasoning, factual knowledge, and instruction-following tasks.

Conclusion

By following this guide, developers can reproduce the CAI Proprietary LLM training pipeline, which delivers high performance across reasoning, generation, and tool-augmented tasks while maintaining neutrality in responses.

For further optimizations, developers can explore:

  • Scaling to larger context windows
  • Advanced RAG integration
  • Multi-modal fine-tuning (text & images)

For additional inquiries or implementation support, refer to the open-source repositories used in training or customize the pipeline to fit specific project needs.
