LLMs Available for Fine-tuning

This table provides an overview of the Large Language Models (LLMs) available for fine-tuning, ordered roughly from most to least well-known. For each model it lists the name, family, parameter count, context length, and whether vLLM inference and LoRA adapters are supported.

| Model Name | Family | Parameters (B) | Context Length | vLLM Support | LoRA Support |
|---|---|---|---|---|---|
| Nvidia-Llama-3.1-Nemotron-70B-Instruct-HF | llama3.1 | 70 | 131,072 | Yes | Yes |
| Meta-Llama-3.2-3B-Instruct | llama3.2 | 3 | 131,072 | Yes | Yes |
| Meta-Llama-3.2-1B-Instruct | llama3.2 | 1 | 131,072 | Yes | Yes |
| Mistral-Small-Instruct-2409 | mistral | 7.2 | 128,000 | Yes | Yes |
| Ministral-8B-Instruct-2410 | mistral | 8 | 128,000 | Yes | Yes |
| Mathstral-7B-v0.1 | mistral | 7 | 32,000 | Yes | Yes |
| Qwen2.5-Coder-7B-Instruct | qwen2.5 | 7 | 32,768 | Yes | Yes |
| Aya-Expanse-32b | aya | 32 | 128,000 | Yes | No |
| Aya-Expanse-8b | aya | 8 | 8,000 | Yes | No |
| Nemotron-Mini-4B-Instruct | nemotron | 4 | 4,096 | Yes | No |
| Gemma-2-2b-it | gemma2 | 2 | 8,192 | Yes | Yes |
| Meta-Llama-3.1-70B-Instruct | llama3.1 | 70 | 131,072 | Yes | Yes |
| Meta-Llama-3.1-70B | llama3.1 | 70 | 131,072 | Yes | Yes |
| Meta-Llama-3.1-8B-Instruct | llama3.1 | 8 | 131,072 | Yes | Yes |
| Meta-Llama-3.1-8B | llama3.1 | 8 | 131,072 | Yes | Yes |
| Meta-Llama-3-70B-Instruct | llama3 | 70 | 8,192 | Yes | Yes |
| Meta-Llama-3-70B | llama3 | 70 | 8,192 | Yes | Yes |
| Meta-Llama-3-8B-Instruct | llama3 | 8 | 8,192 | Yes | Yes |
| Meta-Llama-3-8B | llama3 | 8 | 8,192 | Yes | Yes |
| Mixtral-8x7B-Instruct-v0.1 | mixtral | 46.7 | 32,768 | Yes | Yes |
| Mistral-7B-Instruct-v0.3 | mistral | 7.2 | 32,768 | Yes | Yes |
| Mistral-Nemo-Instruct-2407 | mistral | 12.2 | 128,000 | No | No |
| Mistral-Nemo-Base-2407 | mistral | 12.2 | 128,000 | No | No |
| Gemma-2-27b-it | gemma2 | 27 | 8,192 | Yes | Yes |
| Gemma-2-27b | gemma2 | 27 | 8,192 | Yes | Yes |
| Gemma-2-9b-it | gemma2 | 9 | 8,192 | Yes | Yes |
| Gemma-2-9b | gemma2 | 9 | 8,192 | Yes | Yes |
| Phi-3-medium-128k-instruct | phi3 | 14 | 128,000 | Yes | No |
| Phi-3-medium-4k-instruct | phi3 | 14 | 4,000 | Yes | No |
| Phi-3-small-128k-instruct | phi3 | 7.4 | 128,000 | Yes | No |
| Phi-3-small-8k-instruct | phi3 | 7.4 | 8,000 | Yes | No |
| Phi-3-mini-128k-instruct | phi3 | 3.8 | 128,000 | Yes | No |
| Phi-3-mini-4k-instruct | phi3 | 3.8 | 4,096 | Yes | No |
| Qwen2-72B-Instruct | qwen2 | 72 | 32,768 | Yes | Yes |
| Qwen2-72B | qwen2 | 72 | 32,768 | Yes | Yes |
| Qwen2-57B-A14B-Instruct | qwen2 | 57 | 32,768 | Yes | Yes |
| Qwen2-57B-A14B | qwen2 | 57 | 32,768 | Yes | Yes |
| Qwen2-7B-Instruct | qwen2 | 7 | 32,768 | Yes | Yes |
| Qwen2-7B | qwen2 | 7 | 32,768 | Yes | Yes |
| Qwen2-1.5B-Instruct | qwen2 | 1.5 | 32,768 | Yes | Yes |
| Qwen2-1.5B | qwen2 | 1.5 | 32,768 | Yes | Yes |
| Qwen2-0.5B-Instruct | qwen2 | 0.5 | 32,768 | Yes | Yes |
| Qwen2-0.5B | qwen2 | 0.5 | 32,768 | Yes | Yes |
| TinyLlama_v1.1 | tinyllama | 1.1 | 2,048 | No | No |
| DeepSeek-Coder-V2-Lite-Base | deepseek-coder-v2 | 16 | 163,840 | No | No |
| InternLM2_5-7B-Chat | internlm2.5 | 7.74 | 1,000,000 | Yes | No |
| InternLM2_5-7B | internlm2.5 | 7.74 | 1,000,000 | Yes | No |
| Jamba-v0.1 | jamba | 51.6 | 256,000 | Yes | Yes |
| Yi-1.5-34B-Chat | yi-1.5 | 34.4 | 4,000 | Yes | Yes |
| Yi-1.5-34B | yi-1.5 | 34.4 | 4,000 | Yes | Yes |
| Yi-1.5-34B-32K | yi-1.5 | 34.4 | 32,000 | Yes | Yes |
| Yi-1.5-34B-Chat-16K | yi-1.5 | 34.4 | 16,000 | Yes | Yes |
| Yi-1.5-9B-Chat | yi-1.5 | 8.83 | 4,000 | Yes | Yes |
| Yi-1.5-9B | yi-1.5 | 8.83 | 4,000 | Yes | Yes |
| Yi-1.5-9B-32K | yi-1.5 | 8.83 | 32,000 | Yes | Yes |
| Yi-1.5-9B-Chat-16K | yi-1.5 | 8.83 | 16,000 | Yes | Yes |
| Yi-1.5-6B-Chat | yi-1.5 | 6 | 4,000 | Yes | Yes |
| Yi-1.5-6B | yi-1.5 | 6 | 4,000 | Yes | Yes |
| c4ai-command-r-v01 | command-r | 35 | 131,072 | Yes | No |

Notes:

  • “vLLM Support” indicates whether the model is compatible with the vLLM inference engine.
  • “LoRA Support” indicates whether vLLM can serve the model with multiple LoRA adapters, as sketched in the example after these notes.
  • Context length is measured in tokens; the effective context window can vary with the target inference library.
  • Parameter count is shown in billions (B).
  • Links lead to the model’s page on Hugging Face or the official website when available.
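
A minimal sketch of the multi-LoRA serving mentioned above, using vLLM's offline Python API. The base model is taken from the table; the adapter name, ID, and path are placeholders for your own fine-tuned weights, not part of this table:

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Load a base model from the table with LoRA serving enabled.
llm = LLM(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    enable_lora=True,  # allow adapters to be attached per request
    max_loras=4,       # how many adapters may be active at once
)

sampling = SamplingParams(temperature=0.7, max_tokens=128)

# Route this request through one fine-tuned adapter; other requests can
# use different adapters on top of the same base model.
# "/path/to/lora_adapter" is a placeholder for your adapter directory.
outputs = llm.generate(
    ["Summarize the benefits of LoRA fine-tuning."],
    sampling,
    lora_request=LoRARequest("my-adapter", 1, "/path/to/lora_adapter"),
)
print(outputs[0].outputs[0].text)
```

Models marked “No” under LoRA Support would be served the same way, just without enable_lora and the lora_request argument.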

This table provides a comprehensive overview of the available models, their sizes, and their capabilities. When choosing a model for fine-tuning, consider factors such as model size, context length, compatibility with the vLLM inference engine, and LoRA support, as in the short sketch below.
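
As a rough illustration of that selection process, here is a hypothetical Python sketch that filters a small excerpt of the table by those criteria; the helper function and its thresholds are illustrative, not part of any library:

```python
# Excerpt of the table above as plain records.
models = [
    {"name": "Meta-Llama-3.1-8B-Instruct", "params_b": 8,   "context": 131_072, "vllm": True,  "lora": True},
    {"name": "Mistral-7B-Instruct-v0.3",   "params_b": 7.2, "context": 32_768,  "vllm": True,  "lora": True},
    {"name": "Phi-3-mini-128k-instruct",   "params_b": 3.8, "context": 128_000, "vllm": True,  "lora": False},
    {"name": "TinyLlama_v1.1",             "params_b": 1.1, "context": 2_048,   "vllm": False, "lora": False},
]

def candidates(max_params_b, min_context, need_lora=True):
    """Filter models by size, context window, and LoRA support."""
    return [
        m["name"] for m in models
        if m["params_b"] <= max_params_b
        and m["context"] >= min_context
        and (m["lora"] or not need_lora)
    ]

# Example: models under 10B parameters with at least 100k context and LoRA support.
print(candidates(max_params_b=10, min_context=100_000))
# -> ['Meta-Llama-3.1-8B-Instruct']
```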