Model Name | Family | Parameters (B) | Context Length | vLLM Support | LoRA Support |
---|---|---|---|---|---|
Nvidia-Llama-3.1-Nemotron-70B-Instruct-HF | llama3.1 | 70 | 131,072 | Yes | Yes |
Meta-Llama-3.2-3B-Instruct | llama3.2 | 3 | 131,072 | Yes | Yes |
Meta-Llama-3.2-1B-Instruct | llama3.2 | 1 | 131,072 | Yes | Yes |
Mistral-Small-Instruct-2409 | mistral | 22 | 128,000 | Yes | Yes |
Ministral-8B-Instruct-2410 | mistral | 8 | 128,000 | Yes | Yes |
Mathstral-7B-v0.1 | mistral | 7 | 32,000 | Yes | Yes |
Qwen2.5-Coder-7B-Instruct | qwen2.5 | 7 | 32,768 | Yes | Yes |
Aya-Expanse-32b | aya | 32 | 128,000 | Yes | No |
Aya-Expanse-8b | aya | 8 | 8,000 | Yes | No |
Nemotron-Mini-4B-Instruct | nemotron | 4 | 4,096 | Yes | No |
Gemma-2-2b-it | gemma2 | 2 | 8,192 | Yes | Yes |
Meta-Llama-3.1-70B-Instruct | llama3.1 | 70 | 131,072 | Yes | Yes |
Meta-Llama-3.1-70B | llama3.1 | 70 | 131,072 | Yes | Yes |
Meta-Llama-3.1-8B-Instruct | llama3.1 | 8 | 131,072 | Yes | Yes |
Meta-Llama-3.1-8B | llama3.1 | 8 | 131,072 | Yes | Yes |
Meta-Llama-3-70B-Instruct | llama3 | 70 | 8,192 | Yes | Yes |
Meta-Llama-3-70B | llama3 | 70 | 8,192 | Yes | Yes |
Meta-Llama-3-8B-Instruct | llama3 | 8 | 8,192 | Yes | Yes |
Meta-Llama-3-8B | llama3 | 8 | 8,192 | Yes | Yes |
Mixtral-8x7B-Instruct-v0.1 | mixtral | 46.7 | 32,768 | Yes | Yes |
Mistral-7B-Instruct-v0.3 | mistral | 7.2 | 32,768 | Yes | Yes |
Mistral-Nemo-Instruct-2407 | mistral | 12.2 | 128,000 | No | No |
Mistral-Nemo-Base-2407 | mistral | 12.2 | 128,000 | No | No |
Gemma-2-27b-it | gemma2 | 27 | 8,192 | Yes | Yes |
Gemma-2-27b | gemma2 | 27 | 8,192 | Yes | Yes |
Gemma-2-9b-it | gemma2 | 9 | 8,192 | Yes | Yes |
Gemma-2-9b | gemma2 | 9 | 8,192 | Yes | Yes |
Phi-3-medium-128k-instruct | phi3 | 14 | 128,000 | Yes | No |
Phi-3-medium-4k-instruct | phi3 | 14 | 4,000 | Yes | No |
Phi-3-small-128k-instruct | phi3 | 7.4 | 128,000 | Yes | No |
Phi-3-small-8k-instruct | phi3 | 7.4 | 8,000 | Yes | No |
Phi-3-mini-128k-instruct | phi3 | 3.8 | 128,000 | Yes | No |
Phi-3-mini-4k-instruct | phi3 | 3.8 | 4,096 | Yes | No |
Qwen2-72B-Instruct | qwen2 | 72 | 32,768 | Yes | Yes |
Qwen2-72B | qwen2 | 72 | 32,768 | Yes | Yes |
Qwen2-57B-A14B-Instruct | qwen2 | 57 | 32,768 | Yes | Yes |
Qwen2-57B-A14B | qwen2 | 57 | 32,768 | Yes | Yes |
Qwen2-7B-Instruct | qwen2 | 7 | 32,768 | Yes | Yes |
Qwen2-7B | qwen2 | 7 | 32,768 | Yes | Yes |
Qwen2-1.5B-Instruct | qwen2 | 1.5 | 32,768 | Yes | Yes |
Qwen2-1.5B | qwen2 | 1.5 | 32,768 | Yes | Yes |
Qwen2-0.5B-Instruct | qwen2 | 0.5 | 32,768 | Yes | Yes |
Qwen2-0.5B | qwen2 | 0.5 | 32,768 | Yes | Yes |
TinyLlama_v1.1 | tinyllama | 1.1 | 2,048 | No | No |
DeepSeek-Coder-V2-Lite-Base | deepseek-coder-v2 | 16 | 163,840 | No | No |
InternLM2_5-7B-Chat | internlm2.5 | 7.74 | 1,000,000 | Yes | No |
InternLM2_5-7B | internlm2.5 | 7.74 | 1,000,000 | Yes | No |
Jamba-v0.1 | jamba | 51.6 | 256,000 | Yes | Yes |
Yi-1.5-34B-Chat | yi-1.5 | 34.4 | 4,000 | Yes | Yes |
Yi-1.5-34B | yi-1.5 | 34.4 | 4,000 | Yes | Yes |
Yi-1.5-34B-32K | yi-1.5 | 34.4 | 32,000 | Yes | Yes |
Yi-1.5-34B-Chat-16K | yi-1.5 | 34.4 | 16,000 | Yes | Yes |
Yi-1.5-9B-Chat | yi-1.5 | 8.83 | 4,000 | Yes | Yes |
Yi-1.5-9B | yi-1.5 | 8.83 | 4,000 | Yes | Yes |
Yi-1.5-9B-32K | yi-1.5 | 8.83 | 32,000 | Yes | Yes |
Yi-1.5-9B-Chat-16K | yi-1.5 | 8.83 | 16,000 | Yes | Yes |
Yi-1.5-6B-Chat | yi-1.5 | 6 | 4,000 | Yes | Yes |
Yi-1.5-6B | yi-1.5 | 6 | 4,000 | Yes | Yes |
c4ai-command-r-v01 | command-r | 35 | 131,072 | Yes | No |