DeepSeek: R1 Distill Llama 8B
DeepSeek R1 Distill Llama 8B is a distilled large language model based on [Llama-3.1-8B-Instruct](/meta-llama/llama-3.1-8b-instruct), using outputs from [DeepSeek R1](/deepseek/deepseek-r1). The model combines advanced distillation techniques to achieve high performance across multiple benchmarks, including: - AIME 2024 pass@1: 50.4 - MATH-500 pass@1: 89.1 - CodeForces Rating: 1205 The model leverages fine-tuning from DeepSeek R1's outputs, enabling competitive performance comparable to larger frontier models. Hugging Face: - [Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B) - [DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B) |
Parameters
8B
Context Window
32,000
tokens
Input Price
$0.04
per 1M tokens
Output Price
$0.04
per 1M tokens
Capabilities
Model capabilities and supported modalities
Performance
Good reasoning with solid logical foundations
Strong mathematical capabilities, handles complex calculations well
Specialized in code generation with excellent programming capabilities
-
Modalities
text
text
LLM Price Calculator
Calculate the cost of using this model
Monthly Cost Estimator
Based on different usage levels