Qwerky 72B (free)
Category: Other · Modality: Text · Pricing: Free
Qwerky-72B is a linear-attention RWKV variant of the Qwen 2.5 72B model, designed to reduce computational cost significantly at scale. By replacing quadratic softmax attention with linear attention, it achieves substantial inference speedups (>1000x) while retaining competitive accuracy on common benchmarks such as ARC, HellaSwag, LAMBADA, and MMLU. It inherits knowledge and language support from Qwen 2.5, covering approximately 30 languages, which makes it well suited to efficient inference in large-context applications.
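To illustrate why linear attention scales better than standard attention, here is a minimal, non-causal sketch of kernelized linear attention in NumPy. This is purely illustrative and is not Qwerky's actual RWKV kernel: the feature map `phi(x) = elu(x) + 1` and all shapes are assumptions chosen for the demo.

```python
import numpy as np

def linear_attention(Q, K, V):
    """Toy kernelized linear attention (illustrative sketch, not Qwerky's
    actual RWKV recurrence). With a positive feature map phi, attention
    becomes
        out_t = phi(q_t) @ S / (phi(q_t) @ z)
    where S = sum_j phi(k_j)^T v_j and z = sum_j phi(k_j).
    The key/value summary S has a fixed size (d x d_v), so cost grows
    O(n) with sequence length n instead of O(n^2)."""
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # elu(x) + 1
    Qf, Kf = phi(Q), phi(K)
    S = Kf.T @ V               # (d, d_v) fixed-size key/value summary
    z = Kf.sum(axis=0)         # (d,) normalizer term
    return (Qf @ S) / (Qf @ z)[:, None]
```

A causal variant would accumulate `S` and `z` step by step as a recurrent state, which is what makes RWKV-style models run like an RNN at inference time.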
Parameters: 72B
Context Window: 32,768 tokens
Input Price: $0 per 1M tokens
Output Price: $0 per 1M tokens
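Since prices here are quoted per 1M tokens, the cost of a request is just each token count scaled by its per-million rate. Below is a small sketch of that arithmetic; the function name and the paid-model rates in the second call are hypothetical, used only to show a nonzero result.

```python
def request_cost(input_tokens, output_tokens,
                 input_price_per_m=0.0, output_price_per_m=0.0):
    """Cost of one request given per-1M-token prices.
    Defaults match Qwerky 72B (free): both prices are $0."""
    return ((input_tokens / 1_000_000) * input_price_per_m
            + (output_tokens / 1_000_000) * output_price_per_m)

# Qwerky 72B (free): any usage costs $0.
print(request_cost(3_000, 1_500))               # → 0.0
# Hypothetical paid model at $0.50 in / $1.50 out per 1M tokens:
print(request_cost(3_000, 1_500, 0.50, 1.50))   # ≈ $0.00375
```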
Capabilities
Model capabilities and supported modalities.

Performance
Reasoning: -
Math: -
Coding: -
Knowledge: Extensive knowledge base with broad coverage of topics

Modalities
Input: text
Output: text
Last Updated: 2025/05/06