Top AI Hubs

MoonshotAI: Kimi Linear 48B A3B Instruct

Other
Text
Paid

Kimi Linear is a hybrid linear attention architecture that outperforms traditional full attention across short-context, long-context, and reinforcement learning (RL) scaling regimes. At its core is Kimi Delta Attention (KDA), a refinement of Gated DeltaNet that introduces a more efficient gating mechanism to make better use of the finite-state RNN memory. The gains in quality and hardware efficiency are largest on long-context tasks: Kimi Linear reduces KV cache usage by up to 75% and raises decoding throughput by up to 6x at context lengths of up to 1M tokens.
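
The idea of a fixed-size recurrent memory updated by a gated delta rule can be illustrated with a toy sketch. This is not Moonshot's implementation: the function name `gated_delta_step`, the per-channel decay vector `alpha`, and the write strength `beta` are assumptions chosen for illustration, and the real KDA kernel is chunked and hardware-optimized. The sketch only shows that the state keeps the same shape however many tokens are processed, which is why such layers need no growing KV cache.

```python
def gated_delta_step(S, q, k, v, alpha, beta):
    """One step of a toy gated delta rule (illustrative assumption, not KDA itself).

    S: d_k x d_v memory state (list of rows); q, k, alpha: length d_k;
    v: length d_v; beta: scalar write strength.
    """
    d_k, d_v = len(S), len(S[0])
    # Decay each key channel (row) of the old state by its own gate.
    S = [[alpha[i] * S[i][j] for j in range(d_v)] for i in range(d_k)]
    # Value the memory currently returns for key k.
    pred = [sum(k[i] * S[i][j] for i in range(d_k)) for j in range(d_v)]
    # Delta-rule write: move the stored value for k toward v.
    S = [[S[i][j] + beta * k[i] * (v[j] - pred[j]) for j in range(d_v)]
         for i in range(d_k)]
    # Read the updated memory with the query.
    out = [sum(q[i] * S[i][j] for i in range(d_k)) for j in range(d_v)]
    return S, out

# Write one key/value pair into an empty 3 x 2 state and read it back.
S = [[0.0] * 2 for _ in range(3)]
k = [1.0, 0.0, 0.0]
S, out = gated_delta_step(S, q=k, k=k, v=[2.0, -1.0], alpha=[1.0] * 3, beta=1.0)
print(out)  # [2.0, -1.0]
```

With no decay (`alpha` all ones), a unit-norm key, and `beta=1.0`, querying with the same key returns exactly the value just written; smaller gates and write strengths trade retention against overwriting.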

Parameters

48B total (about 3B active per token, as the "A3B" in the model name indicates)

Context Window

1,048,576 tokens

Input Price

$0.30 per 1M tokens

Output Price

$0.60 per 1M tokens

Capabilities

Model capabilities and supported modalities

Performance

Reasoning

-

Math

-

Coding

-

Knowledge

-

Modalities

Input Modalities

text

Output Modalities

text

LLM Price Calculator

Calculate the cost of using this model

Input Cost: $0.000450
Output Cost: $0.001800
Total Cost: $0.002250
Estimated usage: 4,500 tokens (1,500 input and 3,000 output, as implied by the costs and per-token prices listed above)
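
The calculator figures follow from the listed prices by simple proportion. Below is a minimal sketch of that arithmetic; the function name `request_cost` is hypothetical, and the 1,500/3,000 token split is inferred from the listed costs rather than stated on the page.

```python
INPUT_PRICE_PER_M = 0.30   # USD per 1M input tokens, from the listing
OUTPUT_PRICE_PER_M = 0.60  # USD per 1M output tokens, from the listing

def request_cost(input_tokens, output_tokens):
    """Return (input_cost, output_cost, total_cost) in USD for one request."""
    input_cost = input_tokens * INPUT_PRICE_PER_M / 1_000_000
    output_cost = output_tokens * OUTPUT_PRICE_PER_M / 1_000_000
    return input_cost, output_cost, input_cost + output_cost

# Token split inferred from the listed costs (an assumption, see above).
inp, outp, total = request_cost(1_500, 3_000)
print(f"Input: ${inp:.6f}  Output: ${outp:.6f}  Total: ${total:.6f}")
# Input: $0.000450  Output: $0.001800  Total: $0.002250
```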

Monthly Cost Estimator

Based on different usage levels

Light Usage: $0.0090 (~10 requests)
Moderate Usage: $0.0900 (~100 requests)
Heavy Usage: $0.9000 (~1,000 requests)
Enterprise: $9.0000 (~10,000 requests)
Note: Estimates based on current token count settings per request.
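
The tier figures scale linearly at $0.0009 per request, which is lower than the $0.002250 single-request total shown in the calculator, so the estimator presumably uses different per-request token settings. As a sketch, 1,000 input and 1,000 output tokens per request would reproduce the listed tiers at the listed prices; that split and the function name `monthly_cost` are assumptions, not values given on the page.

```python
def monthly_cost(requests, input_tokens=1_000, output_tokens=1_000,
                 input_price_per_m=0.30, output_price_per_m=0.60):
    """Estimated USD cost for `requests` calls with a fixed token count per call.

    The default token counts are an assumption chosen to match the listed tiers.
    """
    per_request = (input_tokens * input_price_per_m
                   + output_tokens * output_price_per_m) / 1_000_000
    return requests * per_request

tiers = {"Light": 10, "Moderate": 100, "Heavy": 1_000, "Enterprise": 10_000}
for name, n in tiers.items():
    print(f"{name}: ${monthly_cost(n):.4f} (~{n:,} requests)")
```

Passing other token counts, for example `monthly_cost(100, input_tokens=1_500, output_tokens=3_000)`, gives the estimate for a different per-request workload.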
Last Updated: 1970/01/21