Moonshot AI: Moonlight 16B A3B Instruct (free)
Moonlight-16B-A3B-Instruct is a 16B-parameter Mixture-of-Experts (MoE) language model developed by Moonshot AI, optimized for instruction-following tasks with 3B parameters activated per inference. The model advances the Pareto frontier of performance per FLOP across English, coding, math, and Chinese benchmarks, outperforming comparable models such as Llama3-3B and Deepseek-v2-Lite while remaining straightforward to deploy through its Hugging Face integration and compatibility with popular inference engines such as vLLM.
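Because the model is distributed through Hugging Face, it can be loaded with the transformers library. The sketch below is a minimal example, not official usage: the repository id moonshotai/Moonlight-16B-A3B-Instruct and the need for trust_remote_code are assumptions to verify against the model card.

```python
# Minimal sketch: loading Moonlight-16B-A3B-Instruct with Hugging Face transformers.
# The repo id and trust_remote_code requirement below are assumptions; check the model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "moonshotai/Moonlight-16B-A3B-Instruct"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",      # use bf16/fp16 automatically on supported GPUs
    device_map="auto",       # spread the MoE weights across available devices
    trust_remote_code=True,  # MoE checkpoints often ship custom modeling code
)

# Build a chat prompt with the model's chat template and generate a reply.
messages = [{"role": "user", "content": "Explain mixture-of-experts routing in one paragraph."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```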
Parameters: 16B
Context Window: 8,192 tokens
Input Price: $0 per 1M tokens
Output Price: $0 per 1M tokens
Capabilities
Model capabilities and supported modalities
Performance: strong mathematical capabilities; handles complex calculations well
Modalities: text input, text output
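The description also notes compatibility with inference engines such as vLLM. Below is a minimal offline-inference sketch under the same assumptions as above (repo id, trust_remote_code), and assuming a recent vLLM release that provides the LLM.chat API.

```python
# Minimal sketch: offline inference with vLLM (assumes a recent vLLM version with LLM.chat).
# The repo id is an assumption; custom MoE code may require trust_remote_code=True.
from vllm import LLM, SamplingParams

llm = LLM(model="moonshotai/Moonlight-16B-A3B-Instruct", trust_remote_code=True)
params = SamplingParams(temperature=0.7, max_tokens=256)

# Chat-style request; vLLM applies the model's chat template internally.
outputs = llm.chat(
    [{"role": "user", "content": "Solve 12 * 37 and explain your steps."}],
    sampling_params=params,
)
print(outputs[0].outputs[0].text)
```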