Efficient Mixture-of-Experts model optimized for strong performance at lower cost.
Efficient MoE Design. ~26B total parameters with ~4B active per token. Delivers near large-model quality at significantly lower compute cost.
Instruction Tuned. Built for chat, structured outputs, and general assistant use cases.
Cost-Performance Sweet Spot. Much cheaper to run than dense models of similar quality, making it practical for production workloads.
Apache 2.0 License. Fully open and permissive for commercial deployment.
Gemma 4 26B A4B IT is a strong "default at scale" model: good quality, lower cost, and easy to run. Ideal when you need to serve many requests without paying for a full dense model.
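The "~4B active per token" figure comes from top-k expert routing: each token's hidden state is scored against a pool of experts, and only the best k of them actually run. A minimal sketch of that routing step, with illustrative numbers (8 experts, k=2) that are assumptions for the example rather than the real Gemma 4 configuration:

```python
import math

def route_topk(router_logits, k=2):
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    topk = sorted(range(len(router_logits)),
                  key=lambda i: router_logits[i], reverse=True)[:k]
    m = max(router_logits[i] for i in topk)          # stabilize the softmax
    exps = [math.exp(router_logits[i] - m) for i in topk]
    z = sum(exps)
    return topk, [e / z for e in exps]

# One token, 8 experts: only the 2 best experts run, so roughly
# k/E = 2/8 of the expert parameters are "active" for this token.
logits = [0.1, 2.0, -0.5, 1.3, 0.0, -1.2, 0.7, 0.4]
experts, weights = route_topk(logits, k=2)
# experts -> [1, 3]; weights sum to 1.0
```

This is why an MoE model with ~26B total parameters can cost roughly what a ~4B dense model does per forward pass: the router skips the unselected experts entirely.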