SmolLM2-1.7B

SmolLM2-1.7B (1.7B params, Apache-2.0)

Small-footprint Llama-style model that punches well above its weight.

  • Spec sheet. 1.7B-param decoder, 8,192-token context, trained on 11T tokens (FineWeb-Edu, DCLM, The Stack, plus new math/code sets).

  • Strong small-model scores. Beats Qwen2.5-1.5B and Llama-1B on HellaSwag (68.7), ARC (60.5), PIQA (77.6), and more; the instruction-tuned build reaches MT-Bench 6.13.

  • Light on hardware. BF16 weights use ~3.4 GB of VRAM; a 4-bit quant fits in ≈0.8 GB, so it runs on laptops, Pi-class edge boxes, or cheap A10G cards (see the back-of-envelope sketch after this list).

  • Edge-friendly features. Good at instruction following, text rewriting, summarization, and function calling; English-first, but with solid general knowledge and basic math.

  • Built to run seamlessly within the Norman ecosystem. Get started by loading SmolLM2-1.7B directly in the Norman app.
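As a quick sanity check on the footprint figures above, here is a back-of-envelope estimate for the weights alone (an assumption-laden sketch: activation and KV-cache overhead excluded, decimal GB):

params = 1.7e9  # SmolLM2-1.7B parameter count

# Bytes per weight: 2 for BF16, 0.5 for a 4-bit quant.
bf16_gb = params * 2 / 1e9    # -> 3.4 GB, matching the BF16 figure above
int4_gb = params * 0.5 / 1e9  # -> ~0.85 GB; quantization metadata and runtime
                              #    overhead aside, this lands near the quoted ≈0.8 GB

print(f"BF16 weights: ~{bf16_gb:.1f} GB, 4-bit weights: ~{int4_gb:.2f} GB")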

Why pick it for Norman AI?

SmolLM2-1.7B delivers an 8K context and benchmark-beating accuracy inside a 4 GB envelope, under a permissive license. Perfect for edge deployments, “budget” inference tiers, or rapid fine-tune experiments without touching our heavier Qwen/Llama stacks.
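The snippet below sketches a multi-turn chat request through Norman's async invoke API; the norman client is assumed to be configured elsewhere in the app.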


# Multi-turn chat history in the standard role/content message format.
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Can you provide ways to eat combinations of bananas and dragonfruits?"},
    {"role": "assistant",
     "content": "Sure! Here are some ways to eat bananas and dragonfruits together"},
    {"role": "user", "content": "What about solving a 2x + 3 = 7 equation?"},
]

# Route the chat to SmolLM2-1.7B, the model this page documents
# (identifier assumed to follow Norman's usual model-name convention).
response = await norman.invoke(
    {
        "model_name": "smollm2-1.7b",
        "inputs": [
            {
                "display_title": "Prompt",
                "data": messages
            }
        ]
    }
)
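For running the checkpoint outside Norman (edge boxes, fine-tune experiments), a minimal local-inference sketch using the Hugging Face transformers chat API; the HuggingFaceTB/SmolLM2-1.7B-Instruct repo id and the scaffolding here are assumptions, not part of Norman's docs.

# Minimal local-inference sketch (assumes the HuggingFaceTB/SmolLM2-1.7B-Instruct
# checkpoint from the Hugging Face Hub; not a Norman API).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "HuggingFaceTB/SmolLM2-1.7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.bfloat16)

messages = [{"role": "user", "content": "What about solving a 2x + 3 = 7 equation?"}]
# apply_chat_template renders the turns with the model's own chat template.
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))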