
The Global Model Families

🌏 Qwen, Gemma & Others · 10 min · 100 BASE XP

Beyond Llama & Mistral

Qwen (Alibaba)

The Qwen 3.6 family offers both dense and MoE architectures:

Model           | Type  | Active Params | Best For
Qwen3.6-27B     | Dense | 27B           | Consistent high performance
Qwen3.6-35B-A3B | MoE   | 3B of 35B     | Ultra-efficient inference
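The MoE row above comes down to simple arithmetic: only the shared (always-on) weights plus a few routed experts are active per token, which is how a 35B-parameter model can run with roughly 3B active parameters. A minimal sketch, where the expert count, routing top-k, and shared-weight figure are illustrative assumptions, not published Qwen 3.6 numbers:

```python
def active_params_billion(total_expert_b: float, n_experts: int,
                          top_k: int, shared_b: float) -> float:
    """Active parameters per token for a MoE model, in billions:
    shared (always-on) weights plus the top-k routed experts."""
    per_expert_b = total_expert_b / n_experts
    return shared_b + top_k * per_expert_b

# Illustrative numbers only: 32B spread across 64 experts, 4 routed
# per token, plus 1B of shared attention/embedding weights.
print(active_params_billion(32.0, 64, 4, 1.0))  # → 3.0
```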

Key feature: Thinking/Non-Thinking modes — switch between deep reasoning and fast responses in a single model.
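If Qwen 3.6 follows the convention of earlier Qwen releases, Thinking mode wraps its reasoning in `<think>...</think>` tags ahead of the final answer, while Non-Thinking mode emits the answer directly. A sketch for separating the two, assuming that tag format:

```python
import re

def split_thinking(response: str) -> tuple[str, str]:
    """Split a model response into (reasoning, answer), assuming the
    reasoning is wrapped in <think>...</think> as in earlier Qwen
    models. Non-Thinking responses return an empty reasoning string."""
    m = re.search(r"<think>(.*?)</think>", response, re.DOTALL)
    if m:
        return m.group(1).strip(), response[m.end():].strip()
    return "", response.strip()

reasoning, answer = split_thinking("<think>2+2 is 4.</think>The answer is 4.")
print(answer)  # → The answer is 4.
```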

Google Gemma 4

Apache 2.0-licensed and edge-optimized, with native multimodality:

  • Gemma 4 E2B: ~2.3B params, smartphones & IoT
  • Gemma 4 E4B: ~4.5B params, flagship mobile devices
  • 128K context window, 2-bit/4-bit quantization support
  • Runs on Android, iOS, Raspberry Pi, and in-browser via WebGPU
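The quantization bullet translates directly into memory math: the weight-only footprint is roughly parameter count times bits per weight. A back-of-the-envelope sketch that ignores activations, KV cache, and runtime overhead:

```python
def weight_memory_gb(params_billion: float, bits: int) -> float:
    """Approximate weight-only memory footprint in GB:
    params (billions) * bits per weight / 8 bits per byte."""
    return params_billion * bits / 8

# ~4.5B params (the E4B figure above) at different precisions:
for bits in (16, 4, 2):
    print(f"{bits:>2}-bit: ~{weight_memory_gb(4.5, bits):.2f} GB")
```

At 2-bit precision the E4B weights fit in just over a gigabyte, which is what makes flagship-phone deployment plausible.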

Other Notable Families

Family       | Creator   | Standout Feature
Phi-4        | Microsoft | Small but mighty (14B rivals 70B models)
Command-R+   | Cohere    | Optimized for RAG & enterprise search
Yi-Lightning | 01.AI     | Chinese-English bilingual excellence
🐳 Edge Container: Run Gemma 4 in Docker with Ollama. Naming the container keeps the exec step unambiguous even when other containers are running:

docker run -d --gpus all -p 11434:11434 --name ollama ollama/ollama
docker exec -it ollama ollama run gemma4:4b
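Once the container is running, Ollama exposes an HTTP API on port 11434. A standard-library sketch against its `/api/generate` endpoint (the endpoint and `stream` flag are standard Ollama; the `gemma4:4b` tag is the lesson's assumed model name):

```python
import json
import urllib.request

def build_request(prompt: str, model: str = "gemma4:4b",
                  host: str = "http://localhost:11434") -> urllib.request.Request:
    """Build a non-streaming request for Ollama's /api/generate endpoint."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        f"{host}/api/generate",
        data=payload.encode(),
        headers={"Content-Type": "application/json"},
    )

def generate(prompt: str, **kw) -> str:
    """Send the request and return the model's response text."""
    with urllib.request.urlopen(build_request(prompt, **kw)) as resp:
        return json.loads(resp.read())["response"]

# With the container running:
# print(generate("Summarize MoE routing in one sentence."))
```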
KNOWLEDGE CHECK
QUERY 1 // 2
What is unique about Qwen 3.6's Thinking/Non-Thinking modes?
  • Different model files
  • Toggleable deep reasoning vs fast responses in one model
  • Only works in Chinese
  • Requires special hardware
The Global Model Families | Qwen, Gemma & Others — Open Source AI Academy