The Qwen 3.6 family offers both dense and MoE architectures:
| Model | Type | Active Params | Best For |
|---|---|---|---|
| Qwen3.6-27B | Dense | 27B | Consistent high performance |
| Qwen3.6-35B-A3B | MoE | 3B of 35B | Ultra-efficient inference |
Key feature: Thinking/Non-Thinking modes — switch between deep reasoning and fast responses in a single model.
Apache 2.0, edge-optimized with native multimodality:
| Family | Creator | Standout Feature |
|---|---|---|
| Phi-4 | Microsoft | Small but mighty (14B rivals 70B models) |
| Command-R+ | Cohere | Optimized for RAG & enterprise search |
| Yi-Lightning | 01.AI | Chinese-English bilingual excellence |
docker run -d --gpus all -p 11434:11434 ollama/ollama && docker exec -it $(docker ps -q) ollama run gemma4:4b