Hugging Face is the GitHub of machine learning: a platform hosting 2M+ models, 500K+ datasets, and 1M+ Spaces.
| Component | Purpose |
|---|---|
| Model Hub | Discover, download, and share model weights |
| Datasets | Pre-processed training and evaluation datasets |
| Spaces | Deploy Gradio/Streamlit demos with free GPUs (ZeroGPU) |
| Transformers v5 | PyTorch-first library for loading & running models |
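The Hub also exposes a public REST API, so you can script model discovery without any SDK. A minimal stdlib-only sketch, assuming the Hub's `/api/models/{repo_id}` route (`model_info_url` and `fetch_model_info` are illustrative helper names introduced here, not library functions):

```python
import json
import urllib.request

HUB = "https://huggingface.co"

def model_info_url(repo_id: str) -> str:
    """Build the Hub REST API URL for a model repo's metadata."""
    return f"{HUB}/api/models/{repo_id}"

def fetch_model_info(repo_id: str) -> dict:
    """GET a model's metadata (tags, downloads, file listing) as JSON."""
    with urllib.request.urlopen(model_info_url(repo_id)) as resp:
        return json.load(resp)

# Example (requires network access):
# info = fetch_model_info("mistralai/Mistral-Small-4")
# print(info["downloads"])
```

In practice, the official `huggingface_hub` Python package wraps this same API (e.g. `HfApi.model_info`, `list_models`) with authentication and pagination handled for you.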
Loading and running a model takes only a few lines of Python:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-Small-4",
    torch_dtype="auto",   # use the dtype stored in the checkpoint
    device_map="auto",    # automatic GPU/CPU placement
)
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-Small-4")

inputs = tokenizer("Explain quantum computing", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
For production serving, a single Docker command runs the same model behind an HTTP API with Text Generation Inference (TGI):

```bash
docker run --gpus all -p 8080:80 -v ./data:/data \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id mistralai/Mistral-Small-4
```
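Once the container is up, any HTTP client can call TGI's `/generate` route. A stdlib-only sketch, assuming the default port mapping above (`build_payload` and `generate` are illustrative helper names, not part of TGI):

```python
import json
import urllib.request

def build_payload(prompt: str, max_new_tokens: int = 256) -> dict:
    """Request body for TGI's /generate endpoint."""
    return {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}

def generate(prompt: str, base_url: str = "http://localhost:8080") -> str:
    """POST a prompt to a running TGI server and return the completion."""
    req = urllib.request.Request(
        f"{base_url}/generate",
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["generated_text"]

# Example (requires the container above to be running):
# print(generate("Explain quantum computing"))
```

TGI also exposes an OpenAI-compatible `/v1/chat/completions` route, so existing OpenAI client code can usually be pointed at it by changing only the base URL.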