General
Recommended distributions
Key setup steps
1. Install NVIDIA drivers and CUDA toolkit or AMD ROCm stack
2. Install Docker and the NVIDIA Container Toolkit if using GPU containers
3. Use Ollama, llama.cpp, vLLM, or TGI for serving
4. Configure systemd services for persistent local endpoints
Why Linux?
Best driver support, lowest overhead, widest range of inference engines, and straightforward headless server deployment.
