GPU Resurrection
Everyone said the Radeon Pro 575 was dead for LLMs. Metal crashes on discrete AMD GPUs. ROCm is Linux-only. Ollama can't see it. So I compiled llama.cpp against the LunarG Vulkan SDK, which runs Vulkan on macOS through MoltenVK (the layer nobody tests), and the GPU appeared. E4B went from 7.3 tok/s on CPU to 37.6 tok/s on Vulkan: about a 5.2× speedup from hardware everyone wrote off.
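As a quick sanity check on that number, the speedup is just the ratio of the two throughput figures quoted above. A minimal Python sketch (the function name `speedup` is made up for illustration; only the two tok/s values come from the measurements):

```python
def speedup(baseline_tok_s: float, accelerated_tok_s: float) -> float:
    """Return how many times faster the accelerated run is than the baseline."""
    if baseline_tok_s <= 0:
        raise ValueError("baseline throughput must be positive")
    return accelerated_tok_s / baseline_tok_s


# Figures quoted above: 7.3 tok/s on CPU, 37.6 tok/s on Vulkan.
ratio = speedup(7.3, 37.6)
print(f"{ratio:.2f}x")  # prints 5.15x
```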