All the articles with the tag "others".
This blog recounts my testing on the impact of modern LLM optimizations on ViT-based models, such as SwiGLU, removing bias, and the Muon optimizer.
February 19, 2026