Preconditioned optimizers for MoE training at scale, with out-of-the-box MuP support and FSDP support for Muon, built on top of Megatron-LM and TransformerEngine.
An open infrastructure to democratize and decentralize the development of superintelligence for humanity, now with support for training across heterogeneous devices in tandem.
From-scratch Python implementations of major papers on Gaussian Process regression models, notably including Stochastic Variational Gaussian Processes.