BLaST: High Performance Inference and Pretraining using BLock Sparse Transformers Paper • 2507.03117 • Published Jul 3, 2025
Virgo: A Preliminary Exploration on Reproducing o1-like MLLM Paper • 2501.01904 • Published Jan 3, 2025
BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Competitive Large Language Model Baseline Paper • 2408.15079 • Published Aug 27, 2024