Besides updates to our 14B and 70B, we have a new LFM2-based 1.2B, Llama 3.2-based 3B, and Qwen 3-based 8B, all with class-leading Japanese language capabilities.
Per usual, lots of details in the Model Cards for those interested.
Today, we announce Mistral 3, the next generation of Mistral models. Mistral 3 includes three state-of-the-art small, dense models (14B, 8B, and 3B) and Mistral Large 3 – our most capable model to date – a sparse mixture-of-experts trained with 41B active and 675B total parameters.
All models are released under the Apache 2.0 license.
hey hey, i'm just waiting for @blanchon and the Liquid AI team to reach out to me. it's been two months of radio silence, so we're still waiting on the "budget" to really start this project
🚀 We're excited to support the ERNIE AI Developer Challenge!
Fine-tune ERNIE with LLaMA-Factory and compete for $3,000 prizes by building the most impactful model — with submissions reviewed by the core developers of LLaMA-Factory.
Implemented a proof-of-concept sampler in pure PyTorch and Transformers.
Max P is a dynamic token filter that applies Winsorization to cap the probabilities of top tokens. Specifically, a base probability in the range [0, 1] caps each individual token's probability; the sampler then redistributes the excess mass proportionally.
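A minimal sketch of one plausible reading of that description, in pure PyTorch (the function name `max_p_filter` and the proportional-redistribution rule are my assumptions, not the post's actual implementation):

```python
import torch

def max_p_filter(logits: torch.Tensor, base_p: float = 0.3) -> torch.Tensor:
    """Hypothetical Max P sketch: Winsorize the softmax distribution at
    `base_p`, then redistribute the clipped excess proportionally across
    the capped distribution so it still sums to 1."""
    probs = torch.softmax(logits, dim=-1)
    capped = probs.clamp(max=base_p)                      # cap top tokens
    excess = (probs - capped).sum(dim=-1, keepdim=True)   # mass removed by the cap
    return capped + excess * capped / capped.sum(dim=-1, keepdim=True)

# Sample one token from the filtered distribution
logits = torch.tensor([[6.0, 2.0, 1.0, 0.5]])
token = torch.multinomial(max_p_filter(logits, base_p=0.4), num_samples=1)
```

With this reading, a dominant token's share shrinks toward the cap while tail tokens gain proportionally, flattening overconfident distributions without hard truncation.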
I’m just reading that the Ryzen AI 395 is supposed to be 30% slower than the DGX Spark at LLM inference, and limited to 96GB of GPU RAM… good thing I hadn’t RTFM upfront, so I made the AMD faster with 128GB of unified RAM 🫡 The Z2 mini G1a can run Qwen3 Coder 30B in BF16 at 26.8 tok/sec in ~60GB of GPU RAM
🚀 New blog: Maintain the unmaintainable – 1M+ Python LOC, 400+ models
How do you stop a million-line library built by thousands of contributors from collapsing under its own weight? At 🤗 Transformers, we do it with explicit software-engineering tenets, principles that make the codebase hackable at scale.
🔍 Inside the post:
– One Model, One File: readability first — you can still open a modeling file and see the full logic, top to bottom.
– Modular Transformers: visible inheritance that cuts maintenance cost by ~15× while keeping models readable.
– Config-Driven Performance: FlashAttention, tensor parallelism, and attention scheduling are config-level features, not rewrites.
Written with @lysandre, @pcuenq, and @yonigozlan, this is a deep dive into how Transformers stays fast, open, and maintainable.
🖤 Probably one of my favorite projects that I've worked on so far, introducing Новояз (Novoyaz).
🛠 One of the first acts of the Bolshevik government after the Russian Revolution was the reform and standardization of the Russian language, which at the time had a non-standard and challenging orthography.
📚 Upon its reform the government launched a nationwide campaign called Ликбез (Likbez), which sought to improve literacy in the country (by the way, it worked, bringing the national literacy rate from <20% in the 1920s to >80% by the 1930s).
‼ While this is a remarkable result that should absolutely be celebrated, it's one that has left behind literally hundreds of thousands if not millions of artifacts using pre-reform Russian orthography.
😓 Researchers and historians are working tirelessly to translate these artifacts into modern Russian so that they may be archived and studied, but many have told me that they are doing this BY HAND (!).
💡 I thought, well, this is a perfect use case for OCR and a fine-tuned LLM to step in and aid this important work!
🎮 Live Model Demo: Upload an Android screenshot and instructions to see the model in action! Tonic/l-operator-demo
Built in a garage, funded by pre-orders, no VC. Now we’re scaling to 1k installer units.
We’re giving 50 limited-edition prototypes to investors, installers, and researchers who want to co-design the sovereign smart home.
👇 Drop “EUSKERA” in the comments if you want an invite, tag a friend who still thinks Alexa is “convenient,” and smash ♥️ if AI should belong to people, not servers.
Tremendous quality of life upgrade on the Hugging Face Hub - we now have auto-complete emojis 🤗 🥳 👏 🙌 🎉
Get ready for lots more very serious analysis on a whole range of topics from yours truly now that we have unlocked this full range of expression 😄 🤔 🗣 🙊