All HF Hub posts

MonsterMMORPG 
posted an update 2 days ago
Wan 2.2 Complete Training Tutorial - Text to Image, Text to Video, Image to Video, Windows & Cloud : https://youtu.be/ocEkhAsPOs4

Wan 2.2 training is now very easy. I have run over 64 unique Wan 2.2 trainings to prepare the best working training configurations for you. The configurations work fully locally on GPUs with as little as 6 GB of VRAM, so you will be able to train your Wan 2.2 image or video generation LoRAs on your Windows computer with ease. I have also shown how to train on the cloud platforms RunPod and Massed Compute, so even if you have no GPU, or you simply want faster training, you can train in the cloud very cheaply and fully privately.

Full step by step tutorial : https://youtu.be/ocEkhAsPOs4

⏱️ Video Chapters:

0:00 Introduction to Wan 2.2 Training & Capabilities
0:56 Installing & Updating Musubi Tuner Locally
2:20 Explanation of Optimized Presets & Research Logic
4:00 Differences Between T2I, T2V, and I2V Configs
5:36 Extracting Files & Running Update Batch File
6:14 Downloading Wan 2.2 Training Models via Script
7:30 Loading Configs: Selecting GPU & VRAM Options
9:33 Using nvitop to Monitor RAM & VRAM Usage
10:28 Preparing Image Dataset & Trigger Words
11:17 Generating Dataset Config & Resolution Logic
12:55 Calculating Epochs & Checkpoint Save Frequency
13:40 Troubleshooting: Fixing Missing VAE Path Error
15:12 VRAM Cache Behavior & Training Speed Analysis
15:51 Trade-offs: Learning Rate vs Resolution vs Epochs
16:29 Installing SwarmUI & Updating ComfyUI Backend
18:13 Importing Latest Presets into SwarmUI
19:25 Downloading Inference Models via Script
20:33 Generating Images with Trained Low Noise LoRA
22:22 Upscaling Workflow for High-Fidelity Results
24:15 Increasing Base Resolution to 1280x1280
27:26 Text-to-Video Generation with Lightning LoRA
30:12 Image-to-Video Generation Workflow & Settings
31:35 Restarting Backend to Clear VRAM for Model Switching
33:45 Fixing RAM Crashes with Cache-None Argument
....
John1604 
posted an update 3 days ago

I'm about to reach my public storage limit. I've discovered that my repository John1604/Kimi-K2-Thinking-q6K-gguf isn't getting enough downloads, and it consumes nearly 1 TB of storage. While I love Kimi K2's way of thinking, I may have to delete this model, even though it's a truly open-source 1T-parameter LLM, comparable to any cutting-edge LLM. In the AI race, four US companies have 1T+ parameter models: xAI, OpenAI, Google, and Anthropic. China also has four companies with 1T+ parameter models: Alibaba, Kimi, DeepSeek, and GLM. Currently, the two sides are evenly matched; only the American and Chinese teams have LLMs with 1T+ parameters. Let's cheer for them to reach AGI in the next 5 to 10 years. Maybe a 64T-parameter Chinese model will do it: the human brain has roughly 64 times as many neurons as a cat's.
dhruv3006 
posted an update 1 day ago
Editor-Neutral, Tool-Neutral API Artifacts

One thing we hear from developers: API docs and files often get stuck inside specific editors or tools. That friction slows teams down, especially when people use different setups.

At Voiden, we believe your API artifacts should work anywhere your team does. Our files open seamlessly in VS Code, GitHub, custom Electron clients, and more, without being locked into a specific workspace or tool.

Key benefits:

- Portability across editors, repos, platforms, and team setups
- No proprietary workspace: the repo is the workspace
- Easy integration with CI pipelines, linters, and future tools
- Future-proof your API workflows with open and flexible artifacts

Voiden empowers devs and teams to collaborate and stay agile as tools and platforms evolve.

Check out what's different about Voiden here: https://voiden.md/features


telcom 
posted an update 2 days ago
arXiv CS endorsement

This is Javad; here is my Google Scholar profile:
https://scholar.google.com/citations?user=bja6GwoAAAAJ&hl=en
I would like to share my articles with you on Hugging Face, and I'm asking for an endorsement* for Computer Science on arxiv.org.

If you would like to endorse me, please visit the following URL:
https://arxiv.org/auth/endorse?x=NVUAPL
If that URL does not work for you, please visit
http://arxiv.org/auth/endorse.php
and enter the following six-digit alphanumeric string:
Endorsement Code: NVUAPL

Thank you in advance.
Javad Taghia

* Who is qualified to endorse?

To endorse another user to submit to the cs.AI (Artificial Intelligence) subject class, an arXiv submitter must have submitted 3 papers to any of cs.AI, cs.AR, cs.CC, cs.CE, cs.CG, cs.CL, cs.CR, cs.CV, cs.CY, cs.DB, cs.DC, cs.DL, cs.DM, cs.DS, cs.ET, cs.FL, cs.GL, cs.GR, cs.GT, cs.HC, cs.IR, cs.IT, cs.LG, cs.LO, cs.MA, cs.MM, cs.MS, cs.NA, cs.NE, cs.NI, cs.OH, cs.OS, cs.PF, cs.PL, cs.RO, cs.SC, cs.SD, cs.SE, cs.SI or cs.SY earlier than three months ago and less than five years ago.

davidmezzetti 
posted an update about 24 hours ago
🧬⚕️🔬 Encoding the World's Medical Knowledge into 970K Parameters! We're excited to release this new series of vector embedding models for medical literature, based on our recent BERT Hash work.

And you read that right: we're talking 970,000 parameters for a surprisingly strong-performing model. Enjoy!

https://huggingface.co/blog/neuml/biomedbert-hash-nano
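
For anyone curious how an embeddings model like this is typically used, here is a minimal retrieval sketch with sentence-transformers. The model id is guessed from the blog URL and may differ from the actual release, so treat it as an assumption and check the post above for the real identifier:

```python
# Illustrative only: embedding medical text with a small sentence-embedding model.
# The model id below is an assumption derived from the blog URL, not a confirmed release name.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("neuml/biomedbert-hash-nano")  # placeholder / assumed id

docs = [
    "Metformin is a first-line treatment for type 2 diabetes.",
    "Aspirin irreversibly inhibits cyclooxygenase enzymes.",
]
query = "first-line therapy for type 2 diabetes"

doc_emb = model.encode(docs, normalize_embeddings=True)
query_emb = model.encode(query, normalize_embeddings=True)

# Cosine similarity should rank the diabetes sentence first.
print(util.cos_sim(query_emb, doc_emb))
```
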
Parveshiiii 
posted an update 1 day ago
Hey everyone!
We’re excited to introduce our new Telegram group: https://t.me/XenArcAI

This space is built for **model builders, tech enthusiasts, and developers** who want to learn, share, and grow together. Whether you’re just starting out or already deep into AI/ML, you’ll find a supportive community ready to help with knowledge, ideas, and collaboration.

💡 Join us to:
- Connect with fellow developers and AI enthusiasts
- Share your projects, insights, and questions
- Learn from others and contribute to a growing knowledge base

👉 If you’re interested, hop in and be part of the conversation: https://t.me/XenArcAI
Kseniase 
posted an update 2 days ago
From Prompt Engineering to Context Engineering: Main Design Patterns

Earlier on, we relied on clever prompt wording, but now structured, complete context matters more than magic phrasing. The next year is going to be the year of context engineering, which expands beyond prompt engineering. The two complement each other: prompt engineering shapes how we ask, while context engineering shapes what the model knows, sees, and can do.

To keep things clear, here are the main techniques and design patterns in both areas, with some useful resources for further exploration:

▪️ 9 Prompt Engineering Techniques (configuring input text)

1. Zero-shot prompting – giving a single instruction without examples. Relies entirely on pretrained knowledge.

2. Few-shot prompting – adding input–output examples to encourage the model to show the desired behavior. ⟶ https://arxiv.org/abs/2005.14165

3. Role prompting – assigning a persona or role (e.g. "You are a senior researcher," "Say it as a specialist in healthcare") to shape style and reasoning. ⟶ https://arxiv.org/abs/2403.02756

4. Instruction-based prompting – explicit constraints or guidance, like "think step by step," "use bullet points," "answer in 10 words"

5. Chain-of-Thought (CoT) – encouraging intermediate reasoning traces to improve multi-step reasoning. It can be explicit ("let’s think step by step"), or implicit (demonstrated via examples). ⟶ https://arxiv.org/abs/2201.11903

6. Tree-of-Thought (ToT) – the model explores multiple reasoning paths in parallel, like branches of a tree, instead of following a single chain of thought. ⟶ https://arxiv.org/abs/2305.10601

7. Reasoning–action prompting (ReAct-style) – prompting the model to interleave reasoning steps with explicit actions and observations. It defines action slots and lets the model generate a sequence of "Thought → Action → Observation" steps. ⟶ https://arxiv.org/abs/2210.03629
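
To make the ReAct pattern concrete, here is a minimal, framework-free sketch of the "Thought → Action → Observation" loop. The stubbed model and the single lookup tool are illustrative assumptions, not code from the ReAct paper or the post above:

```python
# Toy ReAct loop: the "model" and the tool are stubs for illustration only.

def fake_model(prompt: str) -> str:
    # Stand-in for an LLM call; a real system would call an actual model here.
    if "Observation: Paris" in prompt:
        return "Thought: I now know the answer.\nFinal Answer: Paris"
    return "Thought: I should look up the capital.\nAction: lookup[capital of France]"

def lookup(query: str) -> str:
    # Toy tool; a real agent might call a search API instead.
    return {"capital of France": "Paris"}.get(query, "unknown")

def react(question: str, max_steps: int = 5) -> str:
    prompt = f"Question: {question}\n"
    for _ in range(max_steps):
        output = fake_model(prompt)
        prompt += output + "\n"
        if "Final Answer:" in output:
            return output.split("Final Answer:")[-1].strip()
        if "Action: lookup[" in output:
            query = output.split("Action: lookup[")[-1].rstrip("]").strip()
            prompt += f"Observation: {lookup(query)}\n"
    return "No answer within step budget"

print(react("What is the capital of France?"))  # -> Paris
```
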

Read further ⬇️
Also subscribe to Turing Post: https://www.turingpost.com/subscribe
DawnC 
posted an update 2 days ago
PawMatchAI — Smarter, Safer, and More Thoughtful Recommendations 🐕✨

🐾 Recommendation system update — deeper reasoning, safer decisions
Over the past weeks, user feedback led me to rethink how PawMatchAI handles description-based breed recommendations. Instead of only matching surface-level preferences, the system now implements a multi-dimensional semantic reasoning architecture that emphasizes real-life compatibility and risk awareness.

Key technical improvements:
- SBERT-powered semantic understanding with dynamic weight allocation across six constraint dimensions (space, activity, noise, grooming, experience, family)

- Hierarchical constraint management distinguishing critical safety constraints from flexible preferences, with progressive relaxation when needed

- Multi-head scoring system combining semantic matching (15%), lifestyle compatibility (70%), constraint adherence (10%), and confidence calibration (5%) (see the sketch after this list)

- Intelligent risk filtering that applies graduated penalties (-10% to -40%) for genuine incompatibilities while preserving user choice
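
Below is a simplified sketch of how such a multi-head combination with graduated penalties could look. The weights mirror the percentages above, but the function and penalty mapping are illustrative assumptions, not the actual PawMatchAI code:

```python
# Simplified sketch of a multi-head score combination with graduated risk penalties.
# Weights follow the percentages above; everything else is an illustrative assumption.

HEAD_WEIGHTS = {
    "semantic_match": 0.15,
    "lifestyle_compatibility": 0.70,
    "constraint_adherence": 0.10,
    "confidence_calibration": 0.05,
}

def combine_scores(head_scores: dict[str, float], risk_level: int = 0) -> float:
    """head_scores values in [0, 1]; risk_level 0-4 maps to a 0% to -40% penalty."""
    base = sum(HEAD_WEIGHTS[name] * head_scores[name] for name in HEAD_WEIGHTS)
    penalty = min(max(risk_level, 0), 4) * 0.10  # graduated -10% steps, capped at -40%
    return max(base * (1.0 - penalty), 0.0)

# Example: a strong lifestyle match with one moderate incompatibility flag.
example = {
    "semantic_match": 0.8,
    "lifestyle_compatibility": 0.9,
    "constraint_adherence": 0.7,
    "confidence_calibration": 0.6,
}
print(round(combine_scores(example, risk_level=2), 3))  # 0.85 base score, 20% penalty -> 0.68
```
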

The goal: 👉 Not just dogs that sound good on paper, but breeds people will actually thrive with long-term.

What's improved?
- 🎯 Clearer separation of must-have safety constraints versus flexible preferences
- 🧠 Bidirectional semantic matching evaluating compatibility from both user and breed perspectives
- 🔍 Context-aware prioritization where critical factors (safety, space, noise) automatically receive higher weighting

What's next?
- 🐕 Expanding behavioral and temperament analysis dimensions
- 🐾 Extension to additional species with transfer learning
- 📱 Mobile-optimized deployment for easier access
- 🧩 Enhanced explainability showing why specific breeds are recommended

👉 Try PawMatchAI: DawnC/PawMatchAI

#AIProduct #SBERT #RecommendationSystems #DeepLearning #MachineLearning #NLP
Jiaqi-hkust 
posted an update about 14 hours ago
We have open-sourced Robust-R1 (AAAI 2026 Oral), a new paradigm for degradation resistance and robustness enhancement in multimodal large language models.

Multimodal Large Language Models struggle to maintain reliable performance under extreme real-world visual degradations, which impede their practical robustness. Existing robust MLLMs predominantly rely on implicit training/adaptation that focuses solely on visual encoder generalization, suffering from limited interpretability and isolated optimization. To overcome these limitations, we propose Robust-R1, a novel framework that explicitly models visual degradations through structured reasoning chains. Our approach integrates: (i) supervised fine-tuning for degradation-aware reasoning foundations, (ii) reward-driven alignment for accurately perceiving degradation parameters, and (iii) dynamic reasoning depth scaling adapted to degradation intensity. To facilitate this approach, we introduce a specialized 11K dataset featuring realistic degradations synthesized across four critical real-world visual processing stages, each annotated with structured chains connecting degradation parameters, perceptual influence, pristine semantic reasoning chain, and conclusion. Comprehensive evaluations demonstrate state-of-the-art robustness: Robust-R1 outperforms all general and robust baselines on the real-world degradation benchmark R-Bench, while maintaining superior anti-degradation performance under multi-intensity adversarial degradations on MMMB, MMStar, and RealWorldQA.
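
As a toy illustration of the dynamic reasoning depth scaling idea (this is not code from the Robust-R1 release; the thresholds and budgets below are made-up assumptions), the reasoning budget could simply grow with estimated degradation intensity:

```python
# Toy illustration only: scale a reasoning-step budget with degradation intensity.
# Values are illustrative assumptions, not Robust-R1's actual configuration.

def reasoning_depth(degradation_intensity: float,
                    min_steps: int = 2, max_steps: int = 8) -> int:
    """Map an intensity in [0, 1] (e.g. estimated blur/noise level) to a step budget."""
    intensity = min(max(degradation_intensity, 0.0), 1.0)
    return round(min_steps + intensity * (max_steps - min_steps))

for intensity in (0.0, 0.3, 0.7, 1.0):
    print(intensity, "->", reasoning_depth(intensity), "reasoning steps")
```
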

We have made our paper, code, data, model weights, and demo fully open source:
Paper: Robust-R1: Degradation-Aware Reasoning for Robust Visual Understanding (2512.17532) (help us by upvoting it)
GitHub code: https://github.com/jqtangust/Robust-R1 (help us by starring it)
HF model: https://huggingface.co/Jiaqi-hkust/Robust-R1
HF data: Jiaqi-hkust/Robust-R1
HF Space: Jiaqi-hkust/Robust-R1

We sincerely invite everyone to give it a try.

MikeDoes 
posted an update 1 day ago
How do you protect your prompts without breaking them? You need a smart sanitizer. A new system called Prϵϵmpt shows how.

The first critical step in their solution is a high-performance Named Entity Recognition (NER) model that finds the sensitive data. We're proud to see that these researchers, Amrita Roy Chowdhury, David Glukhov, Divyam Anshumaan, Prasad Chalasani, Nicolas Papernot, Somesh Jha, and Mihir Bellare, from the University of Michigan, University of Toronto, University of Wisconsin-Madison, University of California San Diego (Rady School of Management), and Langroid Incorporated, fine-tuned their NER model on 10 high-risk categories from the AI4Privacy dataset.

This is a perfect win-win. Our open-source data helps provide the foundation for the critical detection engine, which in turn enables the community to build and test better solutions like Prϵϵmpt's innovative use of encryption and Differential Privacy.
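
As a rough illustration of that first detection step (not Prϵϵmpt's actual implementation), a prompt can be masked with an off-the-shelf Hugging Face token-classification pipeline. The model below is a generic public NER checkpoint, not the paper's fine-tuned AI4Privacy model, and the masking scheme is an assumption; Prϵϵmpt additionally applies encryption and differential privacy downstream:

```python
# Minimal prompt-sanitization sketch: detect entities with an NER pipeline and mask
# them before the prompt leaves your machine. Illustrative only; not Prϵϵmpt's code.
from transformers import pipeline

# Generic public NER model (assumption): swap in a PII-focused model of your choice.
ner = pipeline("token-classification",
               model="dslim/bert-base-NER",
               aggregation_strategy="simple")

def sanitize(prompt: str, threshold: float = 0.5) -> str:
    entities = ner(prompt)
    # Replace detected spans from the end so character offsets stay valid.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        if ent["score"] >= threshold:
            prompt = prompt[:ent["start"]] + f"[{ent['entity_group']}]" + prompt[ent["end"]:]
    return prompt

print(sanitize("Draft an email to John Smith at Acme Corp in Berlin."))
# e.g. "Draft an email to [PER] at [ORG] in [LOC]."
```
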

🔗 Check out their paper for a deep dive into a formally private, high-utility prompt sanitizer: https://arxiv.org/pdf/2504.05147

#OpenSource
#DataPrivacy
#LLM
#Anonymization
#AIsecurity
#HuggingFace
#Ai4Privacy
#Worldslargestopensourceprivacymaskingdataset