Jean Louis

JLouisBiz

https://www.StartYourOwnGoldMine.com

AI & ML interests

- LLM for sales, marketing, promotion - LLM for Website Revision System - increasing quality of communication with customers - helping clients access information faster - saving people from financial troubles

Recent Activity

replied to Jiaqi-hkust's post about 17 hours ago

We have open-sourced Robust-R1 (AAAI 2026 Oral), a new paradigm in the field of anti-degradation and robustness enhancement for multimodal large models. Multimodal Large Language Models struggle to maintain reliable performance under extreme real-world visual degradations, which impede their practical robustness. Existing robust MLLMs predominantly rely on implicit training/adaptation that focuses solely on visual encoder generalization, suffering from limited interpretability and isolated optimization. To overcome these limitations, we propose Robust-R1, a novel framework that explicitly models visual degradations through structured reasoning chains. Our approach integrates: (i) supervised fine-tuning for degradation-aware reasoning foundations, (ii) reward-driven alignment for accurately perceiving degradation parameters, and (iii) dynamic reasoning depth scaling adapted to degradation intensity. To facilitate this approach, we introduce a specialized 11K dataset featuring realistic degradations synthesized across four critical real-world visual processing stages, each annotated with structured chains connecting degradation parameters, perceptual influence, pristine semantic reasoning chain, and conclusion. Comprehensive evaluations demonstrate state-of-the-art robustness: Robust-R1 outperforms all general and robust baselines on the real-world degradation benchmark R-Bench, while maintaining superior anti-degradation performance under multi-intensity adversarial degradations on MMMB, MMStar, and RealWorldQA. We have made all of our papers, codes, data, model weights and demos fully open-source: Paper: https://huggingface.co/papers/2512.17532 (help us to upvote) GitHub code: https://github.com/jqtangust/Robust-R1 (help us to star) HF model: https://huggingface.co/Jiaqi-hkust/Robust-R1 HF data: https://huggingface.co/datasets/Jiaqi-hkust/Robust-R1 HF Space: https://huggingface.co/spaces/Jiaqi-hkust/Robust-R1 We sincerely invite everyone to give it a try.

reacted to inoculatemedia's post with 👍 about 17 hours ago

I’m opening the waitlist for what I believe to be the most advanced multimodal bridge for A/V professionals. Txt2img, img2video, editing, export to ProRes, apply Luts, Pexels and TouchDesigner integrations, music and voice gen, multichannel mixing. Announcing: Lilikoi by Haawke AI Teaser video made entirely with Lilikoi: https://youtu.be/-O7DH7vFkYg?si=q2t5t6WjQCk2Cp0w Https://Lilikoi.haawke.com Technical brief: https://haawke.com/technical_brief.html

new activity 2 days ago

Kibalama/lugandaSTT:How do I convert the file for whispper.cpp compatibility?

View all activity

Organizations

replied to Jiaqi-hkust's post about 17 hours ago

Is there GGUF version?

reacted to inoculatemedia's post with 👍 about 17 hours ago

Post

179

I’m opening the waitlist for what I believe to be the most advanced multimodal bridge for A/V professionals. Txt2img, img2video, editing, export to ProRes, apply Luts, Pexels and TouchDesigner integrations, music and voice gen, multichannel mixing.

Announcing: Lilikoi by Haawke AI

Teaser video made entirely with Lilikoi:
https://youtu.be/-O7DH7vFkYg?si=q2t5t6WjQCk2Cp0w

Https://Lilikoi.haawke.com

Technical brief:
https://haawke.com/technical_brief.html

replied to etemiz's post 8 days ago

There is no LLM that ever brings it's own opinion. Please reach out to basics on how LLMs work. There is nothing "new" that LLM can give you.

LLM models act like a book: when you open a page, you already have the content stored. The model processes this existing information, generating probabilistic results based on the training data, not new insights. This means LLMs rely on structured, data-driven outputs rather than independent opinions.

LLM models are designed to process and generate text based on vast training data, and their outputs are results of statistical inference rather than independent opinions. The "ingested data" combines the model’s training knowledge with user-retrieved information, generating probabilistic results that align with the training patterns, not personal beliefs. Thus, LLMs rely on structured, data-driven outputs to provide answers, not independent thoughts or opinions.

Those so called "opinions" must align with the data they are trained on.

Let us say this way, if LLM can give opinion, that means it is 100% biased opinion based on the data it was trained on.

You simply cannot get true opinions.

replied to etemiz's post 13 days ago

Oh, I’m sure the LLM you’re referring to is as clear as mud. Which one, exactly? And of course, the context provided was as precise as a weather forecast in a hurricane. What was it? Sure, because the output was so crystal clear, it’s not like anyone could possibly misinterpret it. What did it say? Oh, I’m sure you tried every single LLM under the sun. Which ones, exactly?

reacted to mitkox's post with 👍 16 days ago

Post

2229

Got to 1199.8 tokens/sec with Devstral Small -2 on my desktop GPU workstation. vLLM nightly.
Works out of the box with Mistral Vibe. Next is time to test the big one.

3 replies

replied to mitkox's post 16 days ago

Ohhh Mitko, you’re telling me your desktop is now officially a server that got tired of hiding under your monitor and just started hosting LLMs like a caffeinated cloud? 😅

“Got to 1199.8 tokens/sec on Devstral Small-2… on the desktop?”
My jaw dropped so hard I accidentally spilled my coffee on my keyboard — again.
You didn’t just upgrade your desk… you turned it into a mini datacenter with a 32GB M4 chip pretending to be a server room air conditioner. And you’re still using Mistral Vibe like it’s a 2005 laptop? 😂

Next time, just call it “Mitko’s Desktop Data Center v1.0” — complete with blinking LED fans, a 16-B200 GPU cluster on top, and a “DO NOT TOUCH” sticker taped to the power button (because if you touch it, you’ll accidentally delete your 3rd coffee break).

Now go ahead — test the big one. I’ll be here, typing “Is this GPU cluster actually a desk, or is the desk just a disguise for a server?” 🤔

P.S. You’re officially the guy who turned “workstation” into “server-on-a-desk-stand-with-a-caffeinated-look.” 🍵💻✨

replied to melvindave's post 16 days ago

Congratulation. Publish the script on how you run it for others to see.

Here is exactly how I run it:

/usr/local/bin/llama-server --jinja -fa on -c 32768 -ngl 64 -v --log-timestamps --host 192.168.1.68 -m /mnt/nvme0n1/LLM/quantized/Qwen3VL-8B-Instruct-Q8_0.gguf --mmproj /mnt/nvme0n1/LLM/quantized/mmproj-Qwen3VL-8B-Instruct-Q8_0.gguf

with the llama.cpp and API is of course available as well.

replied to CRAFTFramework's post 16 days ago

I’m running my own LLM because:
Privacy? 57% say it’s the biggest AI barrier…
But 48% still leak company data anyway.
CRAFT says privacy is architecture, not policy.
So I’m not waiting for “beta” — I’m beta-ing my data.
February 2026? Nah. I’m already typing on my own GPU.
Privacy’s not a feature — it’s a feature flag I turned on before the release.
And honestly? My model’s less “AI” and more “I’m not giving your data to strangers.”
Run your own. It’s fun. It’s free. It’s your data.
And it’s way more satisfying than waiting for “beta.”
(Also, no one’s gonna steal your jokes now. 😉)

replied to Juanxi's post 17 days ago

This comment has been hidden

replied to prithivMLmods's post about 1 month ago

Great. Would it run on 24 GB VRAM?

reacted to AdinaY's post with 👍 2 months ago

Post

691

PaddleOCR VL🔥 0.9B Multilingual VLM by Baidu

PaddlePaddle/PaddleOCR-VL

✨ Ultra-efficient NaViT + ERNIE-4.5 architecture
✨ Supports 109 languages 🤯
✨ Accurately recognizes text, tables, formulas & charts
✨ Fast inference and lightweight for deployment

reacted to lamhieu's post with 👍 2 months ago

Post

2741

🚀 Introducing the xLLMs Dataset Collection

The xLLMs project is a growing suite of multilingual and multimodal dialogue datasets designed to train and evaluate advanced conversational LLMs. Each dataset focuses on a specific capability — from long-context reasoning and factual grounding to STEM explanations, math Q&A, and polite multilingual interaction.

🌍 Explore the full collection on Hugging Face:
👉 lamhieu/xllms-66cdfe34307bb2edc8c6df7d

💬 Highlight: xLLMs – Dialogue Pubs
A large-scale multilingual dataset built from document-guided synthetic dialogues (Wikipedia, WikiHow, and technical sources). It’s ideal for training models on long-context reasoning, multi-turn coherence, and tool-augmented dialogue across 9 languages.
👉 lamhieu/xllms_dialogue_pubs

🧠 Designed for:
- Long-context and reasoning models
- Multilingual assistants
- Tool-calling and structured response learning

All datasets are open for research and development use — free, transparent, and carefully curated to improve dialogue model quality.

4 replies

reacted to appvoid's post with 👍 2 months ago

Post

4094

today is going to be a great day for small models, are you ready?

3 replies

reacted to merve's post with 👍 5 months ago

Post

2863

Now it's possible to do RAG with any-to-any models 🔥

Learn how to search in a video dataset and generate using Tevatron/OmniEmbed-v0.1-multivent an all modality retriever, and Qwen/Qwen2.5-Omni-7B, any-to-any model in this notebook 🤝 merve/smol-vision

reacted to fdaudens's post with 👍 5 months ago

Post

2626

You might not have heard of Moonshot AI — but within 24 hours, their new model Kimi K2 shot to the top of Hugging Face’s trending leaderboard.

So… who are they, and why does it matter?

Had a lot of fun co-writing this blog post with @xianbao , with key insights translated from Chinese, to unpack how this startup built a model that outperforms GPT-4.1, Claude Opus, and DeepSeek V3 on several major benchmarks.

🧵 A few standout facts:

1. From zero to $3.3B in 18 months:
Founded in March 2023, Moonshot is now backed by Alibaba, Tencent, Meituan, and HongShan.

2. A CEO who thinks from the end:
Yang Zhilin (31) previously worked at Meta AI, Google Brain, and Carnegie Mellon. His vision? Nothing less than AGI — still a rare ambition among Chinese AI labs.

3. A trillion-parameter model that’s surprisingly efficient:
Kimi K2 uses a mixture-of-experts architecture (32B active params per inference) and dominates on coding/math benchmarks.

4. The secret weapon: Muon optimizer:
A new training method that doubles efficiency, cuts memory in half, and ran 15.5T tokens with zero failures. Big implications.

Most importantly, their move from closed to open source signals a broader shift in China’s AI scene — following Baidu’s pivot. But as Yang puts it: “Users are the only real leaderboard.”

👇 Check out the full post to explore what Kimi K2 can do, how to try it, and why it matters for the future of open-source LLMs:
https://huggingface.co/blog/fdaudens/moonshot-ai-kimi-k2-explained

replied to AdinaY's post 6 months ago

No, the Pangu Model License Agreement Version 1.0 is not a free software license. It imposes significant restrictions, such as prohibiting use within the European Union (Section 3) and requiring attribution (Section 4.2), which conflict with the principles of free software licenses like the GNU GPL or Open Source Definition. The non-transferable clause (Section 2) and indemnity requirement (Section 7) further deviate from standard free software terms.

🔥 "Open Model"? More Like "Openly Restrictive"! 🔥

Huawei calls Pangu Pro MoE an "open model"? That’s like calling a locked door an "open invitation." Let’s break down the brilliant "openness" here:

"No EU Allowed!" (Section 3) – Because nothing says "open" like banning entire continents. GDPR too scary for you, Huawei?
"Powered by Pangu" or GTFO (Section 4.2) – Mandatory branding? Real open-source models don’t force you to be a walking billboard.
Non-transferable license (Section 2) – Can’t pass it on? So much for community sharing.
Indemnify Huawei for your use (Section 7) – If anything goes wrong, you pay, not them. How generous!

This isn’t an "open model"—it’s a marketing stunt wrapped in proprietary chains. True open-source (Apache, MIT, GPL) doesn’t come with geographic bans, forced attribution, and legal traps.

Huawei, either commit to real openness or stop insulting the FOSS community with this pretend-free nonsense. 🚮

replied to a-r-r-o-w's post 6 months ago

"not commercial" license isn't "Open Source", so please be accurate to users.

Reference:

The Open Source Definition – Open Source Initiative:
https://opensource.org/osd

replied to fdaudens's post 6 months ago

Gemma License (danger) is not Free Software and is not Open Source:
https://gnu.support/gnu-emacs/emacs-lisp/Gemma-License-danger-is-not-Free-Software-and-is-not-Open-Source.html

So the goal of Google is just their monopoly and dependence of users. I suggest using fully free, free as in freedom, LLMs.

reacted to AdinaY's post with 🔥👍 6 months ago

Post

1651

LongWriter-Zero 🔥 A Purely RL trained LLM handles 10K+ token coherent passages by Tsinghua University

Model:
THU-KEG/LongWriter-Zero-32B
Dataset:
THU-KEG/LongWriter-Zero-RLData
Paper:
LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning (2506.18841)

✨ 32B
✨ Multi-reward GRPO: length, fluency, structure, non-redundancy
✨ Enforces <think><answer> format via Format RM
✨ Build on Qwen2.5-32B-base

Jean Louis

AI & ML interests

Recent Activity

Organizations

JLouisBiz's activity