Building on HF

merve PRO

merve

huggingface

https://github.com/merveenoyan/smol-vision

AI & ML interests

I love this website VLMs, vision & co

Recent Activity

liked a model 9 days ago

nvidia/magpie_tts_multilingual_357m

liked a model 9 days ago

zai-org/GLM-4.7

updated a dataset 9 days ago

merve/personal-website

published an article 14 days ago

Article

Tokenization in Transformers v5: Simpler, Clearer, and More Modular

+4

14 days ago

•

89

published an article 2 months ago

Article

Streaming datasets: 100x More Efficient

+3

Oct 27

•

75

published an article 2 months ago

Article

Supercharge your OCR Pipelines with Open Models

+5

Oct 21

•

285

published an article 3 months ago

Article

Smol2Operator: Post-Training GUI Agents for Computer Use

+3

Sep 23

•

134

published an article 5 months ago

Article

Vision Language Model Alignment in TRL ⚡️

+3

Aug 7

•

104

published an article 6 months ago

Article

Introducing ColQwen-Omni: Retrieve in every modality

Jul 17

•

75

published an article 7 months ago

Article

(LoRA) Fine-Tuning FLUX.1-dev on Consumer Hardware

+3

Jun 19

•

95

published an article 7 months ago

Article

Learn the Hugging Face Kernel Hub in 5 Minutes

+5

Jun 12

•

151

published an article 7 months ago

Article

SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data

+7

Jun 3

•

302

published an article 7 months ago

Article

nanoVLM: The simplest repository to train your VLM in pure PyTorch

+5

May 21

•

247

published an article 8 months ago

Article

Vision Language Models (Better, faster, stronger)

+3

May 12

•

578

published an article 8 months ago

Article

Welcoming Llama Guard 4 on Hugging Face Hub

+2

Apr 29

•

40

published an article 9 months ago

Article

Cohere on Hugging Face Inference Providers 🔥

+5

Apr 16

•

129

published an article 10 months ago

Article

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

+2

Mar 12

•

480

published an article 10 months ago

Article

SigLIP 2: A better multilingual vision language encoder

+1

Feb 21

•

193

published an article 10 months ago

Article

SmolVLM2: Bringing Video Understanding to Every Device

+5

Feb 20

•

320

published an article 11 months ago

Article

PaliGemma 2 Mix - New Instruction Vision Language Models by Google

+1

Feb 19

•

74