
kelechic

tensorkelechi

https://kelechi-c.github.io/

AI & ML interests

vision

Recent Activity

upvoted an article 24 days ago

Continuous batching from first principles

liked a model about 2 months ago

Qwen/Qwen2-Audio-7B-Instruct

liked a model 2 months ago

openai/whisper-tiny



upvoted an article 24 days ago

Article

Continuous batching from first principles

Nov 25, 2025 • 291

upvoted a paper 9 months ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7, 2025 • 202

upvoted 2 papers 10 months ago

Neural Vocoder is All You Need for Speech Super-resolution

Paper • 2203.14941 • Published Mar 28, 2022 • 1

MusicInfuser: Making Video Diffusion Listen and Dance

Paper • 2503.14505 • Published Mar 18, 2025 • 11

upvoted 2 articles 10 months ago

Article

Open-Source Handwritten Signature Detection Model

Mar 14, 2025 • 120

Article

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

Mar 12, 2025 • 480

upvoted a paper 10 months ago

Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities

Paper • 2503.03983 • Published Mar 6, 2025 • 26

upvoted an article 10 months ago

Article

Using LoRA for Efficient Stable Diffusion Fine-Tuning

Jan 26, 2023

upvoted a collection 11 months ago

Qwen2.5

Collection

Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 46 items • Updated 3 days ago • 672

upvoted a paper 11 months ago

SoundStorm: Efficient Parallel Audio Generation

Paper • 2305.09636 • Published May 16, 2023 • 13

upvoted a collection 11 months ago

CLAP: Contrastive Language-Audio Pretraining

Collection

CLAP is to audio what CLIP is to image. • 5 items • Updated Oct 20, 2025 • 15

upvoted an article 11 months ago

Article

Design choices for Vision Language Models in 2024

Apr 16, 2024

upvoted a paper 11 months ago

Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities

Paper • 2402.01831 • Published Feb 2, 2024 • 16

upvoted 2 articles 11 months ago

Article

SmolVLM - small yet mighty Vision Language Model

Nov 26, 2024 • 397

Article

SmolVLM Grows Smaller – Introducing the 256M & 500M Models!

Jan 23, 2025 • 189

upvoted a paper 11 months ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4, 2025 • 252

upvoted an article 11 months ago

Article

State of open video generation models in Diffusers

Jan 27, 2025

upvoted an article 12 months ago

Article

Upgrading Kokoro: natural TTS for short bursts

Nov 22, 2024

upvoted a paper 12 months ago

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Paper • 2409.17146 • Published Sep 25, 2024 • 121

upvoted a collection 12 months ago

Cosmos-Tokenizer

Collection

A suite of image and video tokenizers • 13 items • Updated 11 days ago • 43
