Dokyoon
leeloolee
AI & ML interests
ai
Recent Activity
upvoted an article 1 day ago
Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective
reacted to Parveshiiii's post 4 days ago
Wanna train your own AI Model or Tokenizer from scratch?
Building models isn't just for big labs anymore: with the right data, compute, and workflow, you can create **custom AI models** and **tokenizers** tailored to any domain. Whether it's NLP, domain-specific datasets, or experimental architectures, training from scratch gives you full control over vocabulary, embeddings, and performance.
Why train your own?
- Full control over vocabulary & tokenization
- Domain-specific optimization (medical, legal, technical, etc.)
- Better performance on niche datasets
- Freedom to experiment with architectures
The best part?
- Tokenizer training (TikToken / BPE) can be done in **just 3 lines of code**.
- Model training runs smoothly on **Google Colab notebooks**, no expensive hardware required.
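For anyone curious what BPE tokenizer training does under the hood, here is a minimal pure-Python sketch. It is not the code from the linked repositories and not the "3 lines" API mentioned above; the function names `merge_pair`, `train_bpe`, and `encode` are hypothetical, and it assumes whitespace-split words and character-level starting symbols.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Fuse every adjacent occurrence of `pair` in `symbols` into one symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def train_bpe(corpus, num_merges):
    """Learn up to `num_merges` merge rules from a list of strings."""
    # Each whitespace-separated word starts as a tuple of single characters.
    words = Counter(tuple(word) for text in corpus for word in text.split())
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by how often each word occurs.
        pairs = Counter()
        for symbols, freq in words.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break  # every word is already a single symbol
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Rewrite the vocabulary with the most frequent pair fused.
        new_words = Counter()
        for symbols, freq in words.items():
            new_words[tuple(merge_pair(list(symbols), best))] += freq
        words = new_words
    return merges


def encode(word, merges):
    """Tokenize one word by replaying the learned merges in order."""
    symbols = list(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols


merges = train_bpe(["ab ab ab abc"], num_merges=10)
print(merges)                 # [('a', 'b'), ('ab', 'c')]
print(encode("abd", merges))  # ['ab', 'd']
```

Production tokenizers add byte-level fallback, pre-tokenization rules, and special tokens on top of this loop, which is why library wrappers can hide it behind a few calls.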
Try out my work:
- https://github.com/OE-Void/Tokenizer-from_scratch
- https://github.com/OE-Void/GPT