Let's Talk about AI

#1
by kalashshah19 - opened
Indian AI Developers org
edited Aug 22

Hello, this is an open space for everyone to talk, share, ask questions, and show anything about AI.

Has anyone pre-trained an LLM from scratch? If yes, please share your experience: things to consider while training, notes, tips, etc.

Hi, I am also interested in LLMs. I am about to start this research next week, so please share any inputs.

Hey @Shashank2k3, if you want your own LLM, you first need a huge amount of data. You can start by fine-tuning a good, already available LLM such as Gemma, Phi, Llama, or Mistral on your dataset. Start with small models, around 4B to 7B parameters. Pre-training an LLM from scratch needs enormous data, serious hardware such as heavy-duty GPUs and CPUs, and knowledge of training techniques, NLP, etc. You can always brainstorm with ChatGPT to learn more.
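To make the "heavy-duty GPUs" point a bit more concrete, here is a rough back-of-the-envelope sketch of how much memory models in the 4B to 7B range might need. The numbers are assumptions, not something from this thread: about 2 bytes per parameter to hold fp16 weights, and about 16 bytes per parameter for full training with Adam in mixed precision. The helper name `estimate_memory_gb` is made up for illustration, and activations and framework overhead are left out, so real usage will be higher.

```python
# Rule-of-thumb memory estimate for LLM weights and training state.
# Assumptions (not from the thread): activations, KV cache, and
# framework overhead are ignored, so treat these as lower bounds.

BYTES_FP16_WEIGHTS = 2  # fp16/bf16 weights only (e.g. for inference)

# Mixed-precision Adam: fp16 weights (2) + fp16 gradients (2)
# + fp32 master weights, momentum, and variance (4 + 4 + 4)
BYTES_FULL_TRAINING = 16


def estimate_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Return an approximate memory footprint in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


for billions in (4, 7):
    n = billions * 1e9
    weights_gb = estimate_memory_gb(n, BYTES_FP16_WEIGHTS)
    training_gb = estimate_memory_gb(n, BYTES_FULL_TRAINING)
    print(f"{billions}B params: ~{weights_gb:.0f} GB for fp16 weights, "
          f"~{training_gb:.0f} GB for full training state")
```

Under these assumptions a 7B model needs roughly 14 GB just to hold its weights and over 100 GB of training state for full fine-tuning, which is one reason people often start with smaller models or parameter-efficient fine-tuning.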

Hey @kalashshah19 , thanks for the input! I already have a solid foundation in these areas from my Bachelor's degree in AIML, and now I’m looking to dive deeper into the world of LLMs.

Great!

Yup! So what do you guys do? I mean, as a profession.

I am an Associate Data Scientist at Casepoint.
What about you?

Is it fine-tuned on question papers only, or on all NEET content such as books, PDFs, etc.?

It is fine-tuned on a dataset of ~800 questions, including practice questions and PYQs (previous-year questions).

Nice!

I'm pleased to share that, after a lot of effort and hard work, I have curated the first high-quality, clean audio dataset of the Shrimad Bhagavad Gita.

I hope this dataset proves helpful to all 🌸

Link to dataset: https://huggingface.co/datasets/JDhruv14/Bhagavad-Gita_Audio

ॐ नमो भगवते वासुदेवाय (Om Namo Bhagavate Vasudevaya) 🙏🙏

@JDhruv14 Amazing, brother, keep it up!

Great, man! Will check it out.

Tried many arenas and LLM battles but couldn't find the best LLM for Indian use cases? Try the Indic LLM Arena by AI4Bharat (IIT Madras) to find the most suitable one.
Link: https://arena.ai4bharat.org/#/chat

Will try!
