BERT Hash Nano Models
Set of BERT models with a modified embeddings layer.
  NeuML/bert-hash-femto • Updated Oct 9 • 17 • 11
  NeuML/bert-hash-pico • Updated Oct 9 • 28 • 3
  NeuML/bert-hash-nano • Updated Oct 9 • 71 • 14
  NeuML/biomedbert-hash-nano • Updated 2 days ago • 19 • 2

Embeddings databases for txtai
Add knowledge to your txtai agents and processes.
  NeuML/txtai-wikipedia • Sentence Similarity • Updated Nov 17 • 176 • 75
  NeuML/txtai-wikipedia-slim • Sentence Similarity • Updated Nov 17 • 46 • 5
  NeuML/txtai-arxiv • Sentence Similarity • Updated Nov 17 • 39 • 20
  NeuML/txtai-hfposts • Sentence Similarity • Updated Nov 23, 2024 • 19 • 3

Medical and Scientific Literature Datasets
Datasets with medical and scientific literature.
  NeuML/pubmed-h5n1 • Viewer • Updated Jun 23 • 7.87k • 45 • 2
  NeuML/pubmed-hmpv • Viewer • Updated Jun 23 • 1.95k • 46 • 2

Medical Embeddings M2V
Models distilled with Model2Vec - 100K / 500K / 1M / 2M / 8M parameter variants.
  NeuML/pubmedbert-base-embeddings-8M • Sentence Similarity • Updated Jun 26 • 7.78k • 9
  NeuML/pubmedbert-base-embeddings-2M • Sentence Similarity • Updated Jun 26 • 37 • 3
  NeuML/pubmedbert-base-embeddings-1M • Sentence Similarity • Updated Jun 26 • 43 • 2
  NeuML/pubmedbert-base-embeddings-500K • Sentence Similarity • Updated Jun 26 • 56 • 2

Wikipedia
Embeddings indexes and datasets for Wikipedia data.
  NeuML/wikipedia • Updated Jan 11, 2024 • 59 • 5
  NeuML/wikipedia-20250620 • Viewer • Updated Jul 3 • 4.81M • 186 • 3
  NeuML/txtai-wikipedia • Sentence Similarity • Updated Nov 17 • 176 • 75
  NeuML/txtai-wikipedia-slim • Sentence Similarity • Updated Nov 17 • 46 • 5

ColBERT
Late interaction models.
  NeuML/colbert-muvera-femto • Sentence Similarity • 243k • Updated 12 days ago • 76 • 20
  NeuML/colbert-muvera-pico • Sentence Similarity • 448k • Updated 12 days ago • 23 • 1
  NeuML/colbert-muvera-nano • Sentence Similarity • 970k • Updated 12 days ago • 138 • 1
  NeuML/colbert-muvera-micro • Sentence Similarity • 4.39M • Updated 12 days ago • 352 • 25

Language Detection
StaticVectors models to detect language. Exports of FastText that run in NumPy without needing FastText.
  NeuML/language-id • Text Classification • Updated Jan 26 • 127 • 6
  NeuML/language-id-quantized • Text Classification • Updated Jan 26 • 813 • 2

Medical and Scientific Literature Models
Models for working with medical and scientific literature.
  NeuML/pubmedbert-base-embeddings • Sentence Similarity • 0.1B • Updated Jun 26 • 163k • 157
  NeuML/pubmedbert-base-embeddings-matryoshka • Sentence Similarity • 0.1B • Updated Jun 26 • 2.39k • 23
  NeuML/pubmedbert-base-embeddings-8M • Sentence Similarity • Updated Jun 26 • 7.78k • 9
  NeuML/pubmedbert-base-colbert • Sentence Similarity • 0.1B • Updated 12 days ago • 206 • 6

Text to Speech (TTS)
Text to Speech (TTS) models compatible with txtai's TextToSpeech pipeline.
  NeuML/kokoro-base-onnx • Text-to-Speech • Updated Mar 21 • 3
  NeuML/kokoro-fp16-onnx • Text-to-Speech • Updated Mar 21 • 4
  NeuML/kokoro-int8-onnx • Text-to-Speech • Updated Mar 21 • 10
  NeuML/ljspeech-jets-onnx • Text-to-Speech • Updated Oct 10, 2024 • 1.02k • 25

Word Vectors
Legacy word vectors (FastText, GloVe, Word2Vec) stored in the StaticVectors format.
  NeuML/fasttext • Sentence Similarity • Updated Jan 26 • 353 • 1
  NeuML/fasttext-quantized • Sentence Similarity • Updated Jan 26 • 16 • 2
  NeuML/glove-6B • Sentence Similarity • Updated Jan 26 • 213 • 3
  NeuML/glove-6B-quantized • Sentence Similarity • Updated Jan 26 • 3.77k • 3