BiomedBERT ColBERT

This is a PyLate model finetuned from microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext. It maps sentences & paragraphs to sequences of 128-dimensional dense vectors and can be used for semantic textual similarity using the MaxSim operator.
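The MaxSim operator scores a query-document pair by matching each query token vector against its best document token vector and summing those maxima. Below is a minimal NumPy sketch of the scoring function, illustrative only; PyLate and txtai implement this internally using the model's normalized token vectors.

import numpy as np

def maxsim(query_embeddings, document_embeddings):
    # similarity[i, j] = dot product of query token i and document token j
    similarity = query_embeddings @ document_embeddings.T
    # Best document token match for each query token, summed into one score
    return similarity.max(axis=1).sum()

# Toy example with random vectors standing in for model outputs
rng = np.random.default_rng(0)
query = rng.normal(size=(8, 128))      # 8 query token vectors, 128 dims each
document = rng.normal(size=(40, 128))  # 40 document token vectors
print(maxsim(query, document))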

Usage (txtai)

This model can be used to build embeddings databases with txtai for semantic search and/or as a knowledge source for retrieval augmented generation (RAG).

import txtai

# Create an embeddings database backed by this model
embeddings = txtai.Embeddings(
  path="neuml/biomedbert-base-colbert",
  content=True
)

# documents() is a user-supplied iterable that yields text
# or (id, text, tags) tuples to index
embeddings.index(documents())

# Run a query; with content=True, results include id, text and score
embeddings.search("query to run")

Late interaction models also excel as rerankers in retrieval pipelines.

from txtai.pipeline import Reranker, Similarity

# Load this model as a late interaction similarity pipeline
similarity = Similarity(path="neuml/biomedbert-base-colbert", lateencode=True)

# Rerank embeddings search results with the similarity pipeline
ranker = Reranker(embeddings, similarity)
ranker("query to run")

Usage (PyLate)

Alternatively, the model can be loaded with PyLate.

from pylate import rank, models

queries = [
    "query A",
    "query B",
]

documents = [
    ["document A", "document B"],
    ["document 1", "document C", "document B"],
]

documents_ids = [
    [1, 2],
    [1, 3, 2],
]

# Load the ColBERT model
model = models.ColBERT(
    model_name_or_path="neuml/biomedbert-base-colbert",
)

# Encode queries and documents separately, flagging which is which
queries_embeddings = model.encode(
    queries,
    is_query=True,
)

documents_embeddings = model.encode(
    documents,
    is_query=False,
)

# Rerank each query's candidate documents using MaxSim scores
reranked_documents = rank.rerank(
    documents_ids=documents_ids,
    queries_embeddings=queries_embeddings,
    documents_embeddings=documents_embeddings,
)
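PyLate can also index and retrieve with this model directly. The following is a sketch based on PyLate's documented Voyager index workflow; the index folder, ids and documents below are placeholders.

from pylate import indexes, models, retrieve

model = models.ColBERT(
    model_name_or_path="neuml/biomedbert-base-colbert",
)

# Create an on-disk Voyager index
index = indexes.Voyager(
    index_folder="pylate-index",
    index_name="index",
    override=True,
)

retriever = retrieve.ColBERT(index=index)

documents_ids = ["1", "2", "3"]
documents = ["document 1 text", "document 2 text", "document 3 text"]

# Encode and add documents to the index
documents_embeddings = model.encode(
    documents,
    is_query=False,
)
index.add_documents(
    documents_ids=documents_ids,
    documents_embeddings=documents_embeddings,
)

# Encode queries and retrieve the top-k matches
queries_embeddings = model.encode(
    ["query for retrieval"],
    is_query=True,
)
scores = retriever.retrieve(
    queries_embeddings=queries_embeddings,
    k=10,
)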

Evaluation Results

The performance of this model is compared to previously released models trained on medical literature. The most commonly used small general-purpose embeddings model, all-MiniLM-L6-v2, is also included for comparison.

The following datasets were used to evaluate model performance.

  • PubMed QA
    • Subset: pqa_labeled, Split: train, Pair: (question, long_answer)
  • PubMed Subset
    • Split: test, Pair: (title, text)
  • PubMed Summary
    • Subset: pubmed, Split: validation, Pair: (article, abstract)
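As an illustration, the PubMed QA pairs above can be loaded with the Hugging Face datasets library. The dataset id below is an assumption based on the names in this list; the other two datasets follow the same subset/split/pair pattern.

from datasets import load_dataset

# Assumed dataset id for the PubMed QA subset described above
dataset = load_dataset("pubmed_qa", "pqa_labeled", split="train")

# Build the (question, long_answer) evaluation pairs
pairs = [(row["question"], row["long_answer"]) for row in dataset]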

Evaluation results are shown below. The Pearson correlation coefficient is used as the evaluation metric.

| Model                                  | PubMed QA | PubMed Subset | PubMed Summary | Average |
|----------------------------------------|-----------|---------------|----------------|---------|
| all-MiniLM-L6-v2                       | 90.40     | 95.92         | 94.07          | 93.46   |
| bioclinical-modernbert-base-embeddings | 92.49     | 97.10         | 97.04          | 95.54   |
| biomedbert-base-colbert                | 94.59     | 97.18         | 96.21          | 95.99   |
| biomedbert-base-reranker               | 97.66     | 99.76         | 98.81          | 98.74   |
| pubmedbert-base-embeddings             | 93.27     | 97.00         | 96.58          | 95.62   |
| pubmedbert-base-embeddings-8M          | 90.05     | 94.29         | 94.15          | 92.83   |
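For reference, the Pearson correlation between the model's similarity scores $x$ and the gold scores $y$ over $n$ pairs is defined as:

$$
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
$$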

This is the best-performing model we've released that's not a cross-encoder. With MUVERA encoding, it can be used to index large datasets for semantic search. It can also serve as a reranker that's faster than a cross-encoder model.
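MUVERA compresses a document's set of token vectors into a single fixed-length vector, so standard single-vector ANN indexes apply. The following is a greatly simplified sketch of the core idea (SimHash bucketing plus per-bucket pooling); it is not the full algorithm and not txtai's implementation.

import numpy as np

def fixed_dimensional_encoding(token_vectors, hyperplanes):
    # Greatly simplified MUVERA-style fixed dimensional encoding (FDE):
    # bucket token vectors with SimHash, pool each bucket, concatenate.
    # The real algorithm adds repetitions, empty-cluster filling and
    # inner projections.
    bits = token_vectors @ hyperplanes.T > 0                  # sign bits per token
    buckets = bits @ (1 << np.arange(hyperplanes.shape[0]))   # bucket id per token
    encoding = np.zeros((2 ** hyperplanes.shape[0], token_vectors.shape[1]))
    for b in np.unique(buckets):
        encoding[b] = token_vectors[buckets == b].sum(axis=0)
    # Single fixed-length vector usable with standard ANN indexes
    return encoding.ravel()

rng = np.random.default_rng(0)
hyperplanes = rng.normal(size=(4, 128))   # 4 hyperplanes -> 16 buckets
doc_tokens = rng.normal(size=(40, 128))   # token vectors from the model
print(fixed_dimensional_encoding(doc_tokens, hyperplanes).shape)  # (2048,)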

Full Model Architecture

ColBERT(
  (0): Transformer({'max_seq_length': 511, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Dense({'in_features': 768, 'out_features': 128, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity', 'use_residual': False})
)