Software

BioDSA

BioDSA: AI agents for biomedical data science

  • Agents for biomedical data science: hypothesis generation and validation over medical and biomedical data.
  • Includes the BioDSA-1K benchmark for evaluating data science agents in biomedical research.
LEADS

LEADS: a specialized LLM for medical literature mining

PyTrial

PyTrial: A Python Package for AI for Drug Development

  • A comprehensive package of AI applications for drug development, with a focus on clinical trials.
  • Supports tasks such as patient outcome prediction, synthetic patient record generation, patient-trial matching, trial site selection, trial outcome prediction, and trial similarity search.
  • Includes complete tutorials with examples for a fast ramp-up.
TransTab

TransTab: Transferable Tabular Transformers

  • A flexible tabular learning and prediction method that supports learning from variable-column tables.
  • Supports various tasks such as tabular pretraining, transfer learning, and zero-shot prediction.
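
One way to picture learning from variable-column tables is to serialize each cell together with its column name, so that rows from tables with different schemas end up in one shared token space. The sketch below is an illustration under that assumption only; it is not TransTab's implementation or API, and `row_to_tokens` and the sample rows are hypothetical.

```python
def row_to_tokens(row):
    """Serialize one table row (a dict of column -> value) into text tokens.

    Hypothetical helper for illustration: every token carries its column
    name, so no fixed column order or fixed column set is required.
    """
    tokens = []
    for column, value in row.items():
        if value is None:
            # Missing cells are skipped rather than imputed.
            continue
        if isinstance(value, bool):
            # Binary cell: emit the column name only when the flag is set.
            # (Checked before int, because bool is a subclass of int.)
            if value:
                tokens.append(column)
        elif isinstance(value, (int, float)):
            # Numerical cell: keep the column name next to the number.
            tokens.append(f"{column}={value}")
        else:
            # Categorical or free-text cell: column name followed by the value.
            tokens.append(f"{column} {value}")
    return tokens


# Two rows from tables with different columns share one representation.
row_from_table_1 = {"age": 63, "smoker": True, "sex": "female"}
row_from_table_2 = {"age": 47, "diabetes": False, "stage": "II"}

tokens_1 = row_to_tokens(row_from_table_1)
tokens_2 = row_to_tokens(row_from_table_2)
```

Because both rows map to lists of column-aware tokens, a single sequence model could in principle consume either one, which is the property that makes transfer across tables possible.
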
MedCLIP

MedCLIP: Pretrained Medical Vision-Language Model

  • Contrastive pretraining of a medical vision-language model (VLM) on unpaired X-rays and radiology reports.
  • Provides an off-the-shelf pretrained VLM for downstream use.
Trial2Vec

Trial2Vec: Clinical Trial Similarity Search

  • Self-supervised training to encode clinical trial documents into semantically meaningful dense embeddings.
  • Provides off-the-shelf pretrained language models for trial retrieval and pretrained embeddings for downstream tasks.
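
Once trial documents are encoded as dense embeddings, similarity search reduces to ranking vectors by a similarity measure. The following is a minimal sketch using cosine similarity over toy vectors. The trial IDs, the three-dimensional embeddings, and the helper names are all hypothetical, and no Trial2Vec API is used or implied.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query, corpus, k=2):
    """Return the k (trial_id, score) pairs most similar to the query."""
    scored = [
        (trial_id, cosine_similarity(query, vector))
        for trial_id, vector in corpus.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings standing in for encoded trial documents.
corpus = {
    "trial_A": [0.9, 0.1, 0.0],
    "trial_B": [0.0, 1.0, 0.0],
    "trial_C": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]

results = top_k_similar(query, corpus, k=2)
ranked_ids = [trial_id for trial_id, _ in results]
```

A brute-force scan like this is adequate for small collections; larger corpora would typically use an approximate nearest-neighbor index instead.
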
PromptEHR

PromptEHR: Synthetic EHR Generation

  • Uses large language models as neural databases that memorize and generate electronic healthcare records (EHRs).
  • Provides off-the-shelf pretrained language models for synthetic EHR generation.