slide2vec

Efficient encoding of whole-slide images using publicly available foundation models

slide2vec is a Python package that extracts tile-level and slide-level embeddings from whole-slide images using public pathology foundation models.
It builds on hs2p for preprocessing.

Installation

pip install slide2vec

Quick start

from slide2vec import Model, PreprocessingConfig

model = Model.from_pretrained("PRISM")
preprocessing = PreprocessingConfig(
    target_spacing_um=0.5,
    target_tile_size_px=224,
    tissue_threshold=0.1,
)
embedded = model.embed_slide("/path/to/slide.svs", preprocessing=preprocessing)
tile_embeddings = embedded.tile_embeddings  # shape (N, 2560), one row per tile
slide_embedding = embedded.slide_embedding  # shape (1280,)
slide_latents = embedded.latents  # shape (512, 1280)

Use Pipeline(...) for manifest-driven batch processing with artifacts written to disk.

Outputs

  • tile_embeddings/{sample_id}.pt or .npz + .meta.json
  • slide_embeddings/{sample_id}.pt or .npz + .meta.json
  • optional slide_latents/{sample_id}.pt or .npz
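The layout above can be turned into concrete paths with a small helper. The following is a minimal sketch and not part of slide2vec: `artifact_paths` and `load_array_file` are hypothetical names, and it assumes the metadata file sits next to the array file as `{sample_id}.meta.json`. The keys stored inside each `.pt` or `.npz` file are not specified here, so the loader returns whatever object the file contains.

```python
from pathlib import Path


def artifact_paths(output_dir, sample_id, ext="pt"):
    """Return the expected artifact paths for one sample (hypothetical helper).

    Assumption: metadata is stored as {sample_id}.meta.json next to the array file.
    """
    if ext not in ("pt", "npz"):
        raise ValueError(f"unsupported extension: {ext!r}")
    root = Path(output_dir)
    paths = {}
    for kind in ("tile_embeddings", "slide_embeddings"):
        paths[kind] = root / kind / f"{sample_id}.{ext}"
        paths[f"{kind}_meta"] = root / kind / f"{sample_id}.meta.json"
    # slide_latents is optional, so this file may not exist on disk.
    paths["slide_latents"] = root / "slide_latents" / f"{sample_id}.{ext}"
    return paths


def load_array_file(path):
    """Load a .pt file with torch or a .npz file with numpy.

    The libraries are imported lazily so that only the one matching
    the file extension needs to be installed.
    """
    path = Path(path)
    if path.suffix == ".pt":
        import torch
        return torch.load(path, map_location="cpu")
    if path.suffix == ".npz":
        import numpy as np
        return np.load(path)
    raise ValueError(f"unsupported file type: {path.suffix!r}")
```

For example, `artifact_paths("output", "slide_001")["tile_embeddings"]` points to `output/tile_embeddings/slide_001.pt`; check that the file exists before passing it to `load_array_file`.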

Features

  • 10 tile-level and 3 slide-level foundation models with preset configs
  • Python API and CLI with manifest-driven batch processing
  • Multi-GPU support with automatic distribution
  • Docker support