Skip to content

Vector Inference: Easy inference on Slurm clusters

This repository provides an easy-to-use solution to run inference servers on Slurm-managed computing clusters using vLLM. All scripts in this repository runs natively on the Vector Institute cluster environment. To adapt to other environments, update the environment variables in vec_inf/client/slurm_vars.py, and the model config for cached model weights in vec_inf/config/models.yaml accordingly.

Installation

If you are using the Vector cluster environment, and you don't need any customization to the inference server environment, run the following to install package:

pip install vec-inf

Otherwise, we recommend using the provided Dockerfile to set up your own environment with the package.