Vector Inference: Easy inference on Slurm clusters
This repository provides an easy-to-use solution for running inference servers on Slurm-managed computing clusters using vLLM. All scripts in this repository run natively in the Vector Institute cluster environment. To adapt them to other environments, update launch_server.sh, vllm.slurm, multinode_vllm.slurm, and models.csv accordingly.
Installation
If you are using the Vector cluster environment and you don’t need any customization to the inference server environment, run the following to install the package:
pip install vec-inf
Otherwise, we recommend using the provided Dockerfile to set up your own environment with the package.
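As a rough sketch of that Docker-based path (the image tag here is an arbitrary choice, not one defined by the repository), building from the provided Dockerfile might look like:

```shell
# Run from the repository root, where the Dockerfile lives.
# "vec-inf" is an arbitrary tag name chosen for this example.
docker build -t vec-inf .
```

You would then launch containers from that image in whatever way your cluster's container runtime supports.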