Vector Inference: Easy inference on Slurm clusters
This repository provides an easy-to-use solution for running inference servers on Slurm-managed computing clusters using vLLM. All scripts in this repository run natively on the Vector Institute cluster environment. To adapt them to other environments, update the environment variables in `vec_inf/client/_vars.py`, `vec_inf/client/_config.py`, `vllm.slurm`, `multinode_vllm.slurm`, and `models.yaml` accordingly.
Installation
If you are using the Vector cluster environment and you don't need any customization of the inference server environment, run the following to install the package:
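A minimal installation sketch, assuming the package is published on PyPI under the name `vec-inf` (the distribution name is an assumption based on the repository layout):

```shell
# Install the vec-inf package from PyPI into the active environment
# (assumes the PyPI distribution name is "vec-inf")
pip install vec-inf
```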
Otherwise, we recommend using the provided `Dockerfile` to set up your own environment with the package.
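A hedged sketch of building and entering a container from the provided `Dockerfile`; the image tag `vec-inf` is an arbitrary local name, and the GPU flags assume a Docker setup with the NVIDIA container toolkit:

```shell
# Build an image from the repository's Dockerfile
# ("vec-inf" is a hypothetical local tag, not a published image)
docker build -t vec-inf .

# Start an interactive shell in the container with GPU access
# (requires the NVIDIA container toolkit; drop --gpus all for CPU-only use)
docker run --gpus all -it vec-inf bash
```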