Installation
Vortex plugs into a vendored copy of SGLang (under third_party/). Install SGLang
in editable mode first, then Vortex. Installation itself needs no GPU: every
kernel either ships in a prebuilt wheel or is JIT-compiled at runtime, so you
can install even while the GPUs are busy.
From source
git clone --recursive https://github.com/Infini-AI-Lab/vortex_torch.git
cd vortex_torch
# 1. SGLang dependency (vendored, editable)
cd third_party/sglang/v0.5.9/sglang
pip install -e "python"
cd ../../../../
# 2. Vortex (editable)
pip install -e .
If you cloned without --recursive, pull the submodules first:
git submodule update --init --recursive
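To confirm that no submodule is still uninitialized before running the pip steps, you can inspect the output of git submodule status, which prefixes uninitialized submodules with "-". The following is a minimal sketch and not part of the repo: the function names and the sample paths in the comments are hypothetical.

```python
import subprocess


def uninitialized_submodules(status_output):
    """Return the paths that `git submodule status` marks with a leading '-'."""
    missing = []
    for line in status_output.splitlines():
        if line.startswith("-"):
            # Uninitialized entries look like "-<sha1> <path>" (hypothetical
            # example: "-1111... third_party/example").
            fields = line[1:].split(maxsplit=1)
            if len(fields) == 2:
                missing.append(fields[1].strip())
    return missing


def main():
    """Run git in the current directory and report uninitialized submodules."""
    try:
        proc = subprocess.run(
            ["git", "submodule", "status", "--recursive"],
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        print("git was not found on PATH")
        return
    if proc.returncode != 0:
        print(proc.stderr.strip() or "git submodule status failed")
        return
    missing = uninitialized_submodules(proc.stdout)
    if missing:
        print("uninitialized submodules:", ", ".join(missing))
    else:
        print("all submodules initialized")


if __name__ == "__main__":
    main()
```

Run it from the repository root. If it lists any path, run the git submodule update command above and check again.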
Reproducible conda environment (recommended)
The repo ships a one-shot script that builds the exact tested environment — Python 3.12, torch 2.9.1+cu128, flashinfer 0.6.3, transformers 4.57.1, plus editable SGLang and Vortex:
bash install_vortex.sh # creates the `vortex_v1` conda env
conda activate vortex_v1
Note
For MLA models such as GLM-4.7-Flash (Hugging Face model type glm4_moe_lite,
which requires transformers >= 5.0), use install_vortex_glm.sh instead. It
builds a separate vortex_glm env that overrides transformers with a
GLM-supporting build.
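To see whether the active environment meets the transformers >= 5.0 requirement mentioned in the note, a small check like the one below may help. It is a rough sketch under assumptions: the helper names are hypothetical, and it compares only the major version, so a 5.x pre-release or development build is treated as sufficient.

```python
from importlib import metadata


def major_version(version_string):
    """Return the leading integer of a version string such as '5.0.0.dev0'."""
    return int(version_string.split(".")[0])


def supports_glm4_moe_lite(transformers_version):
    """True if the major version is 5 or higher (assumed to satisfy >= 5.0)."""
    return major_version(transformers_version) >= 5


def main():
    """Report whether the installed transformers looks new enough for GLM."""
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        print("transformers is not installed in this environment")
        return
    if supports_glm4_moe_lite(installed):
        print(f"transformers {installed}: new enough for glm4_moe_lite")
    else:
        print(f"transformers {installed}: too old, use the vortex_glm env")


if __name__ == "__main__":
    main()
```

In the default vortex_v1 env, which pins transformers 4.57.1, this should report that the version is too old.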
Verify
python -c "import torch, sglang, vortex_torch, flashinfer; print('vortex ok')"
You should see vortex ok with no import errors. You’re ready for the
Quick Start.
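For a stricter check than the import test, you can compare the installed versions against the tested pins listed for the conda environment. The script below is a hedged sketch rather than something shipped with the repo: the function names are hypothetical, and the distribution names are assumptions (in particular, flashinfer is assumed to be installed as flashinfer-python).

```python
import sys
from importlib import metadata

# Pins copied from the tested environment described in this guide.
# Distribution names are assumptions; adjust them if pip lists other names.
EXPECTED = {
    "torch": "2.9.1+cu128",
    "flashinfer-python": "0.6.3",
    "transformers": "4.57.1",
}
EXPECTED_PYTHON = (3, 12)


def installed_version(dist_name):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Return {name: (wanted, found)} for every pin that does not match exactly."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


def main():
    """Print every deviation from the tested environment."""
    if sys.version_info[:2] != EXPECTED_PYTHON:
        found_py = f"{sys.version_info[0]}.{sys.version_info[1]}"
        print(f"python: expected 3.12, found {found_py}")
    problems = find_mismatches(EXPECTED)
    for name, (wanted, found) in problems.items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
    if not problems:
        print("all pinned packages match")


if __name__ == "__main__":
    main()
```

A mismatch is not necessarily an error. In the vortex_glm env, for example, transformers is expected to differ from the 4.57.1 pin.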