All of the following instructions are tailored for systems with an A100 GPU (sm80); adjust them to match your own hardware environment.
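If you are unsure which architecture your card is, recent NVIDIA drivers can report the compute capability directly (assuming your nvidia-smi is new enough to support the compute_cap query):
nvidia-smi --query-gpu=name,compute_cap --format=csv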
PyTorch
The latest official guide lives at https://github.com/pytorch/pytorch?tab=readme-ov-file#from-source.
First, fetch the source and set up the dependencies:
git clone [email protected]:pytorch/pytorch.git
cd pytorch
git submodule sync
git submodule update --init --recursive
conda create -n storch python=3.13
conda activate storch
# Install CUDA, including CUDA runtime, compiler and cuDNN
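One way to satisfy that last comment, assuming you prefer to keep the toolchain inside the conda environment rather than install CUDA system-wide (NVIDIA's apt/yum repositories work just as well):
# Assumption: pulling the CUDA metapackage from NVIDIA's conda channel and
# cuDNN from conda-forge; match the toolkit version to what your driver supports.
conda install -c nvidia cuda
conda install -c conda-forge cudnn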
After resolving the software dependencies, build PyTorch (see footnote 1):
CMAKE_PREFIX_PATH="${CONDA_PREFIX:-'$(dirname $(which conda))/../'}:${CMAKE_PREFIX_PATH}" \
CMAKE_EXPORT_COMPILE_COMMANDS=ON \
CMAKE_BUILD_PARALLEL_LEVEL=32 \
USE_CUDA=1 TORCH_CUDA_ARCH_LIST="8.0" USE_FBGEMM=0 \
REL_WITH_DEB_INFO=1 \
python setup.py develop 2>&1 | tee /tmp/torch_build.log
Then validate the installation with:
python3 -c "import torch; print(torch.__file__); print(torch.cuda.is_available())"
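To double-check that the build really targets sm80, you can also inspect the CUDA version and the compiled architecture list (torch.version.cuda and torch.cuda.get_arch_list() are long-standing PyTorch APIs):
python3 -c "import torch; print(torch.version.cuda); print(torch.cuda.get_arch_list())"
On an sm80 build the second line should be a list containing 'sm_80'.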
SGLang
SGLang depends on FlashInfer, a GPU kernel library for efficient LLM inference, so we need to install it first:
git clone [email protected]:flashinfer-ai/flashinfer.git --recursive
cd flashinfer
git submodule update --init --recursive
TORCH_CUDA_ARCH_LIST="8.0" FLASHINFER_ENABLE_AOT=1 pip install --no-build-isolation --verbose --editable .
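A quick smoke test that the editable install is importable before moving on:
python3 -c "import flashinfer; print(flashinfer.__file__)"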
After FlashInfer is installed successfully, you can move on to SGLang:
git clone [email protected]:sgl-project/sglang.git && cd sglang
cd sgl-kernel
pip install build
# make build invokes uv, but I don't want to install another Python package manager
CMAKE_EXPORT_COMPILE_COMMANDS=1 CMAKE_PREFIX_PATH=$CONDA_PREFIX/lib/python3.13/site-packages \
python -m build --wheel --outdir build --no-isolation --verbose
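Building the wheel does not install it. A plausible finish, assuming the wheel lands under build/ with the usual sgl_kernel naming, is to install it and then the SGLang Python package itself (the python[all] extra follows SGLang's from-source instructions at the time of writing):
# Install the freshly built kernel wheel (exact filename depends on version/platform)
pip install build/sgl_kernel-*.whl
# Back at the repo root, install the SGLang Python package
cd .. && pip install -e "python[all]"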
Footnotes
1. FBGEMM fails to compile under GCC 12.2 on Debian 12 (see https://github.com/pytorch/pytorch/issues/77939#issuecomment-1528168416); we sidestep this by disabling the FBGEMM integration in PyTorch (USE_FBGEMM=0).