Building and Installing Triton

Setting up the Python environment

I recommend setting this up with conda; Python 3.10 tends to be more stable.

conda create -n triton_env python=3.10

  • Install PyTorch (GPU build) matching your CUDA version

For example, I am on CUDA 11.8, so:

conda install pytorch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 pytorch-cuda=11.8 -c pytorch -c nvidia

See the official site for details: https://pytorch.org/get-started/previous-versions/

  • Install commonly used packages

numpy matplotlib pybind11 lit pytest isort pandas tabulate scipy flake8 autopep8

After installing pybind11, set an environment variable that points to its headers; otherwise the build cannot find them.

export PYBIND_INCLUDE_PATH=/xxxx/miniconda/envs/triton_env/lib/python3.10/site-packages/pybind11/include
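Rather than typing the site-packages path by hand, the include directory can be derived from the active interpreter. This is only a sketch: the helper name pybind11_include_dir is made up for illustration, and it assumes pybind11 was installed into the environment's regular site-packages directory, which is the layout the export line above uses.

```python
import sysconfig
from pathlib import Path


def pybind11_include_dir() -> Path:
    """Return the expected pybind11 header directory for the running interpreter.

    Hypothetical helper: assumes the usual site-packages/pybind11/include layout.
    """
    # "purelib" is the site-packages directory of the current Python.
    site_packages = Path(sysconfig.get_paths()["purelib"])
    return site_packages / "pybind11" / "include"


if __name__ == "__main__":
    include_dir = pybind11_include_dir()
    if not include_dir.is_dir():
        print(f"# warning: {include_dir} does not exist (is pybind11 installed?)")
    # Print a line that can be pasted into the shell or .bashrc.
    print(f"export PYBIND_INCLUDE_PATH={include_dir}")
```

Run it inside the activated triton_env so the path belongs to that environment. If pybind11 is importable, pybind11.get_include() should report the same directory.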

The mirror below works well. Put it in ~/.condarc (vim ~/.condarc):

show_channel_urls: true
channels:
  - http://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch/
  - http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/
  - http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
  - defaults
auto_activate_base: false

Shortcut

git clone https://github.com/triton-lang/triton.git

Cloning LLVM is a pain. If you do not need to modify the source, just install the nightly build directly:

pip install git+https://github.com/LLNL/hatchet

pip install -U --index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/Triton-Nightly/pypi/simple/ triton-nightly --use-deprecated legacy-resolver

Try it out:

Python 3.10.14 (main, May  6 2024, 19:42:50) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import triton
>>> triton.__version__
'3.0.0'


cd triton/python/tutorials/
python 03-matrix-multiplication.py

Building LLVM

git clone https://github.com/triton-lang/triton.git
git clone https://github.com/llvm/llvm-project.git

If the clone fails with the error below, run git config --global http.postBuffer 1024288000 inside the repo.

remote: Compressing objects: 100% (1151/1151), done.
error: RPC failed; result=18, HTTP code = 200| 592.00 KiB/s
fatal: The remote end hung up unexpectedly
fatal: early EOF
fatal: index-pack failed

  • Check out the matching LLVM commit

Run git checkout xxx, where xxx is the LLVM commit hash that Triton is pinned to. You can find it with cat triton/cmake/llvm-hash.txt
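To avoid copying the hash by hand, the lookup and the checkout can be combined in a few lines of Python. This is a minimal sketch: read_llvm_hash and checkout_command are hypothetical helper names, and the two directory arguments stand for wherever you cloned triton and llvm-project.

```python
from pathlib import Path


def read_llvm_hash(triton_root: str) -> str:
    """Return the LLVM commit hash stored in <triton_root>/cmake/llvm-hash.txt."""
    hash_file = Path(triton_root) / "cmake" / "llvm-hash.txt"
    commit = hash_file.read_text().strip()
    if not commit:
        raise ValueError(f"{hash_file} is empty")
    return commit


def checkout_command(triton_root: str, llvm_root: str) -> list[str]:
    """Build the git command that checks out the pinned commit inside llvm_root."""
    # "git -C <dir>" runs the command as if started in <dir>.
    return ["git", "-C", llvm_root, "checkout", read_llvm_hash(triton_root)]
```

The returned list can be passed to subprocess.run(..., check=True), or printed and pasted into the shell.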

  • build

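CMake 3.20 or later is required, and remember to install ninja. If you do not have root access, download prebuilt binaries, extract them, and add them to PATH.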
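A quick pre-flight check can confirm both tools before starting a long build. The sketch below parses the first line that cmake --version prints and compares it with the 3.20 minimum; the function names are made up for illustration.

```python
import re
import shutil
import subprocess

MIN_CMAKE = (3, 20)  # minimum version stated in this post


def parse_cmake_version(version_output: str) -> tuple[int, int]:
    """Extract (major, minor) from text such as 'cmake version 3.22.1'."""
    match = re.search(r"cmake version (\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("no version number found in the cmake output")
    return int(match.group(1)), int(match.group(2))


def cmake_is_new_enough(version_output: str) -> bool:
    """True if the reported version is at least MIN_CMAKE."""
    return parse_cmake_version(version_output) >= MIN_CMAKE


if __name__ == "__main__":
    for tool in ("cmake", "ninja"):
        print(f"{tool}: {shutil.which(tool) or 'not on PATH'}")
    if shutil.which("cmake"):
        out = subprocess.run(
            ["cmake", "--version"], capture_output=True, text=True
        ).stdout
        print("cmake >= 3.20:", cmake_is_new_enough(out))
```

Comparing tuples rather than strings matters here: as text, "3.9" would sort after "3.20".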

cd xxxpath/llvm-project
mkdir build && cd build

cmake -G Ninja ../llvm \
     -DLLVM_ENABLE_PROJECTS="mlir;llvm" \
     -DLLVM_BUILD_EXAMPLES=ON \
     -DLLVM_TARGETS_TO_BUILD="X86;NVPTX;AMDGPU" \
     -DMLIR_ENABLE_CUDA_RUNNER=ON \
     -DCMAKE_BUILD_TYPE=Release \
     -DLLVM_ENABLE_ASSERTIONS=ON \
     -DLLVM_ENABLE_RTTI=ON \
     -DLLVM_INSTALL_UTILS=ON \
     -DMLIR_INCLUDE_INTEGRATION_TESTS=ON \
     -DCMAKE_INSTALL_PREFIX="xxxpath/tools_build/llvm"

ninja -j32
ninja install

cmake --build . --target check-mlir

The LLVM targets must be exactly "X86;NVPTX;AMDGPU". If RISCV is also included, the libtriton.so built afterwards is broken and fails at import time with:

ImportError: /lustre/S/ruantingfeng/triton/triton_repo/python/triton/_C/libtriton.so: undefined symbol: LLVMInitializeRISCVAsmParser
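Since this mistake only shows up at import time, long after LLVM has finished building, it can be worth checking the target string before invoking cmake. The following sketch does that; check_llvm_targets is a hypothetical helper, and the required set is simply the three targets named in this post.

```python
REQUIRED_TARGETS = {"X86", "NVPTX", "AMDGPU"}  # the set this post says works


def check_llvm_targets(targets: str) -> list[str]:
    """Return a list of problems with an LLVM_TARGETS_TO_BUILD value (empty if fine)."""
    given = {name.strip() for name in targets.split(";") if name.strip()}
    problems = []
    missing = REQUIRED_TARGETS - given
    extra = given - REQUIRED_TARGETS
    if missing:
        problems.append("missing targets: " + ", ".join(sorted(missing)))
    if extra:
        problems.append("unexpected targets: " + ", ".join(sorted(extra)))
    return problems


if __name__ == "__main__":
    print(check_llvm_targets("X86;NVPTX;AMDGPU;RISCV"))
```

An empty list means the value matches the three targets exactly; anything else is reported by name.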
  • Add environment variables to .bashrc

export PATH=xxxpath/tools_build/llvm/bin:$PATH
export LLVM_BUILD_DIR=xxxpath/tools_build/llvm
export LLVM_INCLUDE_DIRS=$LLVM_BUILD_DIR/include
export LLVM_LIBRARY_DIR=$LLVM_BUILD_DIR/lib
export LLVM_SYSPATH=$LLVM_BUILD_DIR
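A wrong install prefix here tends to surface only later, as a confusing failure during the Triton build. As a rough sanity check, the sketch below rebuilds the same four variables from a prefix and reports which of them point at directories that do not exist. The helper names llvm_env and missing_dirs are made up for illustration.

```python
from pathlib import Path


def llvm_env(install_prefix: str) -> dict[str, str]:
    """Derive the LLVM_* variables used in this post from the install prefix."""
    prefix = Path(install_prefix)
    return {
        "LLVM_BUILD_DIR": str(prefix),
        "LLVM_INCLUDE_DIRS": str(prefix / "include"),
        "LLVM_LIBRARY_DIR": str(prefix / "lib"),
        "LLVM_SYSPATH": str(prefix),
    }


def missing_dirs(env: dict[str, str]) -> list[str]:
    """Return the names of variables whose directory does not exist."""
    return [name for name, path in env.items() if not Path(path).is_dir()]
```

Call llvm_env with the same value you passed as CMAKE_INSTALL_PREFIX; an empty result from missing_dirs means all four directories are in place.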

Building Triton

cd xxxpath/triton
conda activate triton_env
LLVM_INCLUDE_DIRS=$LLVM_BUILD_DIR/include \
  LLVM_LIBRARY_DIR=$LLVM_BUILD_DIR/lib \
  LLVM_SYSPATH=$LLVM_BUILD_DIR \
  pip install -e python

The build output ends up in xxxpath/triton/python/build

and libtriton.so has already been placed in _C:

$ ll python/triton/_C/
include/      libtriton.so

Add two more environment variables:

export TRITON_HOME=/lustre/S/ruantingfeng/triton/triton_repo
export PYTHONPATH=$TRITON_HOME/python:${PYTHONPATH}

Test it. If nothing breaks, you can run the examples in python/tutorials (I ran 03-matrix-multiplication.py; for now it does not seem to beat cuBLAS).

$ python
Python 3.10.14 (main, Mar 21 2024, 16:24:04) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import triton
>>> triton.__version__
'3.0.0'
>>>
This post is licensed under CC BY-NC-SA 4.0 by the author.
