CUDA Kernels
Target directory
src/kineticEQ/cuda_kernel/
Loaders in compile.py
load_explicit_fusedload_implicit_fusedload_gtsvload_lo_blocktridiagload_implicit_AA
BGK1D1V kernels
explicit_fused/
A fused explicit one-step update. The binding requires CUDA tensors with torch.float64.
implicit_fused/
Provides moments_n_nu_T and build_system_from_moments. The binding also requires CUDA tensors with torch.float64.
gtsv/
Wraps cuSPARSE batched tridiagonal solves.
lo_blocktridiag/
Solves the 3x3 block-tridiagonal systems used by the HOLO low-order layer.
implicit_AA/
Implements Anderson acceleration for implicit Picard iterations.