BGK1D1V
BGK1D1V is the one-dimensional-in-space, one-dimensional-in-velocity BGK model. In the implementation, the local relaxation time is reconstructed as
\[\tau(x,t) = \frac{\tilde{\tau}}{n(x,t)\sqrt{T(x,t)}}\]rather than being prescribed directly as a field.
State and quadrature
State1D1V stores:
f,f_tmp,f_m- uniform grids
x,v - macroscopic moments
n,u,T - cached masks and coefficients used by transport kernels
Moment evaluation is performed by direct discrete quadrature,
\[n = \int f\,dv, \qquad \nu = \int vf\,dv, \qquad u = \frac{\nu}{n}, \qquad T = \frac{1}{n}\int v^2 f\,dv - u^2.\]Boundary handling
All current BGK1D steppers keep the boundary distribution fixed at each time step.
- explicit copies the left and right boundary cells after the update;
- implicit caches
f_bc_landf_bc_rand keeps them fixed during Picard iteration; - holo overwrites the boundary cells after the HO solve.
This is therefore not a periodic implementation.
Backend support
| Scheme | torch | cuda_kernel | cpu_kernel |
|---|---|---|---|
explicit | supported | supported | not registered |
implicit | not registered | supported | supported |
holo | not registered | supported | not registered |
Explicit stepper
The explicit BGK1D stepper performs:
- moment evaluation,
- Maxwellian reconstruction,
- local relaxation-time construction,
- first-order upwind transport,
- collision update,
- boundary fixation and buffer swap.
The torch version is a transparent reference implementation; the cuda_kernel version delegates the heavy work to a fused extension.
Implicit stepper
The implicit stepper is implemented in both CUDA and CPU extension variants. The algorithmic structure is the same in both cases.
- copy the current distribution into the implicit workspace,
- compute initial moments
W=(n, \nu, T), - optionally inject a prescribed initial
Wor a CNN-based warm-start, - build the tridiagonal coefficients from the current moments,
- solve the batched tridiagonal systems along the spatial direction,
- recompute moments and test convergence,
- update
Wdirectly or through Anderson acceleration.
The convergence metric can be chosen as either distribution-based (conv_type="f") or moment-based (conv_type="w").
CNN warm-start
When moments_cnn_modelpath is set, the implicit stepper loads a checkpoint and predicts moment increments before Picard iteration begins. The current implementation supports:
- primitive or conservative input channels,
- optional temporal-history augmentation through
prev_delta, dwanddnuoutput conventions,- gradient-based attenuation through
warm_delta_weight_mode="w_grad".
HOLO stepper
The HOLO implementation uses an outer high-order iteration and an inner low-order moment solve.
- The HO layer computes face moments and a heat-flux-like term
Qfrom the current distribution. - The LO layer solves a 3x3 block-tridiagonal system for
W=(n, \nu, U). - The distribution update is then obtained from a tridiagonal solve using the LO moments.
The implementation uses theta=0.5, i.e. a Crank-Nicolson-like split.