A GPU-CUDA Numerical Algorithm for Solving a Biological Model
De Luca, Pasquale; Fiorillo, Giuseppe; Marcellino, Livia
2025-01-01
Abstract
Tumor angiogenesis models based on coupled nonlinear parabolic partial differential equations require solving stiff systems, for which explicit time-stepping methods impose severe stability constraints on the time step size. Implicit–Explicit (IMEX) schemes relax this constraint by treating diffusion terms implicitly and reaction–chemotaxis terms explicitly, reducing each time step to the solution of a single linear system. However, standard Gaussian elimination with partial pivoting has cubic complexity in the number of spatial grid points and dominates the computational cost for realistic discretizations in the range of 400–800 grid points. This work presents a CUDA-based parallel algorithm that accelerates the IMEX scheme through a GPU implementation of three core computational kernels: pivot finding via atomic operations on double-precision floating-point values, row swapping with coalesced memory access patterns, and elimination updates using optimized two-dimensional thread grids. Performance measurements on an NVIDIA H100 GPU show speedup factors from 3.5× to 113× relative to sequential CPU execution across spatial discretizations spanning (Formula presented.) grid points, approaching 94.2% of the theoretical maximum speedup predicted by Amdahl’s law. Numerical validation confirms that GPU and CPU solutions agree to within twelve digits of precision over extended time integration, with conservation properties preserved to machine precision. Performance analysis reveals that the elimination kernel accounts for nearly 90% of total execution time, justifying the focus on GPU parallelization of this component. The method enables parameter studies requiring (Formula presented.) PDE solves, which were previously computationally prohibitive, and thereby facilitates model-driven investigation of anti-angiogenic therapy design.
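The three-stage structure described in the abstract (pivot finding, row swapping, elimination) can be illustrated with a minimal sequential sketch. This is not the authors' CUDA implementation: it is plain Python, all function names are hypothetical, and the model problem is an assumed one-dimensional reaction–diffusion equation with zero-flux boundaries and no chemotaxis term. Each stage is a separate function only to mirror where a GPU kernel would take over the corresponding loop.

```python
# Sequential reference sketch (assumption: 1D model problem, names are illustrative).

def find_pivot(A, k):
    """Stage 1: index of the row with the largest |A[i][k]| for i >= k."""
    return max(range(k, len(A)), key=lambda i: abs(A[i][k]))

def swap_rows(A, b, k, p):
    """Stage 2: exchange rows k and p of the system (no-op if equal)."""
    if p != k:
        A[k], A[p] = A[p], A[k]
        b[k], b[p] = b[p], b[k]

def eliminate(A, b, k):
    """Stage 3: zero out column k below the pivot row.
    The (i, j) double loop is the part a 2D thread grid would cover."""
    n = len(A)
    for i in range(k + 1, n):
        factor = A[i][k] / A[k][k]
        for j in range(k, n):
            A[i][j] -= factor * A[k][j]
        b[i] -= factor * b[k]

def solve(A, b):
    """Gaussian elimination with partial pivoting, then back substitution."""
    A = [row[:] for row in A]  # work on copies
    b = b[:]
    n = len(A)
    for k in range(n - 1):
        swap_rows(A, b, k, find_pivot(A, k))
        eliminate(A, b, k)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]
    return x

def imex_step(u, dt, dx, D, reaction):
    """One IMEX Euler step for u_t = D u_xx + f(u):
    diffusion implicit, reaction explicit, i.e.
    (I - dt * D * L) u_new = u + dt * f(u),
    with L a conservative zero-flux Laplacian (an assumed boundary treatment)."""
    n = len(u)
    r = D * dt / dx**2
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        neighbours = (i > 0) + (i < n - 1)  # 1 at the boundaries, 2 inside
        A[i][i] = 1.0 + neighbours * r
        if i > 0:
            A[i][i - 1] = -r
        if i < n - 1:
            A[i][i + 1] = -r
    rhs = [u[i] + dt * reaction(u[i]) for i in range(n)]
    return solve(A, rhs)
```

With the reaction term set to zero, every column of the matrix sums to one, so the discrete total of `u` is unchanged by a step up to rounding error. That makes it a convenient check when comparing a parallel implementation against a sequential reference.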


