Stable and Efficient Gaussian-Based Kolmogorov–Arnold Networks
De Luca, Pasquale; Di Nardo, Emanuel; Marcellino, Livia; Ciaramella, Angelo
2026-01-01
Abstract
Kolmogorov–Arnold Networks employ learnable univariate activation functions on edges rather than fixed node nonlinearities. Standard B-spline implementations require O(KW) parameters per layer (K basis functions, W connections). We introduce shared Gaussian radial basis functions with learnable centers (Formula presented.) and widths (Formula presented.) maintained globally per layer, reducing parameter complexity to (Formula presented.) for L layers (a threefold reduction) while preserving Sobolev convergence rates (Formula presented.). Width clamping at (Formula presented.) and tripartite regularization ensure numerical stability. On MNIST with architecture (Formula presented.) and (Formula presented.), RBF-KAN achieves (Formula presented.) test accuracy versus (Formula presented.) for B-spline KAN, with a (Formula presented.) speedup and a 33% memory reduction, though the generalization gap increases from (Formula presented.) to (Formula presented.) due to the global support of the Gaussian bases. In physics-informed neural networks, the architecture demonstrates substantial improvements on partial differential equations: elliptic problems exhibit a (Formula presented.) reduction in PDE residual and maximum pointwise error, the latter decreasing from (Formula presented.) to (Formula presented.); parabolic problems achieve a (Formula presented.) accuracy gain; hyperbolic wave equations show a (Formula presented.) improvement in maximum error and a (Formula presented.) reduction in (Formula presented.) norm. The superior hyperbolic performance derives from the infinite differentiability of the Gaussian bases, which enables accurate high-order derivatives without polynomial dissipation. Ablation studies confirm that coefficient regularization reduces mean error by 40% and that center diversity prevents basis collapse. An optimal basis count of (Formula presented.) balances expressiveness and overfitting. The architecture establishes Gaussian RBFs as an efficient alternative to B-splines for learnable-activation networks, with particular advantages in scientific computing.
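To make the architecture described in the abstract concrete, the following is a minimal sketch (not the authors' code) of a KAN-style layer with Gaussian RBF bases shared globally across the layer, written in PyTorch. All names (RBFKANLayer, tripartite_regularizer), default hyperparameters, and the exact three regularization terms are assumptions: the record names width clamping and a "tripartite regularization" with coefficient and center-diversity components but does not spell out their form.

    import torch
    import torch.nn as nn

    class RBFKANLayer(nn.Module):
        """KAN-style layer: each scalar input is expanded in K Gaussian RBFs
        phi_k(x) = exp(-((x - c_k) / sigma_k)^2) whose centers c_k and widths
        sigma_k are shared across the whole layer, so only the K-way mixing
        coefficients are learned per connection. Illustrative sketch only."""

        def __init__(self, in_features: int, out_features: int,
                     num_bases: int = 8, sigma_min: float = 1e-2):
            super().__init__()
            self.sigma_min = sigma_min  # clamping floor for widths (value assumed)
            # K shared centers and widths per layer, not per edge.
            self.centers = nn.Parameter(torch.linspace(-1.0, 1.0, num_bases))
            self.log_sigma = nn.Parameter(torch.zeros(num_bases))
            # Mixing coefficients: (in_features * K) -> out_features.
            self.coeff = nn.Linear(in_features * num_bases, out_features, bias=False)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (batch, in_features)
            sigma = self.log_sigma.exp().clamp_min(self.sigma_min)  # width clamping
            z = (x.unsqueeze(-1) - self.centers) / sigma            # (batch, in, K)
            phi = torch.exp(-z.pow(2))                              # Gaussian bases
            return self.coeff(phi.flatten(start_dim=1))             # (batch, out)

    def tripartite_regularizer(layer: RBFKANLayer,
                               lam_coeff: float = 1e-4,
                               lam_div: float = 1e-4,
                               lam_width: float = 1e-4) -> torch.Tensor:
        """Hypothetical form of the 'tripartite regularization': (1) an L2
        penalty on mixing coefficients, (2) a center-diversity term that
        penalizes clustered centers (guarding against basis collapse), and
        (3) a penalty keeping log-widths near zero. The record does not give
        the actual three terms; these are assumptions."""
        coeff_pen = layer.coeff.weight.pow(2).mean()
        k = layer.centers.numel()
        d2 = (layer.centers.unsqueeze(0) - layer.centers.unsqueeze(1)).pow(2)
        off_diag = ~torch.eye(k, dtype=torch.bool, device=d2.device)
        div_pen = torch.exp(-d2[off_diag]).mean()   # large when centers cluster
        width_pen = layer.log_sigma.pow(2).mean()
        return lam_coeff * coeff_pen + lam_div * div_pen + lam_width * width_pen

In training, the regularizer would simply be added to the task loss (cross-entropy for MNIST, or the PDE residual in the physics-informed setting); the lambda weights above are placeholders, not values from the paper.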


