Large Kolmogorov-Arnold Networks

Implementations of KAN variations.

Installation

Way 1 (not tested):

Install Python 3.10 and nvcc (the CUDA compiler), then:

pip install .

Way 2 (recommended):

Install conda https://conda.io/projects/conda/en/latest/user-guide/install/index.html

conda create -n lkan python==3.10
conda activate lkan
conda install cuda-nvcc
pip install .

To run MNIST, select a config in main.py and run main.py.

To view training charts, run tensorboard --logdir .experiments/

Info

Performance (RTX 2060 Mobile, MNIST):

  • MLP (31.8M parameters) - 51 it/s
  • KANLinear0 (32.3M parameters) - 4.3 it/s
  • KANLinear (31M parameters) - 36.5 it/s
  • KANLinearFFT (33M parameters) - 40 it/s
  • KANLinearFFT CUDA (uses 50% of KANLinearFFT's memory for forward and backward) - 23 it/s
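The figures above are training iterations per second. Below is a minimal, self-contained sketch of how such a throughput number can be measured; it uses a plain nn.Linear MLP as a stand-in (the KAN layers listed above come from this repo and could be swapped in for nn.Linear), and the model size, batch size, and step count are illustrative, not the settings behind the numbers above.

import time

import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# Stand-in MLP; replace nn.Linear with a KAN layer from this repo to compare.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(784, 2048),
    nn.ReLU(),
    nn.Linear(2048, 10),
).to(device)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# MNIST-shaped dummy batch; a real benchmark would iterate over the dataset.
x = torch.randn(64, 1, 28, 28, device=device)
y = torch.randint(0, 10, (64,), device=device)

steps = 200
if device == "cuda":
    torch.cuda.synchronize()
start = time.perf_counter()
for _ in range(steps):
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()
    optimizer.step()
if device == "cuda":
    torch.cuda.synchronize()
print(f"{steps / (time.perf_counter() - start):.1f} it/s")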

Docs

See examples/

continual_training_adam.ipynb, continual_training_lbfgs.ipynb - continual training

Problems

  • update_grid on CUDA raises an error (torch.linalg.lstsq on CUDA assumes full rank and supports only one driver) - temporarily solved by moving the lstsq computation to the CPU; see the sketch after this list
  • update_grid_from_samples in the original KAN runs the model multiple times - is that necessary?
  • parameter counting - should the grid count as parameters or not?
  • MLP training starts almost instantly, but KAN training is slow at the start
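
A minimal sketch of the temporary lstsq workaround mentioned in the first item: route the least-squares solve through the CPU, where torch.linalg.lstsq also handles rank-deficient inputs, then move the solution back to the original device. The function name and tensor shapes here are illustrative, not the repository's exact code.

import torch

def lstsq_cpu(A: torch.Tensor, B: torch.Tensor) -> torch.Tensor:
    # torch.linalg.lstsq on CUDA only offers the 'gels' driver, which assumes
    # A has full rank; solving on the CPU sidesteps that restriction.
    solution = torch.linalg.lstsq(A.cpu(), B.cpu()).solution
    return solution.to(A.device)

# Illustrative shapes: a batch of least-squares problems, e.g. fitting spline
# coefficients per input feature during a grid update.
A = torch.randn(16, 32, 12)
B = torch.randn(16, 32, 5)
coeffs = lstsq_cpu(A, B)  # shape (16, 12, 5)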

TODO/Ideas:

  • Base structure
  • KAN simple implementation
  • KAN trainer
  • train KAN on test dataset
  • remove unnecessary dependencies in requirements.txt
  • test update_grid and "Other possibilities are: (a) the grid is learnable with gradient descent" from paper.
  • Regularization
  • Compare with MLP
  • Grid extension
  • MNIST
  • CIFAR10
  • KAN ResNet?
  • KAN as CNN filter?
  • KAN in ViT?
  • Fourier KAN?
  • GraphKAN
  • Mixing KAN and normal layers
  • pruning
  • test continual learning
  • docs and examples - write notebooks like those in the KAN repo
  • KAN vs MLP in "LLM" - test?
  • CUDA kernel for b_splines?
  • unit tests?

Citations

@misc{liu2024kan,
      title={KAN: Kolmogorov-Arnold Networks}, 
      author={Ziming Liu and Yixuan Wang and Sachin Vaidya and Fabian Ruehle and James Halverson and Marin Soljačić and Thomas Y. Hou and Max Tegmark},
      year={2024},
      eprint={2404.19756},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Original KAN repo - base idea

efficient-kan - KANLinear and optimizations
