llama2 keras3

This repository is a multi-backend (PyTorch, TensorFlow, JAX) implementation of LLaMA using Keras 3, based on LLaMA-Lite.

- Implements the KVCache in simple code to speed up GPT-style decoding.
- Easy to convert to TFLite.

Inference

Get the TinyLlama model weights from Hugging Face (HF). You can also try the Llama 2 weights from Meta on HF. Then run:

python performence.py
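
For readers unfamiliar with the KVCache feature listed above, here is a minimal sketch of the idea in plain NumPy. It is not this repository's code: the names `KVCache`, `attend_step`, and `attend_full` are hypothetical, and the example assumes a single attention head with no batching or rotary embeddings. The point is that each decoding step only computes the key and value for the newest token and reuses the stored ones, yet produces the same output as recomputing causal attention over the full sequence.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax; masked (-inf) entries become exactly 0.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class KVCache:
    """Preallocated key/value buffers for one attention head (hypothetical helper)."""

    def __init__(self, max_len, head_dim):
        self.k = np.zeros((max_len, head_dim))
        self.v = np.zeros((max_len, head_dim))
        self.length = 0  # number of positions filled so far

    def append(self, k_new, v_new):
        # Store the newest token's key/value and return views of all cached entries.
        if self.length >= self.k.shape[0]:
            raise ValueError("KV cache is full")
        self.k[self.length] = k_new
        self.v[self.length] = v_new
        self.length += 1
        return self.k[: self.length], self.v[: self.length]


def attend_step(q, k_new, v_new, cache):
    """Attention output for one new token, reusing cached keys and values."""
    k, v = cache.append(k_new, v_new)
    scores = k @ q / np.sqrt(q.shape[-1])  # shape: (length,)
    return softmax(scores) @ v  # shape: (head_dim,)


def attend_full(q, k, v):
    """Reference: causal attention recomputed over the whole sequence."""
    t, d = q.shape
    scores = q @ k.T / np.sqrt(d)
    future = np.triu(np.ones((t, t), dtype=bool), k=1)  # positions after the query
    scores = np.where(future, -np.inf, scores)
    return softmax(scores) @ v


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    seq_len, head_dim = 6, 8
    q, k, v = (rng.standard_normal((seq_len, head_dim)) for _ in range(3))

    cache = KVCache(max_len=seq_len, head_dim=head_dim)
    stepwise = np.stack(
        [attend_step(q[t], k[t], v[t], cache) for t in range(seq_len)]
    )
    print(np.allclose(stepwise, attend_full(q, k, v)))
```

Without the cache, generating token t means recomputing keys and values for all t previous tokens; with it, each step does work proportional to the current length only. A fixed-size preallocated buffer is used here because static shapes tend to be friendlier to compiled backends and to TFLite conversion, though that design choice is an assumption of this sketch rather than something stated by the repository.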