This repository explores different machine learning tools, techniques, and research. It will consist of Jupyter notebooks used to try out new developments, with a focus on multimodal models. A list of ideas follows:
- Research causality.
- Study and implement CNNs, RNNs, and Transformers.
- Explore multimodal learning with image and text data.
- Pretrain models using self-supervised learning techniques.
- Pretrain and fine-tune LLMs (e.g., GPT, BERT).
- Study optimizers, scaling laws, and distributed training.
- Work with vision models like ViT and ResNet.
- Fine-tune LLMs for NLP tasks (e.g., summarization).
- Train speech models like Wav2Vec and Whisper.
- Fine-tune multimodal architectures like CLIP and Flamingo (see the CLIP sketch after this list).
- Study reinforcement learning and RLHF.
- Work with GANs, VAEs, and diffusion models.
- Implement RAG pipelines with LLMs and knowledge graphs.
- Experiment with cross-modal alignment techniques.
- Fine-tune pre-existing multimodal models.
- Adapt large foundation models with parameter-efficient fine-tuning such as LoRA (see the sketch after this list).
- Build multimodal applications (e.g., video QA, chatbot).
- Study scaling techniques like model parallelism (e.g., Megatron-LM).
- Create and curate custom multimodal datasets.
- Deploy multimodal models using cloud services and APIs.
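
As a starting point, below is a minimal sketch of the kind of notebook experiment this repository is aimed at: zero-shot image-text matching with CLIP. It assumes the Hugging Face `transformers` library and the `openai/clip-vit-base-patch32` checkpoint; the image URL and candidate captions are illustrative placeholders.

```python
import requests
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Load a pretrained CLIP checkpoint and its paired processor.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Illustrative inputs: any image and a few candidate captions work here.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
captions = ["a photo of two cats", "a photo of a dog", "a diagram of a transformer"]

inputs = processor(text=captions, images=image, return_tensors="pt", padding=True)

with torch.no_grad():
    outputs = model(**inputs)

# Image-text similarity logits, converted to probabilities over the captions.
probs = outputs.logits_per_image.softmax(dim=-1)
print(dict(zip(captions, probs[0].tolist())))
```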
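
For the parameter-efficient fine-tuning idea, the sketch below wraps a pretrained causal language model with LoRA adapters. It assumes the Hugging Face `peft` library; the base model (`gpt2`) and the LoRA hyperparameters are placeholder choices, not recommendations.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Load a small base model; any causal LM checkpoint could be substituted.
model = AutoModelForCausalLM.from_pretrained("gpt2")

# LoRA configuration: only the low-rank adapter weights attached to the
# attention projections ("c_attn" in GPT-2) will be trained.
config = LoraConfig(
    r=8,                       # adapter rank (placeholder)
    lora_alpha=16,             # adapter scaling (placeholder)
    lora_dropout=0.05,
    target_modules=["c_attn"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, config)

# Reports how few parameters are trainable relative to the full model.
model.print_trainable_parameters()
```

The wrapped model can then be trained with a standard fine-tuning loop or the `transformers` `Trainer`, updating only the adapter weights.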