
Commit

Added CatLIP paper links
Done on 2024-04-25 based on 8db2c173f3d8af69eed7c85b94ed2655f8931fa1
sachinmehta committed Apr 25, 2024
1 parent 343b6bd commit 0333b1f
Showing 2 changed files with 12 additions and 5 deletions.
5 changes: 2 additions & 3 deletions README.md
@@ -23,11 +23,10 @@ CoreNet is a deep neural network toolkit that allows researchers and engineers t

## Research efforts at Apple using CoreNet

-Below is the list of publications from Apple that uses CoreNet:
+Below is the list of publications from Apple that use CoreNet. Training and evaluation recipes, as well as links to pre-trained models, can be found in the [projects](./projects/) folder; please refer to it for further details.

* [OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework](https://arxiv.org/abs/2404.14619)
-<!-- TODO: add url for CatLIP -->
-* [CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data]()
+* [CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data](https://arxiv.org/abs/2404.15653)
* [Reinforce Data, Multiply Impact: Improved Model Accuracy and Robustness with Dataset Reinforcement](https://arxiv.org/abs/2303.08983)
* [CLIP meets Model Zoo Experts: Pseudo-Supervision for Visual Enhancement](https://arxiv.org/abs/2310.14108)
* [FastVit: A Fast Hybrid Vision Transformer using Structural Reparameterization](https://arxiv.org/abs/2303.14189)
12 changes: 10 additions & 2 deletions projects/catlip/README.md
@@ -1,6 +1,7 @@
# CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data
[![arXiv](https://img.shields.io/badge/arXiv-2404.15653-a6dba0.svg)](https://arxiv.org/abs/2404.15653)

-[CatLIP]() introduces a novel weakly supervised pre-training approach for vision models on web-scale noisy image-text data, *reframing pre-training as a classification task to circumvent computational challenges associated with pairwise similarity computations in contrastive learning*, resulting in a significant 2.7x acceleration in training speed while maintaining high representation quality across various vision tasks.
+`CatLIP` introduces a novel weakly supervised pre-training approach for vision models on web-scale noisy image-text data, *reframing pre-training as a classification task to circumvent computational challenges associated with pairwise similarity computations in contrastive learning*, resulting in a significant 2.7x acceleration in training speed while maintaining high representation quality across various vision tasks.
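The emphasized idea above, classification instead of pairwise contrastive similarity, can be sketched as follows. This is a minimal illustration under assumed names (`VOCAB`, `caption_to_targets`, `bce_loss`), not CatLIP's actual implementation: caption nouns are mapped onto a fixed synset vocabulary to form multi-hot targets, and the image encoder is trained with an independent binary cross-entropy per label.

```python
import math

# Hypothetical noun-to-synset vocabulary (assumption: the paper builds this
# from WordNet synsets extracted from web-scale captions).
VOCAB = {"dog": 0, "ball": 1, "park": 2, "cat": 3}

def caption_to_targets(nouns, vocab=VOCAB):
    """Multi-hot target vector: one independent label per vocabulary entry."""
    targets = [0.0] * len(vocab)
    for noun in nouns:
        if noun in vocab:
            targets[vocab[noun]] = 1.0
    return targets

def bce_loss(logits, targets, eps=1e-7):
    """Mean binary cross-entropy over labels. Cost is O(batch * vocab):
    no batch-wide image-text similarity matrix as in contrastive training."""
    total = 0.0
    for z, t in zip(logits, targets):
        p = 1.0 / (1.0 + math.exp(-z))  # sigmoid
        total += -(t * math.log(p + eps) + (1.0 - t) * math.log(1.0 - p + eps))
    return total / len(logits)

# Toy step: logits from a (hypothetical) image encoder for one image
# captioned "a dog chasing a ball".
targets = caption_to_targets(["dog", "ball"])
loss = bce_loss([4.0, 3.0, -4.0, -4.0], targets)
```

A CLIP-style contrastive loss would instead require an N x N similarity matrix across the batch; treating labels independently removes that coupling, which is the source of the speedup the paper reports.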

We provide training and evaluation code along with pretrained models and configuration files for the following tasks:

@@ -15,7 +16,14 @@ We provide training and evaluation code along with pretrained models and configu
If you find our work useful, please cite:

```BibTex
-# TODO(sachin): Add CatLIP citation
+@article{mehta2024catlip,
+  title={CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data},
+  author={Sachin Mehta and Maxwell Horton and Fartash Faghri and Mohammad Hossein Sekhavat and Mahyar Najibi and Mehrdad Farajtabar and Oncel Tuzel and Mohammad Rastegari},
+  year={2024},
+  eprint={2404.15653},
+  archivePrefix={arXiv},
+  primaryClass={cs.CV}
+}
@inproceedings{mehta2022cvnets,
author = {Mehta, Sachin and Abdolhosseini, Farzad and Rastegari, Mohammad},
