
Link Collection 2

Some links and papers that I have found interesting this week. If you have any comments, please let me know.

1: Finetune Mistral-7B using QLoRA. This notebook, authored by Brevdev, shows how to finetune the Mistral-7B model using QLoRA. The notebook is well documented and easy to follow. I have not tried it yet, but I plan to soon.
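
The core recipe, as I understand it, is to load the base model in 4-bit with bitsandbytes and attach LoRA adapters with PEFT so only a small set of weights is trained. A minimal sketch of that setup (hyperparameters and target modules are illustrative, not taken from the notebook):

```python
# Minimal QLoRA setup sketch: 4-bit base model + LoRA adapters via PEFT.
# Hyperparameters and target modules are illustrative, not from the Brevdev notebook.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_id = "mistralai/Mistral-7B-v0.1"

# Load the base model quantized to 4-bit (NF4) so it fits on a single GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)

# Attach small trainable LoRA adapters; only these weights are updated during finetuning.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
# From here, training proceeds with a standard Trainer / SFT loop on your dataset.
```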

2: A tour of parallelism in JAX. I have been trying to learn JAX for some time now, and I found this notebook very useful; it is a good introduction to parallelism in JAX.
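
As a taste of the topic, here is a toy SPMD data-parallel example with jax.pmap, where the same function runs on every device over a different shard of the batch and gradients are averaged with a collective (my own example, not from the notebook):

```python
# Tiny SPMD data-parallel example with jax.pmap: each device gets one shard of the
# batch, computes gradients, and a pmean collective averages them across devices.
import functools
import jax
import jax.numpy as jnp

n_dev = jax.local_device_count()

def loss_fn(w, x, y):
    pred = x @ w
    return jnp.mean((pred - y) ** 2)

@functools.partial(jax.pmap, axis_name="devices")
def parallel_grads(w, x, y):
    grads = jax.grad(loss_fn)(w, x, y)
    return jax.lax.pmean(grads, axis_name="devices")  # average across devices

# Fake data sharded along the leading axis: one slice per device.
key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (n_dev, 32, 8))
y = jax.random.normal(key, (n_dev, 32))
w = jnp.broadcast_to(jnp.zeros(8), (n_dev, 8))  # replicate the weights on every device

grads = parallel_grads(w, x, y)  # shape (n_dev, 8), identical on every device
```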

3: Use GPT-4V for data labeling. This repository, authored by Roboflow, shows how to use GPT-4V for data labeling. The repository is well documented and easy to follow. It is still at an early stage, but I think it is a very interesting idea. See the image below for an example of the results.
[Image: Multimodal Maestro labeling example]
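
The repository wraps the call behind its own prompting utilities; for a rough idea of what a labeling request looks like, here is a sketch using the OpenAI API directly rather than the repo's code (the model name, prompt, and label set are my own assumptions):

```python
# Rough sketch of labeling an image with GPT-4V via the OpenAI API directly.
# This is NOT the Multimodal Maestro API; model name, prompt, and labels are assumptions.
import base64
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def label_image(path: str, labels: list[str]) -> str:
    with open(path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    response = client.chat.completions.create(
        model="gpt-4-vision-preview",
        max_tokens=50,
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": f"Classify this image. Answer with exactly one of: {', '.join(labels)}."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content.strip()

# Example: label_image("dog.jpg", ["cat", "dog", "other"])
```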

4: GPT-fast. This blogpost by the PyTorch team shows how to accelerate LLM inference in pure PyTorch using torch.compile, GPU quantization, speculative decoding, and tensor parallelism. The results are very impressive: almost 10x faster than the baseline. See the image below for the results.
[Image: LLaMA fast inference results]
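
Of the four techniques, torch.compile is the easiest to try on an existing model; a minimal sketch of just that piece (not the gpt-fast code itself, and without the other optimizations that account for most of the speedup):

```python
# Sketch of the torch.compile piece only; gpt-fast layers static KV caches, weight-only
# quantization, speculative decoding, and tensor parallelism on top of this.
# The model choice here is illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"  # any causal LM works for the sketch
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="cuda"
)

# "reduce-overhead" turns on CUDA graphs, which removes most of the per-token
# Python and kernel-launch overhead in the decode loop.
model.forward = torch.compile(model.forward, mode="reduce-overhead")

inputs = tokenizer("The capital of France is", return_tensors="pt").to("cuda")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```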

5: Extending the context of LLMs. This Reddit post on the r/LocalLLaMA subreddit shows how to extend the context of LLMs using the self-extend method. The results are very interesting. See the image below for the results on the Phi-2 model.
[Image: Phi-2 self-extend results]
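
As I understand the method (no finetuning involved), tokens inside a small neighbor window keep their exact relative positions, while more distant tokens are mapped onto coarser grouped positions via floor division, so everything stays within the position range the model saw during pretraining. A toy sketch of that mapping, with my own simplification of the formula:

```python
# Toy sketch of the self-extend position mapping (my simplification, not the authors' code):
# nearby tokens keep exact relative positions, distant tokens share coarser grouped
# positions so they stay within the range seen during pretraining.
import numpy as np

def self_extend_positions(seq_len: int, neighbor_window: int = 512, group_size: int = 4):
    q = np.arange(seq_len)[:, None]   # query positions
    k = np.arange(seq_len)[None, :]   # key positions
    rel = q - k                       # normal relative positions

    # Grouped relative positions for distant tokens, shifted so they continue
    # smoothly from the edge of the neighbor window.
    grouped = q // group_size - k // group_size
    grouped = grouped + (neighbor_window - neighbor_window // group_size)

    return np.where(rel < neighbor_window, rel, grouped)

pos = self_extend_positions(seq_len=2048, neighbor_window=512, group_size=4)
print(pos.max())  # far smaller than 2047, so positions stay closer to the trained range
```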

6: LLM+CLIP for image captioning. This notebook, authored by Katherine Crowson, shows how to use an LLM together with CLIP for image captioning. The idea of using gradient descent and PEFT to find the caption that best matches the CLIP image embedding is very interesting, and the results are surprisingly good, but running SGD for every image is not very efficient.
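
I won't try to reproduce the gradient-descent/PEFT search here, since that is the part specific to the notebook; the sketch below only shows the objective it is optimizing, i.e. scoring candidate captions against a CLIP image embedding (Hugging Face's CLIP as an assumed stand-in):

```python
# Sketch of the CLIP matching objective only: score candidate captions against an image
# embedding. The notebook's gradient-descent/PEFT search over captions is not shown, and
# the model choice is my assumption, not necessarily what the notebook uses.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")

image = Image.open("photo.jpg")
captions = ["a dog playing in the snow", "a bowl of ramen", "a city skyline at night"]

with torch.no_grad():
    img_inputs = processor(images=image, return_tensors="pt")
    txt_inputs = processor(text=captions, return_tensors="pt", padding=True)
    img_emb = model.get_image_features(**img_inputs)
    txt_emb = model.get_text_features(**txt_inputs)

# Cosine similarity between the image embedding and each candidate caption.
img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)
txt_emb = txt_emb / txt_emb.norm(dim=-1, keepdim=True)
scores = (img_emb @ txt_emb.T).squeeze(0)
print(sorted(zip(scores.tolist(), captions), reverse=True))
```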

