Recent posts

LLM Scaling Laws

4 minute read

Training language models is expensive, so it is important to plan carefully before training begins. This post will briefly touch on studies on scaling la...

LSH with Jaccard Index

3 minute read

The MinHash algorithm can be used to detect near-duplicate documents. It works by calculating multiple hashes for different sections of a document...