Collecting 1.8M news documents from Common Crawl
Two months ago, I trained a tiny language model with 10M parameters on Turkish news datasets [1]. However, 10M parameters turned out not to be enough to genera...
The MinHash algorithm can be used to detect near-duplicate documents. It works by calculating multiple hashes for different sections of a document...
The MinHash algorithm is used to identify near-duplicate documents in a training corpus.
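As a rough illustration of the idea, here is a minimal MinHash sketch in Python. It is not the implementation from the post: the function names (`shingles`, `minhash_signature`, `estimated_jaccard`), the 3-word shingle size, the 128 hash functions, and the use of seeded BLAKE2b as the hash family are all assumptions chosen for the example.

```python
import hashlib
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Short documents: fall back to a single shingle made of all words.
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def _seeded_hash(seed, shingle):
    """64-bit hash of a shingle; each seed acts as a different hash function."""
    data = f"{seed}:{shingle}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(text, num_hashes=128, k=3):
    """For each hash function, keep the minimum hash over all shingles."""
    doc_shingles = shingles(text, k)
    if not doc_shingles:
        raise ValueError("cannot build a signature for an empty document")
    return [
        min(_seeded_hash(seed, s) for s in doc_shingles)
        for seed in range(num_hashes)
    ]


def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching signature slots, an estimate of Jaccard similarity."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)
```

Two documents whose estimated similarity exceeds some chosen threshold can then be treated as near-duplicates. The threshold itself is a tuning decision and is not specified here.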
HaberGPT is a simple framework heavily influenced by the work of Andrej Karpathy, namely minGPT and nanoGPT.
I was inspired by Andrej Karpathy's makemore lecture. This post aims to showcase similarities between count-based and gradient-based methods to genera...