Recent posts

New Turkish Pre-training Datasets

5 minute read

Language model pre-training requires massive amounts of high-quality text. For Turkish, there are not many options available.

Haber GPT-2

4 minute read

HaberGPT2 is a decoder-only language model with 100M parameters, trained from scratch on Turkish news. This post shares details about the training proce...

Transformer Pre-training Notes

32 minute read

In this post, I have compiled architectural and training details of prominent language models. All information is based on the published papers.

Complex Numbers

less than 1 minute read

Standard Form \(z = x + iy\)
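The standard form above maps directly onto Python's built-in complex type; a minimal sketch (the values are illustrative, not from the post):

```python
# Standard form z = x + iy; Python writes the imaginary unit as j
x, y = 3.0, 4.0
z = complex(x, y)        # z = 3 + 4i
print(z.real, z.imag)    # real part x and imaginary part y: 3.0 4.0
print(abs(z))            # modulus |z| = sqrt(x^2 + y^2): 5.0
```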