Recent posts

Haber GPT-2

4 minute read

HaberGPT2 is a decoder-only language model with 100M parameters, trained from scratch on Turkish news. This post shares details about the training process...

Transformer Pre-training Notes

32 minute read

In this post, I have compiled the architectural and training details of prominent language models. All information is based on the published papers.

Complex Numbers

less than 1 minute read

Standard Form \(z = x + iy\)

LLM Scaling Laws

4 minute read

Training language models is an expensive business, and it is important to plan carefully before training begins. This post will briefly touch on studies of scaling laws...