Huseyin ABANOZ
Passionate Software Engineer
- Turkiye
You may also enjoy
Haber GPT-2
4 minute read
HaberGPT2 is a decoder-only language model with 100M parameters. It is trained from scratch on Turkish news. This post shares details about training proce...
Transformer Pre-training Notes
38 minute read
In this post, I have compiled architectural and training details of prominent language models. All information is based on the published papers.
Complex Numbers
less than 1 minute read
Standard Form \(z = x + iy\)
Demystifying Neural Network Activations in PyTorch
8 minute read
Backpropagation is a crucial part of the deep learning training process. It allows us to compute gradients and update our model parameters. This post is not ...