New Turkish Pre-training Datasets
Language model pre-training requires massive amounts of high-quality text. For Turkish, there are not many options available.
HaberGPT2 is a decoder-only language model with 100M parameters, trained from scratch on Turkish news. This post shares details about the training process.
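The excerpt does not list HaberGPT2's exact hyperparameters, but a decoder-only model at roughly that scale can be sketched with Hugging Face transformers. The vocabulary size, context length, and depth below are illustrative assumptions that land near 100M parameters, not the published configuration:

```python
# A minimal sketch of a ~100M-parameter decoder-only (GPT-2-style) model.
# All hyperparameter values are assumptions, not HaberGPT2's actual config.
from transformers import GPT2Config, GPT2LMHeadModel

config = GPT2Config(
    vocab_size=32000,   # assumed Turkish BPE vocabulary size
    n_positions=1024,   # context length
    n_embd=768,         # hidden size
    n_layer=12,         # transformer blocks
    n_head=12,          # attention heads
)
model = GPT2LMHeadModel(config)
print(f"{model.num_parameters() / 1e6:.0f}M parameters")
```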
In this post, I have compiled architectural and training details of prominent language models. All information is based on the published papers.
Standard form of a complex number: \(z = x + iy\), where \(x\) is the real part and \(y\) is the imaginary part.
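As a quick illustration of the standard form (the specific numbers here are chosen only for the example):

\[
z = 3 + 4i, \qquad \operatorname{Re}(z) = 3, \quad \operatorname{Im}(z) = 4, \quad |z| = \sqrt{3^2 + 4^2} = 5.
\]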
Backpropagation is a crucial part of the deep learning training process. It allows us to compute gradients and update our model parameters. This post is not ...
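To make the gradient computation concrete, here is a minimal sketch of backpropagation through a single linear layer with a squared loss. The shapes, seed, and learning rate are illustrative assumptions, not details from the post:

```python
# A minimal sketch of backpropagation on a one-layer model with a squared
# loss, using plain NumPy. All sizes and values are illustrative.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))        # 4 samples, 3 features
y = rng.normal(size=(4, 1))        # targets
W = rng.normal(size=(3, 1)) * 0.1  # weights
b = np.zeros((1,))                 # bias

# Forward pass: prediction and mean squared error loss.
y_hat = x @ W + b
loss = np.mean((y_hat - y) ** 2)

# Backward pass: the chain rule gives gradients of the loss w.r.t. W and b.
grad_y_hat = 2 * (y_hat - y) / y.shape[0]  # dL/dy_hat
grad_W = x.T @ grad_y_hat                  # dL/dW
grad_b = grad_y_hat.sum(axis=0)            # dL/db

# Gradient descent update with an assumed learning rate.
lr = 0.1
W -= lr * grad_W
b -= lr * grad_b
```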