Reinforcement Learning (Part 1): Bandits

Every day we interact with the world and make decisions based on experience: whether to eat out, whether to take the stairs or the elevator. Sometimes we stick with what we know, and sometimes we wing it with something new. These decisions are a form of reinforcement learning. This kind of learning is at the heart of most living things, from infants learning to walk by making mistakes to elephants in a zoo learning to stay away from the electric fence. ...

February 23, 2025 · 11 min · 2230 words · Mwaura Collins

Rolling Your Own GPT

Training a GPT model sounds like a moonshot, but it is actually just a series of simple, well-defined steps. At its core, a GPT is a giant text predictor: fed tons of data and trained to guess the next word. The real challenge is not teaching it to predict the next word, but getting it to produce something useful, like answers to the assignment due at midnight. 😅 That takes a special kind of training such as Reinforcement Learning from Human Feedback (RLHF). You can get decent responses without RLHF, but it is what makes the model behave like a chatbot. ...
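As a toy illustration of "just guessing the next word" (my own sketch, not code from the post): a bigram counter does the same job a GPT does, only with word counts instead of a neural network.

```python
from collections import Counter, defaultdict

# Toy "next-word predictor": count which word follows which.
# A GPT does this job at scale, with a neural network instead of counts.
def train_bigram(corpus):
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    # Greedy prediction: the most frequent follower of `word`.
    if word not in counts:
        return None
    return counts[word].most_common(1)[0][0]

corpus = ["the cat sat on the mat", "the cat ate the fish"]
model = train_bigram(corpus)
print(predict_next(model, "the"))  # "cat" follows "the" most often here
```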

February 4, 2025 · 15 min · 3146 words · Mwaura Collins

Unraveling RoPE: Encoding Relative Positions in Transformers with Elegance

Why Positional Encoding? Unlike recurrent neural networks (RNNs), transformers process tokens in parallel, meaning they do not inherently understand the order of words in a sequence. In language, the meaning of a word can depend heavily on its position: compare “Salt has important minerals.” and “The food is so bland I had to salt it.” “Salt” works as a noun or a verb depending on its position in the sentence. ...
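To make the relative-position idea concrete, here is a minimal NumPy sketch of RoPE (my own illustration, with an assumed half-split pairing of feature dimensions): each query/key feature pair is rotated by an angle proportional to its position, so dot products between rotated vectors depend only on the offset between positions.

```python
import numpy as np

def rope_rotate(x, pos, theta=10000.0):
    # Rotate each (x1, x2) feature pair by an angle pos * freq_i,
    # where frequencies decay geometrically across pairs.
    half = x.shape[-1] // 2
    freqs = 1.0 / theta ** (np.arange(half) / half)
    angles = pos * freqs
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[..., :half], x[..., half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)

q = np.ones(4)
# Dot products depend only on the relative offset between positions:
a = rope_rotate(q, 3) @ rope_rotate(q, 5)    # offset 2
b = rope_rotate(q, 10) @ rope_rotate(q, 12)  # offset 2
print(np.allclose(a, b))  # True
```

The design point: a rotation by angle A followed by the inner product with a rotation by angle B yields a term in cos(B − A), so absolute positions cancel and only relative distance survives.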

January 24, 2025 · 6 min · 1277 words · Mwaura Collins

Optimizers

What are optimizers? Optimizers are mathematical functions or algorithms that aim to reduce a model’s loss value. In simpler terms, optimizers reduce the loss after forward propagation, which is what makes supervised models learn and become ‘smart’. When training a model, the objective is to minimize the loss/cost function by iteratively updating the model parameters (weights and biases) at each epoch, driving the loss toward the global minimum of the loss function. ...
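As a quick illustration of that update loop (a sketch of plain gradient descent, not code from the post), here is a single parameter being nudged downhill on a one-dimensional loss:

```python
# Minimize L(w) = (w - 3)^2 by iteratively updating the parameter.
def gradient_descent(w=0.0, lr=0.1, epochs=100):
    for _ in range(epochs):
        grad = 2 * (w - 3)   # dL/dw
        w -= lr * grad       # update rule: w <- w - lr * dL/dw
    return w

w = gradient_descent()
print(round(w, 4))  # converges to the minimum at w = 3
```

Real optimizers (SGD with momentum, Adam, etc.) refine exactly this step, but the skeleton, gradient then update, is the same.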

January 21, 2025 · 8 min · 1664 words · Mwaura Collins

Autoencoders

What are autoencoders? An autoencoder is a neural network that reconstructs a high-dimensional input after squeezing it through a compressed, lower-dimensional bottleneck. The idea is to take a high-dimensional input, compress it into a lower-dimensional representation of its features, and then reconstruct the input from that bottleneck. In this sense, an autoencoder resembles a dimensionality reduction method such as PCA (Principal Component Analysis). The idea dates back to the 1980s and was later popularized by Hinton & Salakhutdinov (2006) [1]. ...
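A structural sketch of that compress-then-reconstruct pipeline (illustrative shapes of my choosing, untrained weights, no training loop) in NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
W_enc = rng.normal(size=(6, 2))  # encoder: 6-D input -> 2-D bottleneck
W_dec = rng.normal(size=(2, 6))  # decoder: 2-D bottleneck -> 6-D output

x = rng.normal(size=6)            # high-dimensional input
z = x @ W_enc                     # compressed representation (the bottleneck)
x_hat = z @ W_dec                 # reconstruction
loss = np.mean((x - x_hat) ** 2)  # reconstruction error minimized in training
print(z.shape, x_hat.shape)       # (2,) (6,)
```

Training would adjust `W_enc` and `W_dec` to shrink `loss`; with purely linear layers as above, the learned bottleneck spans the same subspace PCA would find.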

January 15, 2025 · 11 min · 2265 words · Mwaura Collins