My Blog

random stuff

math • statistics • reinforcement-learning

Upper Confidence Bounds for Multi-Arm Bandits

Deriving Sutton and Barto's UCB Bandit Algorithms

4 min read · December 28, 2023

2023 · math statistics reinforcement-learning

© Copyright 2025 Daniel Solnik. Powered by Jekyll with al-folio theme. Hosted by GitHub Pages.