Rohan Sikand
This page documents my journey in reinforcement learning research and serves as a place to gather all my thoughts. Here I link to notes, research ideas, open problems I’m thinking about solving, experimentation logs, results, etc.
I focus specifically on post-training foundation models with RL.
Note: this notebook is meant to be a rough draft of thoughts and experiments, not polished final writeups.
Some interests: