Online and Reinforcement Learning

My work in online and reinforcement learning is about models that must keep improving while decisions are being made. These settings are often constrained, dynamic, and expensive to simulate. Orbital systems, edge-computing task offloading, target localization, and the optimization of online convex objectives all require algorithms that learn from sequential feedback without assuming a static, fully observed world.

The projects in this topic combine algorithmic guarantees with benchmark design and applied decision-support systems. I am particularly interested in making online learning and reinforcement learning reliable enough for infrastructure-like settings, where exploration, uncertainty, and operational cost must be handled carefully.

Publications in this topic