Reinforcement Learning

As I wrote in the introduction, the evaporation of the distinction between a benchmark and an eval foreshadows the evaporation of the distinction between an eval and an RL environment.