Reinforcement Learning for Data Selection

A reinforcement learning policy creates a distribution over the training data, and a supervised model is then trained using that distribution. Intended for applications where data quality or labeling is poor.
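
The idea above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not this repository's actual method: the policy is a softmax over one logit per training example, it is updated with REINFORCE, the supervised model is a small logistic regression, and the reward is accuracy on a small trusted validation set. The data is synthetic, with 30% of training labels flipped, and all names (`train_and_score`, `run_selection`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic binary classification data (assumption: 30% of training labels are flipped).
n_train, n_val, dim = 200, 100, 2
w_true = np.array([2.0, -1.0])
X_train = rng.normal(size=(n_train, dim))
y_clean = (X_train @ w_true > 0).astype(float)
flipped = rng.random(n_train) < 0.3
y_train = np.where(flipped, 1.0 - y_clean, y_clean)
X_val = rng.normal(size=(n_val, dim))
y_val = (X_val @ w_true > 0).astype(float)  # assumed small, trusted validation set


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - z.max())
    return e / e.sum()


def train_and_score(idx):
    """Fit logistic regression on the selected rows; return validation accuracy."""
    X, y = X_train[idx], y_train[idx]
    w = np.zeros(dim)
    for _ in range(50):  # plain gradient descent on the logistic loss
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w -= 0.5 * X.T @ (p - y) / len(y)
    return float(((X_val @ w > 0) == (y_val > 0.5)).mean())


def run_selection(steps=200, batch=32, lr=0.5):
    """Learn a distribution over training examples with REINFORCE."""
    logits = np.zeros(n_train)  # policy parameters: one logit per example
    # Start the baseline at the reward of one uniformly sampled batch.
    baseline = train_and_score(rng.choice(n_train, size=batch))
    for _ in range(steps):
        probs = softmax(logits)
        idx = rng.choice(n_train, size=batch, replace=True, p=probs)
        reward = train_and_score(idx)
        # Gradient of the log-probability of the sampled batch w.r.t. the logits.
        counts = np.bincount(idx, minlength=n_train)
        grad_logp = counts - batch * probs
        logits += lr * (reward - baseline) * grad_logp
        baseline = 0.9 * baseline + 0.1 * reward  # moving-average baseline
    return softmax(logits)


probs = run_selection()
print("mean probability on clean examples:  ", probs[~flipped].mean())
print("mean probability on flipped examples:", probs[flipped].mean())
```

The returned `probs` can be used as sampling weights when training the final supervised model. Whether the learned distribution actually down-weights mislabeled examples depends on the noise level, the reward signal, and the hyperparameters, so the two printed means should be checked rather than assumed.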