Mach-A/RL_use_case

Reinforcement Learning: Q-learning on multiple use-case environments (Traffic Light Control System and Trading Agent)

Defining and Solving RL Environments

I worked on a Traffic Light Control scenario in both stochastic and deterministic environments using Q-learning. The Q-learning update is:

$$Q(s,a) \leftarrow Q(s,a) + \alpha\left[r + \gamma \max_{a'} Q(s',a') - Q(s,a)\right]$$

The environment has since been updated. Previously the agent allowed only one car to pass at a time, which is not effective in the real world. The controller now lets all cars from opposite poles pass at the same time, which better models real traffic.
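The Q-learning update above can be sketched in a few lines of plain Python. This is only an illustration, not the repository's code: the function name `q_learning_update`, the list-of-lists Q-table, the state and action indices, and the `alpha`/`gamma` values are all assumptions made for the example.

```python
def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9, done=False):
    """Apply one tabular Q-learning update in place and return the new Q(s, a).

    Q is a hypothetical table indexed as Q[state][action].
    """
    # Bootstrap from the best next-state value (max, not argmax);
    # a terminal transition has no future value.
    best_next = 0.0 if done else max(Q[s_next])
    td_target = r + gamma * best_next
    Q[s][a] += alpha * (td_target - Q[s][a])
    return Q[s][a]


# Toy table: 4 states, 2 actions (for example "keep phase" / "switch phase").
Q = [[0.0, 0.0] for _ in range(4)]
Q[1] = [0.5, 2.0]

new_value = q_learning_update(Q, s=0, a=1, r=1.0, s_next=1, alpha=0.5, gamma=0.9)
print(new_value)
```

With these toy numbers the target is 1.0 + 0.9 * 2.0 = 2.8, so Q(0, 1) moves halfway from 0 towards 2.8.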

I then applied another tabular learning method, Monte Carlo (MC) control, using the every-visit MC method. The Monte Carlo update, where $G_t$ is the return from time step $t$, is:

$$Q(s,a) \leftarrow Q(s,a) + \alpha\left[G_t - Q(s,a)\right]$$
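A minimal sketch of the every-visit Monte Carlo update, assuming an episode is stored as a list of `(state, action, reward)` tuples. The function name `every_visit_mc_update`, the episode layout, and the parameter values are hypothetical and chosen only to illustrate the formula.

```python
def every_visit_mc_update(Q, episode, alpha=0.1, gamma=0.9):
    """Update Q in place from one finished episode of (state, action, reward).

    Every occurrence of a (state, action) pair is updated, which is what
    distinguishes every-visit MC from first-visit MC.
    """
    G = 0.0
    # Walk backwards so the discounted return G_t accumulates incrementally.
    for state, action, reward in reversed(episode):
        G = reward + gamma * G
        Q[state][action] += alpha * (G - Q[state][action])
    return Q


Q = [[0.0, 0.0] for _ in range(3)]
# The pair (state 0, action 1) is visited twice in this toy episode.
episode = [(0, 1, 0.0), (1, 0, 0.0), (0, 1, 1.0)]
every_visit_mc_update(Q, episode, alpha=0.5, gamma=1.0)
print(Q)
```

Because the pair (0, 1) appears twice, it receives two updates within the same episode. A first-visit variant would update it only once.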

An agent was also trained on NVDA stock prices over a fixed period, and it traded with additional details and directions. When evaluating the agent, I had to introduce some randomness, especially where the agent takes greedy actions, so that it does not always repeat the same start day and policy. More interestingly, the NVDA stock agent actually performed better with this bit of randomness.
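One way to add this kind of randomness at evaluation time is to sample the start day and to use a small epsilon when picking actions. The sketch below is an assumption about how that could look, not the repository's implementation: the helper names `select_action` and `sample_start_day`, the three-action layout, and the values of `epsilon`, `n_days`, and `horizon` are all hypothetical.

```python
import random


def select_action(q_values, epsilon, rng):
    """Epsilon-greedy choice: random action with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def sample_start_day(n_days, horizon, rng):
    """Pick a random start index that leaves room for a full evaluation window."""
    return rng.randrange(n_days - horizon + 1)


rng = random.Random(0)  # seeded so evaluation runs are repeatable

# Hypothetical Q-values for one state, e.g. actions 0=hold, 1=buy, 2=sell.
q_values = [0.1, 0.7, 0.2]

greedy_action = select_action(q_values, epsilon=0.0, rng=rng)
noisy_actions = [select_action(q_values, epsilon=0.1, rng=rng) for _ in range(100)]
start_day = sample_start_day(n_days=250, horizon=30, rng=rng)
```

With `epsilon=0.0` the choice is purely greedy. A small positive epsilon together with a sampled start day keeps evaluation episodes from replaying one identical trajectory.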