cd 'C:\Users\kbdsj\Virtual Environments\fyp-rl'
.\Scripts\activate
cd C:\Users\kbdsj\Desktop\Projects\FYP\RL-model\dev
pip freeze > requirements.txt
Each episode is a complete training session where:
- The agent starts with a random/QWERTY layout
- Takes multiple actions (swaps) while learning
- Updates its policy based on rewards
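
The episode structure above can be sketched roughly as follows. This is only a minimal illustration, not the project's actual code: `layout_cost`, `run_episode`, the 50/50 choice between QWERTY and a shuffled start, and the stateless bandit-style value update (used here in place of a full RL policy update) are all assumptions made for the example.

```python
import random

QWERTY = list("qwertyuiopasdfghjklzxcvbnm")


def layout_cost(layout, text):
    """Hypothetical cost: summed index distance between consecutive letters."""
    pos = {ch: i for i, ch in enumerate(layout)}
    return sum(
        abs(pos[a] - pos[b])
        for a, b in zip(text, text[1:])
        if a in pos and b in pos  # skip spaces and other non-layout characters
    )


def run_episode(q_values, text, steps=20, epsilon=0.2, alpha=0.1, rng=random):
    """Run one episode: pick a start layout, apply swaps, update action values."""
    # Start from QWERTY or a random permutation (assumed 50/50 split).
    if rng.random() < 0.5:
        layout = QWERTY[:]
    else:
        layout = rng.sample(QWERTY, len(QWERTY))

    # Every action is a swap of two key positions (i, j) with i < j.
    actions = [(i, j) for i in range(len(layout)) for j in range(i + 1, len(layout))]
    total_reward = 0.0

    for _ in range(steps):
        # Epsilon-greedy choice over swap actions.
        if rng.random() < epsilon:
            action = rng.choice(actions)
        else:
            action = max(actions, key=lambda a: q_values.get(a, 0.0))

        before = layout_cost(layout, text)
        i, j = action
        layout[i], layout[j] = layout[j], layout[i]
        reward = before - layout_cost(layout, text)  # positive if the swap helped

        # Move the stored value for this swap toward the observed reward.
        old = q_values.get(action, 0.0)
        q_values[action] = old + alpha * (reward - old)
        total_reward += reward

    return layout, total_reward
```

Calling `run_episode` repeatedly with the same `q_values` dict carries the learned swap values across episodes, while each episode still begins from a fresh layout.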