Reinforcement learning algorithms in machine learning
Reinforcement learning: Babies learn to walk without any prior knowledge of how to do it. We often wonder how they manage it, but the process is relatively simple.
First, they watch someone else walking, such as a parent or another person nearby. They work out that the legs have to be used one at a time to take a step. While walking, they sometimes hit an obstacle and fall, and at other times they avoid the obstacles and walk smoothly. When they get past an obstacle, their parents are elated and reward the baby with loud claps or maybe chocolates. When they fall while trying to get around an obstacle, the parents naturally give no claps or chocolates. Over time, babies learn from their mistakes and come to walk with ease.
In the same way, machines often learn to do tasks autonomously. Let's map this onto the example of the child learning to walk. The task to be accomplished is walking, the child is the agent, and the place with hurdles where the child is trying to walk is the environment. The agent tries to improve its performance at the task. When a sub-task is accomplished successfully, a reward is given. When a sub-task is not executed correctly, no reward is given. This continues until the machine is able to complete the whole task. This process of learning is known as reinforcement learning. Figure 1.10 captures the high-level process of reinforcement learning.
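The agent, environment, and reward loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the book's method: the environment is a made-up one-dimensional track whose last cell is the goal, and the learning rule is tabular Q-learning, a standard RL algorithm that the text does not name. All names (N_CELLS, step, train, greedy_policy) and parameter values are hypothetical.

```python
import random

# Hypothetical toy environment: a track of cells 0..5; the last cell is the goal.
N_CELLS = 6
GOAL = N_CELLS - 1
ACTIONS = (-1, +1)  # step back, step forward


def step(state, action):
    """Environment: apply the action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), GOAL)  # stay on the track
    done = next_state == GOAL
    reward = 1.0 if done else 0.0  # a reward is given only on success
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Agent: learn action values by trial and error (tabular Q-learning)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_CELLS)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore: random action
            else:
                # exploit: best known action, ties broken at random
                best = max(q[state])
                a = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Move the estimate toward reward + discounted future value.
            future = 0.0 if done else gamma * max(q[next_state])
            q[state][a] += alpha * (reward + future - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best learned action for every non-goal cell."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[s][i])]
            for s in range(GOAL)]


print(greedy_policy(train()))
```

With these assumed settings the learned policy should be to step forward from every cell, because only forward steps ever lead to the reward. Early episodes wander almost at random, much like the baby's first attempts.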
One contemporary example of reinforcement learning is self-driving cars. The critical information such a car must track includes its speed, the speed limit on each road segment, traffic conditions, road conditions, weather conditions, etc. The tasks it has to carry out are start/stop, accelerate/decelerate, turn left/right, etc. Further details on reinforcement learning are outside the scope of this book.
Points to ponder:
• Reinforcement learning is getting more and more attention from both industry and academia. Annual publication counts for reinforcement learning in Google Scholar support this view.
• While Deep Blue used brute force to defeat the human chess champion, AlphaGo used RL to defeat the best human Go player.
• RL is an effective tool for personalized online marketing. It uses the demographic details and browsing history of the user in real time to show the most relevant advertisements.