This post is about reinforcement learning (RL) based on the Q-learning method, applied to the interactive control of cars at a single intersection. Put simply, RL is a kind of artificial intelligence that learns without human supervision: we give the computer a goal and a corresponding reward, and after a large number of training runs it finds the most efficient way to get what we want.
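To make the idea of "goal plus reward" a bit more concrete, here is a minimal sketch of the standard tabular Q-learning update. It is not the code used in this experiment. The action set, the state labels, and the values of `alpha`, `gamma`, and `epsilon` are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict

# Hypothetical action set: decelerate, hold speed, accelerate.
ACTIONS = (-1, 0, 1)


def choose_action(q, state, epsilon=0.1, rng=random):
    """Epsilon-greedy choice: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    # Pick the action with the highest estimated value in this state.
    return max(ACTIONS, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one Q-learning step and return the new Q-value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# The Q-table maps (state, action) pairs to values; unseen pairs start at 0.
q_table = defaultdict(float)
```

In a setup like this one, the state might encode the car's distance to the stop line and its speed, and the reward might penalize stopping, but those design choices are assumptions and are not specified here.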
So I ran an interesting experiment with SUMO (a traffic simulation package) and an RL agent written in Python. First, TraCI is used to connect SUMO and Python; after that, I can control whatever I want in SUMO from Python. The specific programming steps will not be covered here; the results of the experiment are shown below.
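For readers who want a rough idea of what the TraCI connection looks like, here is a minimal sketch of a control loop. It is not the code behind this experiment. The vehicle ID, the SUMO config file name in the usage note, the speed limits, and the `policy` callback are hypothetical placeholders.

```python
def clamp_speed(current, delta, v_min=0.0, v_max=13.9):
    """Apply a speed change and keep the result within [v_min, v_max] (m/s).

    The default limits are assumptions (13.9 m/s is roughly 50 km/h).
    """
    return max(v_min, min(v_max, current + delta))


def run_episode(sumo_cmd, vehicle_id, policy, max_steps=1000):
    """Step a SUMO simulation and let `policy` adjust one vehicle's speed.

    `policy` is a hypothetical callable that maps the current speed to a
    speed change in m/s; a trained RL agent would play that role.
    """
    import traci  # shipped with SUMO's Python tools; SUMO must be installed

    traci.start(sumo_cmd)
    try:
        for _ in range(max_steps):
            traci.simulationStep()
            # Stop once no vehicles are left or still waiting to depart.
            if traci.simulation.getMinExpectedNumber() == 0:
                break
            # The vehicle may not have entered the network yet, or has left.
            if vehicle_id not in traci.vehicle.getIDList():
                continue
            speed = traci.vehicle.getSpeed(vehicle_id)
            traci.vehicle.setSpeed(vehicle_id, clamp_speed(speed, policy(speed)))
    finally:
        traci.close()
```

A call might look like `run_episode(["sumo", "-c", "intersection.sumocfg"], "ego", lambda v: -0.5)`, where the config file name and the vehicle ID `"ego"` are made up for this example.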
Picture 1 shows the reaction of a human driver who approaches the intersection at a relatively high speed. Picture 2 shows the decision of the RL agent after training: it adjusts its speed and passes through the intersection more smoothly, without stopping at the line.