SELECTED REINFORCEMENT LEARNING METHODS APPLIED TO DETERMINE THE OPTIMAL PATH OF TRANSITION

Authors

Sawicki, A., Tomera, M.

DOI:

https://doi.org/10.26408/134.05

Keywords:

Artificial Intelligence, reinforcement learning, Q-learning, Sarsa, Adam optimizer

Abstract

This work uses reinforcement learning (RL) to determine the optimal path for a mobile agent in an environment with static obstacles. The paper examines RL algorithms such as Q-learning and Sarsa, both in their classic versions and enhanced with the Adam gradient optimiser, and investigates the impact of the optimiser on the rate and stability of finding the optimal solution. The analysis compares the learning rate, the number of steps in a single episode and the stability of the learning process. The results show that the considered Q-learning and Sarsa algorithms supplemented with the Adam optimiser perform better, finding the optimal transition path faster, than the same algorithms without it. The results could be particularly useful in practical route-planning applications in fields such as mobile robotics.
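The comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the Adam optimiser is coupled to the value update, so the code below assumes one plausible reading: the temporal-difference error is treated as a gradient, and each table entry keeps its own Adam moment estimates. The grid layout, reward values, hyperparameters and all function names are hypothetical and are not taken from the paper.

```python
import math
import random

# Hypothetical 4x4 map: S = start, G = goal, # = static obstacle.
GRID = ["S..#",
        ".#..",
        "...#",
        "#..G"]
START, GOAL = (0, 0), (3, 3)
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Apply a move; walls and obstacles leave the agent where it is."""
    row, col = state[0] + MOVES[action][0], state[1] + MOVES[action][1]
    inside = 0 <= row < len(GRID) and 0 <= col < len(GRID[0])
    if not inside or GRID[row][col] == "#":
        row, col = state
    done = (row, col) == GOAL
    return (row, col), (10.0 if done else -1.0), done  # assumed rewards


def new_table():
    """One list of per-action values for every grid cell."""
    return {(r, c): [0.0] * len(MOVES)
            for r in range(len(GRID)) for c in range(len(GRID[0]))}


def train(use_adam, episodes=500, max_steps=200, alpha=0.1, gamma=0.9,
          epsilon=0.1, beta1=0.9, beta2=0.999, eps=1e-8, seed=0):
    """Tabular Q-learning; use_adam switches the update rule (assumed coupling)."""
    rng = random.Random(seed)
    q, m, v, t = new_table(), new_table(), new_table(), new_table()
    for _ in range(episodes):
        state = START
        for _ in range(max_steps):
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                a = rng.randrange(len(MOVES))
            else:
                a = max(range(len(MOVES)), key=q[state].__getitem__)
            nxt, reward, done = step(state, a)
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            grad = q[state][a] - target  # negative TD error
            if use_adam:
                # Per-entry Adam moments with bias correction.
                t[state][a] += 1
                m[state][a] = beta1 * m[state][a] + (1 - beta1) * grad
                v[state][a] = beta2 * v[state][a] + (1 - beta2) * grad * grad
                m_hat = m[state][a] / (1 - beta1 ** t[state][a])
                v_hat = v[state][a] / (1 - beta2 ** t[state][a])
                q[state][a] -= alpha * m_hat / (math.sqrt(v_hat) + eps)
            else:
                q[state][a] -= alpha * grad  # classic update
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=50):
    """Follow the highest-valued action from START until GOAL or the step cap."""
    path, state = [START], START
    for _ in range(max_steps):
        a = max(range(len(MOVES)), key=q[state].__getitem__)
        state, _reward, done = step(state, a)
        path.append(state)
        if done:
            break
    return path
```

Calling `greedy_path(train(use_adam=False))` and `greedy_path(train(use_adam=True))` returns the paths learned by the two variants, whose lengths can then be compared. A Sarsa version of the same sketch would replace `max(q[nxt])` in the target with the value of the action actually selected in the next state.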

Published

2025-06-25

How to Cite

Sawicki, A., & Tomera, M. (2025). SELECTED REINFORCEMENT LEARNING METHODS APPLIED TO DETERMINE THE OPTIMAL PATH OF TRANSITION. Scientific Journal of Gdynia Maritime University, (134), 71–85. https://doi.org/10.26408/134.05

Section

Articles