Single and multi-agent finite horizon reinforcement learning algorithms for smart grids

Material type: Text. Language: English. Publication details: Bangalore : Indian Institute of Science, 2024. Description: ix, 100p. : col. ill. ; eThesis 3.936 Mb. DDC classification:
  • 621.31  VIV
Online resources: Dissertation note: PhD;2024:Computer Science and Automation.

Summary: In this thesis, we study sequential decision-making under uncertainty in the context of smart grids using reinforcement learning. The underlying mathematical model for reinforcement learning algorithms is the Markov Decision Process. A smart grid is essentially a concept for efficient electricity management using various technologies. We consider different models of smart grids involving single and multiple decision-making agents, and we develop reinforcement learning algorithms that can be applied to these models for efficient energy management. We rigorously prove the convergence and stability of these algorithms, demonstrate their efficiency on the smart grid models we consider, and additionally run them on randomly generated Markov Decision Processes to establish their correctness and convergence. A brief description of the studies in this thesis follows.

1. Finite Horizon Q-learning algorithm for smart grids. In this study, we develop a smart grid model with components such as a main grid and a microgrid with a battery, renewable energy, and a microcontroller. We then define the problem of energy management in this model and formulate it as a finite horizon Markov Decision Process. To address the complex decision-making required for energy management in this finite horizon Markov Decision Process, we develop a Q-learning algorithm, apply it to our model, and demonstrate its performance. Additionally, we give a rigorous mathematical proof establishing the stability and correctness of the algorithm; our analysis of stability and convergence is based purely on ordinary differential equations. We also demonstrate the performance of the algorithm on randomly generated Markov Decision Processes.
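As an illustration of the stage-indexed backup behind finite horizon Q-learning, the following is a minimal sketch on a small randomly generated MDP. The problem sizes, step-size schedule, uniform exploration policy, and all names are assumptions made for this sketch, not the thesis implementation.

```python
# Minimal sketch: tabular finite horizon Q-learning with one Q-table per stage.
# The toy MDP, step sizes, and exploration scheme are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions, horizon = 5, 3, 4                              # assumed toy sizes
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))    # P[s, a] = next-state distribution
R = rng.uniform(0.0, 1.0, size=(n_states, n_actions))               # single-stage rewards

# Q[h] is the Q-table for stage h; Q[horizon] is the terminal value (zero).
Q = np.zeros((horizon + 1, n_states, n_actions))

for episode in range(20000):
    s = rng.integers(n_states)
    for h in range(horizon):                      # stages 0, 1, ..., horizon - 1
        a = rng.integers(n_actions)               # uniform exploratory behaviour policy
        s_next = rng.choice(n_states, p=P[s, a])
        r = R[s, a]
        alpha = 1.0 / (1.0 + episode) ** 0.6      # tapering step size
        target = r + Q[h + 1, s_next].max()       # backup from the next stage's Q-table
        Q[h, s, a] += alpha * (target - Q[h, s, a])
        s = s_next

print(Q[0].max(axis=1))   # estimated optimal stage-0 values; greedy actions give the learned policy
```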
2. Finite Horizon Minimax Q-learning algorithm for smart grids. In this work, we develop a comprehensive smart grid model that takes into account the competition between two microgrids, each with a battery, renewable energy, and a microcontroller. Stochastic games are an important framework for capturing such competitive environments: they extend the Markov Decision Process to multiple decision makers, and can also be viewed as an extension of games that includes state. We model the competition between the two microgrids as a finite horizon stochastic game in which the interaction takes place over a finite number of stages, and we aim to compute the equilibrium of this competitive interaction. To this end, the minimax concept is used to capture the instantaneous (stage-wise) interaction, and we develop a finite horizon minimax Q-learning algorithm to capture the long-term equilibrium of the competition between the two microgrids (a sketch of the stage-wise minimax computation appears after this summary). The performance of the algorithm is demonstrated on the smart grid setup, and its correctness and convergence are demonstrated on randomly generated stochastic games. Furthermore, a rigorous mathematical proof of the stability and convergence of the algorithm is given.

3. Finite Horizon SOR Q-learning. In this final part of our study, we propose a generalization of the finite horizon problem using discounting and an improvement of the finite horizon Q-learning algorithm for this problem. The rate of convergence of a reinforcement learning algorithm is an important measure of its performance, and several techniques are used in the literature to improve it. One of them is successive over-relaxation, originally used in linear algebra to improve the performance of the Gauss-Seidel iterative scheme for solving linear systems of equations. We apply this technique to finite horizon Q-learning for discounted problems to obtain an algorithm with better asymptotic performance (a sketch of such a relaxed update appears below).
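Each backup in the finite horizon minimax Q-learning of study 2 requires the value of a two-player zero-sum matrix game. The sketch below, assuming SciPy's linprog, shows one standard way such a matrix-game value and maximin strategy can be computed; the function name, the commented backup, and all shapes are illustrative assumptions rather than the thesis code.

```python
# Minimal sketch: value of a zero-sum matrix game via linear programming, as used
# conceptually in a stage-wise minimax backup. Names and shapes are assumptions.
import numpy as np
from scipy.optimize import linprog

def matrix_game_value(M):
    """Value and maximin mixed strategy of the zero-sum matrix game M (row player maximises)."""
    m, n = M.shape
    # Decision variables: x (row player's mixed strategy, length m) and the game value v.
    c = np.zeros(m + 1)
    c[-1] = -1.0                                  # maximise v  <=>  minimise -v
    A_ub = np.hstack([-M.T, np.ones((n, 1))])     # v - (M^T x)_j <= 0 for every opponent column j
    b_ub = np.zeros(n)
    A_eq = np.zeros((1, m + 1))
    A_eq[0, :m] = 1.0                             # strategy probabilities sum to one
    b_eq = np.ones(1)
    bounds = [(0, None)] * m + [(None, None)]     # x >= 0, v unrestricted
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[-1], res.x[:m]

# Conceptual stage-h backup (hypothetical shapes): with a joint-action table Q[h, s, a1, a2],
#   target = r + matrix_game_value(Q[h + 1, s_next])[0]
#   Q[h, s, a1, a2] += alpha * (target - Q[h, s, a1, a2])

if __name__ == "__main__":
    v, x = matrix_game_value(np.array([[1.0, -1.0], [-1.0, 1.0]]))
    print(v, x)   # matching pennies: value ~0, strategy ~(0.5, 0.5)
```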
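For study 3, the following is a minimal sketch of a successive over-relaxation style Q-learning backup for the discounted problem, written here in the same stage-indexed form as above. The relaxation factor w, its placement in the target, and all names are assumptions for illustration only; with w = 1 the update reduces to the ordinary Q-learning backup.

```python
# Minimal sketch: a successive over-relaxation (SOR) flavoured Q-learning backup for a
# discounted, stage-indexed problem. The relaxation factor w and the (1 - w) bootstrap at
# the current state are illustrative assumptions; w = 1 recovers the plain target.
import numpy as np

def sor_q_update(Q, h, s, a, r, s_next, alpha, gamma=0.95, w=1.2):
    """Apply one SOR-style backup to the stage-h Q-table (in place) and return it."""
    standard_target = r + gamma * Q[h + 1, s_next].max()              # ordinary discounted target
    relaxed_target = w * standard_target + (1.0 - w) * Q[h, s].max()  # over-relaxed blend
    Q[h, s, a] += alpha * (relaxed_target - Q[h, s, a])
    return Q
```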
Holdings: Item type: Thesis; Current library: JRD Tata Memorial Library; Call number: 621.31 VIV; URL: Link to resource; Status: Not for loan; Barcode: ET00654

Includes bibliographical references.

PhD;2024:Computer Science and Automation.

