Barrier function inspired reward shaping in reinforcement learning (Record no. 432354)

MARC details
000 -LEADER
fixed length control field 02467nam a2200217 4500
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION
fixed length control field 240719b |||||||| |||| 00| 0 eng d
082 ## - DEWEY DECIMAL CLASSIFICATION NUMBER
Classification number 006.31
Item number RAN
100 ## - MAIN ENTRY--PERSONAL NAME
Personal name Ranjan, Abhishek.
245 ## - TITLE STATEMENT
Title Barrier function inspired reward shaping in reinforcement learning
260 ## - PUBLICATION, DISTRIBUTION, ETC. (IMPRINT)
Place of publication, distribution, etc Bangalore :
Name of publisher, distributor, etc Indian Institute of Science,
Date of publication, distribution, etc 2024.
300 ## - PHYSICAL DESCRIPTION
Extent xi, 56 p. :
Other physical details col. ill.
Accompanying material e-Thesis.
Size of unit 20.64 MB.
500 ## - GENERAL NOTE
General note Includes bibliography.
502 ## - DISSERTATION NOTE
Dissertation note MSc(Res);2024;Computer Science and Automation.
520 ## - SUMMARY, ETC.
Summary, etc Reinforcement Learning (RL) has progressed from simple control tasks to complex real-world challenges with large state spaces. During the initial iterations of training, most RL algorithms perform a significant number of random exploratory steps, which limits their practicality in the real world, since such exploration can lead to potentially dangerous behaviour. Safe exploration is therefore a critical issue when applying RL algorithms in the real world. Although RL excels at solving these challenging problems, the time required for convergence during training remains a significant limitation. Various techniques have been proposed to mitigate this issue, and reward shaping has emerged as a popular solution. However, most existing reward-shaping methods rely on value functions, which can pose scalability challenges as the environment's complexity grows. Our research proposes a novel framework for reward shaping inspired by Barrier Functions, which is safety-oriented, intuitive, and easy to implement for any environment or task. To evaluate the effectiveness of our proposed reward formulations, we present our results on a challenging Safe Reinforcement Learning benchmark, the OpenAI Safety Gym. We have conducted experiments on various environments, including CartPole, Half-Cheetah, Ant, and Humanoid. Our results demonstrate that our method leads to 1.4-2.8 times faster convergence and as little as 50-60% of the actuation effort compared to the vanilla reward. Moreover, our formulation has a theoretical basis for safety, which is crucial for real-world applications. In a sim-to-real experiment with the Go1 robot, we demonstrated better control and dynamics of the robot with our reward framework. (A generic sketch of barrier-based reward shaping follows the record below.)
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element Reinforcement learning
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element Robotics
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element Barrier function
700 ## - ADDED ENTRY--PERSONAL NAME
Personal name Kolathaya, Shishir N. Y.
Relator term advised by
856 ## - ELECTRONIC LOCATION AND ACCESS
Uniform Resource Identifier https://etd.iisc.ac.in/handle/2005/6558
942 ## - ADDED ENTRY ELEMENTS (KOHA)
Koha item type Thesis

No items available.
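The 520 summary above describes the barrier-function-inspired shaping only at a high level, so the following is a generic, hedged sketch rather than the thesis's actual formulation. It assumes a barrier function h(s) >= 0 that is positive inside the safe set and zero on its boundary (here a CartPole-style pole-angle limit), and adds a log-barrier shaping term to the base reward; THETA_MAX, the gain k, and the log-barrier form are illustrative assumptions, not values from the thesis.

import numpy as np

# Illustrative safety limit: the pole angle must stay within +/- THETA_MAX rad.
THETA_MAX = 0.2

def barrier(theta):
    # h(s) = THETA_MAX^2 - theta^2: positive inside the safe set,
    # zero on its boundary (a standard barrier-function construction).
    return THETA_MAX**2 - theta**2

def shaped_reward(base_reward, theta, k=0.1, eps=1e-6):
    # Add a log-barrier term that is 0 at the safest state (theta = 0) and
    # tends to minus infinity as the state nears the boundary; k scales it.
    h = max(barrier(theta), eps)  # clip so the log stays finite
    return base_reward + k * np.log(h / THETA_MAX**2)

# The shaped reward decays smoothly as the pole nears the safety limit.
for theta in (0.0, 0.10, 0.19):
    print(f"theta={theta:.2f}  shaped reward={shaped_reward(1.0, theta):.3f}")

Because the penalty grows without bound near the constraint boundary, a return-maximizing policy is pushed away from unsafe states during exploration, which matches the safety intuition the abstract attributes to barrier functions.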
