
Application service resilience in cloud : (Record no. 433019)

MARC details
000 -LEADER
fixed length control field	05704nam a22002657a 4500
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION
fixed length control field	250111b |||||||| |||| 00| 0 eng d
041 ## - LANGUAGE CODE
Language code of text/sound track or separate title	en
082 ## - DEWEY DECIMAL CLASSIFICATION NUMBER
Classification number	004.8
Item number	MAT
100 ## - MAIN ENTRY--PERSONAL NAME
Personal name	Mathews, Dhanya R
245 ## - TITLE STATEMENT
Title	Application service resilience in cloud :
Remainder of title	end-to-end perspective
260 ## - PUBLICATION, DISTRIBUTION, ETC. (IMPRINT)
Place of publication, distribution, etc	Bangalore :
Name of publisher, distributor, etc	Indian Institute of Science,
Date of publication, distribution, etc	2024.
300 ## - PHYSICAL DESCRIPTION
Extent	N/A
Accompanying material	E-Thesis
500 ## - GENERAL NOTE
General note	Includes bibliographical references.
502 ## - DISSERTATION NOTE
Dissertation note	PhD;2024;Computational and Data Sciences.
520 ## - SUMMARY, ETC.
Summary, etc	Embargo up to 10/1/2026. The idea of computing as a utility was realized with the emergence of the cloud computing paradigm. Cloud service providers offer a wide range of services that are delivered over the Internet to cloud service consumers. In their current manifestation, cloud services are realized over multiple logical, virtualized, and distributed resources, typically using a multi-layered architecture. The providers document non-functional service level guarantees such as availability, performance, and security in Service Level Agreements (SLAs), provided to the consumer as Service Level Objectives (SLOs). The wide adoption of cloud computing, compounded with the emergence of microservice architecture, has resulted in a considerable increase in the number of components involved in service delivery. Manually addressing failures in real time is inefficient and often impossible at cloud scale, where failures are the norm rather than the exception. Ensuring the quality of an application service, as documented in the SLA, therefore requires autonomous mechanisms to enhance cloud services' resilience. Though cloud setups rely on highly autonomous service layers for managing, provisioning, and monitoring applications, most of them focus on a specific cloud service architecture layer or consider only a particular set of faults. Any component across the cloud service stack involved in the service delivery could disrupt the SLO. Further, as cloud services use shared infrastructure, monitoring and acting on the individual service layer metrics is limiting. In such a scenario, visibility of a failure anywhere in the stack can enable effective recovery/remediation strategies; hence, there is a case for building any resiliency solution on an application-oriented approach that takes an end-to-end view of failures.
Towards this, we propose an end-to-end service resilience framework that employs data-dependent intelligent autonomous mechanisms to deal with cloud service disruptions efficiently. The intelligence to reduce the effect of disruptions is based on understanding the complex interconnections and inter-dependencies of end-to-end components in the cloud service stack. The different cloud service abstraction layers and infrastructure sharing have resulted in an increased occurrence of faults, more specifically, saturation faults. The initial phase of this work examines real-world disruption scenarios to understand the faults that could disrupt a cloud service. With ever-changing applications and the environments on which they are hosted, realizing a failure repository for cloud service faults is infeasible. This makes conventional data-oriented approaches less practical and methods driven by dynamic observability data more desirable. Towards this, the second phase of this work developed a Topology Aware Root Cause Detection Algorithm (TA-RCD) that considers the observability data from end-to-end service components and their interconnectedness. Our results from the fault injection studies show that the proposed approach outperforms the state-of-the-art RCD algorithm by at least 2x for Top-5 recall and 4x for Top-3 recall, on average. To autonomously recover a service from its anomalous state, the remediation should target the root cause of the anomalous behavior. The root-cause localizations, though accurate, are not restricted to a specific component because of causal effects due to service interactions. To identify the anomalous component, the third phase of this work developed a Topology Aware end-to-end failure Recovery framework (TA-REC) that identifies the appropriate remediation strategy for an anomaly.
The anomaly score assignment and component activity tracking in TA-REC facilitate the identification of the component and the remediation that needs to be applied to it. For the saturation fault scenarios injected across the stack, TA-REC, compared to the state of the art, can identify an adequate remediation/recovery strategy because its end-to-end view offers better visibility into the origin of the failure. In conclusion, this work demonstrated the usefulness of the end-to-end topology of a cloud application service in efficiently remediating anomalies that challenge the service quality. The observations show that treating the service as a black box restricts the development of intelligent autonomous approaches to guarantee SLOs. The proof-of-concept evaluations demonstrated that the intelligence to maintain service resilience effectively is based on an accurate understanding of the end-to-end state, as it facilitates maintaining component serviceability by targeting the cause of failure in the stack. Future work aims to evaluate both TA-RCD and TA-REC for a broader range of fault scenarios in real-life production deployments.
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element	Cloud Application Services
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element	Resilience
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element	Cloud computing
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element	Topology Aware Root Cause Detection Algorithm
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element	Service Level Objectives
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element	Distributed Systems
700 ## - ADDED ENTRY--PERSONAL NAME
Personal name	Advised by Lakshmi, J
856 ## - ELECTRONIC LOCATION AND ACCESS
Uniform Resource Identifier	https://etd.iisc.ac.in/handle/2005/6763
942 ## - ADDED ENTRY ELEMENTS (KOHA)
Koha item type	Thesis

No items available.
