Learning from limited and imperfect data

By:

Rangwani, Harsh

Contributor(s):

Advised by Venkatesh Babu, R

Material type: Text

Language: en

Publication details: Bangalore : Indian Institute of Science, 2024.

Description: xi, 302 p. : col. ill. e-Thesis 58.31 Mb

DDC classification:

006.3 RAN

Dissertation note: PhD; 2024; Computational and Data Sciences

Summary: Deep Neural Networks have improved in capability by orders of magnitude in the years since AlexNet won the ImageNet challenge in 2012. One of the major reasons for this success is the availability of large-scale, well-curated datasets. These datasets (e.g., ImageNet, MSCOCO) are often manually balanced across categories (classes) to facilitate learning of all the categories. This curation process is often expensive and requires throwing away precious annotated data to balance the frequency across classes, because the distribution of data in the world (e.g., on the internet) differs significantly from that of well-curated datasets and is often over-populated with samples from common categories. Algorithms designed for well-curated datasets perform suboptimally when used to learn from imperfect datasets with long-tailed imbalances and distribution shifts.

To expand the use of deep models, it is essential to overcome the labor-intensive curation process by developing robust algorithms that can learn from diverse, real-world data distributions. Toward this goal, we develop practical algorithms for Deep Neural Networks that can learn from the limited and imperfect data present in the real world. This thesis is divided into four parts, each covering a scenario of learning from limited or imperfect data. The first part focuses on Learning Generative Models from Long-Tail Data, where we mitigate mode collapse and enable diverse, aesthetic image generation for tail (minority) classes. In the second part, we enable effective generalization on tail classes through Inductive Regularization schemes, which allow tail classes to generalize as effectively as head classes without requiring explicit generation of images. In the third part, we develop algorithms for Optimizing Relevant Metrics when learning from long-tailed data with limited annotation (the semi-supervised setting). The fourth part focuses on Efficient Domain Adaptation of the model to various domains with zero to very few labeled samples.

Holdings
Item type	Current library	Call number	URL	Status	Date due	Barcode
Thesis	JRD Tata Memorial Library	006.3 RAN	Link to resource	Not for loan		ET00825

Includes bibliographical references
