CodeQueries : Benchmarking query answering over source code
Material type: Book
Call number: 005 SUR
Item type | Current library | Call number | Status | Date due | Barcode
---|---|---|---|---|---
Book | JRD Tata Memorial Library | 005 SUR | Available | | ET00152
Browsing JRD Tata Memorial Library shelves:

- 004.697 N961: VRML sourcebook
- 004.75 N95: Performance issues in multidatabase systems / by K Subramanian
- 005.133C 19: Programming in ANSI C
- 005 SUR: CodeQueries: Benchmarking query answering over source code
- 005.0151 P15: P3: An effective technique for partitioned path profiling
- 005.1 BAS: Software architecture in practice
- 005.1 COR: Introduction to algorithms
Includes bibliographical references and index.
MTech (Res) thesis; 2023; Computer Science and Automation
Software developers often make queries about the security, performance, effectiveness, and maintainability of their code. Through an iterative debugging process, developers analyze the code to find answers to these queries. The process can be seen as a question-answering task that requires developers to identify code spans satisfying certain properties. Many of these queries can be answered by existing code analysis tools such as CodeQL. However, using such tools requires design, implementation, and verification effort. In this work, we propose an alternative to code analysis tools by formulating query answering over source code as a span prediction problem. In the proposed approach, a neural model predicts the answer spans in code that are appropriate for a given query, and also identifies the supporting facts needed to justify the predicted answers. Pre-trained language models for code are fine-tuned on a newly prepared, challenging dataset, CodeQueries, for query answering over source code. We demonstrate that the proposed approach performs well on the task of query answering over source code when only the relevant code blocks are provided as input to the model. Experiments conducted on the dataset demonstrate that the proposed neural approach is robust to noisy span labels and can even handle code with minor syntax errors. Although large code sizes and limited training examples adversely affect model performance, we suggest methods to address these issues. Based on our study, we believe that the proposed neural approach will be an additional tool in a developer's toolbox for query answering over source code.
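The span-prediction formulation in the abstract resembles extractive question answering: the query and the code block are encoded together, and the model scores each token as a possible start or end of the answer span. The following PyTorch sketch illustrates that idea only; it is not the thesis's actual implementation (which fine-tunes pre-trained language models for code), and the tiny randomly initialized encoder, vocabulary size, and token ids are all toy assumptions.

```python
# Illustrative sketch: query answering over code as span prediction.
# A real system would fine-tune a pre-trained code LM; here a small
# randomly initialized transformer encoder stands in.
import torch
import torch.nn as nn


class SpanPredictor(nn.Module):
    def __init__(self, vocab_size=1000, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # Two logits per token: answer-span start score and end score.
        self.span_head = nn.Linear(dim, 2)

    def forward(self, token_ids):
        hidden = self.encoder(self.embed(token_ids))       # (batch, seq, dim)
        start_logits, end_logits = self.span_head(hidden).unbind(dim=-1)
        return start_logits, end_logits                    # each (batch, seq)


# Input is the query concatenated with the code block, as in
# extractive QA setups; these ids are random placeholders.
token_ids = torch.randint(0, 1000, (1, 32))
model = SpanPredictor()
start_logits, end_logits = model(token_ids)
start = start_logits.argmax(dim=-1)  # predicted span start index
end = end_logits.argmax(dim=-1)      # predicted span end index
print(start_logits.shape, start.item(), end.item())
```

Training such a model would minimize cross-entropy between the start/end logits and gold span boundaries; supporting-fact identification can be cast the same way, as additional spans to predict over the same encoded sequence.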