CoditT5: Pretraining for Source Code and Natural Language Editing
by
Jiyang Zhang, Sheena Panthaplackel, Pengyu Nie, Junyi Jessy Li, Milos Gligoric
2022
Abstract
Pretrained language models have been shown to be effective in many
software-related generation tasks; however, they are not well-suited for
editing tasks as they are not designed to reason about edits. To address this,
we propose a novel pretraining objective which explicitly models edits and use
it to build CoditT5, a large language model for software-related editing tasks
that is pretrained on large amounts of source code and natural language
comments. We fine-tune it on various downstream editing tasks, including
comment updating, bug fixing, and automated code review. By outperforming
standard generation-based models, we demonstrate the generalizability of our
approach and its suitability for editing tasks. We also show how a standard
generation model and our edit-based model can complement one another through
simple reranking strategies, with which we achieve state-of-the-art performance
for the three downstream editing tasks.
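
The reranking idea mentioned in the abstract can be pictured concretely: take the beam candidates produced by one model and reorder them by the sequence likelihood the other model assigns. The sketch below is a minimal illustration under assumed details (HuggingFace seq2seq checkpoints, likelihood-based scoring; the function names and the Salesforce/codet5-base checkpoint are stand-ins), not the paper's exact procedure.

    # Hypothetical reranking sketch: reorder one model's candidates by the
    # likelihood another model assigns. Checkpoint names are placeholders.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    def sequence_log_likelihood(model, tokenizer, source: str, candidate: str) -> float:
        """Total log-likelihood of `candidate` given `source` under a seq2seq model."""
        inputs = tokenizer(source, return_tensors="pt")
        labels = tokenizer(candidate, return_tensors="pt").input_ids
        with torch.no_grad():
            # .loss is the mean per-token negative log-likelihood of the labels
            mean_nll = model(**inputs, labels=labels).loss
        return -mean_nll.item() * labels.size(1)

    def rerank(source: str, candidates: list[str], scorer, scorer_tok) -> list[str]:
        """Sort one model's beam candidates by the scorer model's likelihood, best first."""
        return sorted(
            candidates,
            key=lambda c: sequence_log_likelihood(scorer, scorer_tok, source, c),
            reverse=True,
        )

    # Usage (placeholder checkpoint and inputs):
    # scorer = AutoModelForSeq2SeqLM.from_pretrained("Salesforce/codet5-base")
    # scorer_tok = AutoTokenizer.from_pretrained("Salesforce/codet5-base")
    # best = rerank(buggy_code, beam_outputs_from_other_model, scorer, scorer_tok)[0]
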
Archived Files and Locations
application/pdf, 299.3 kB: arXiv:2208.05446v2 (arxiv.org repository; web.archive.org webarchive)