SPoC: Search-based Pseudocode to Code

Kulal, Sumith; Pasupat, Panupong; Chandra, Kartik; Lee, Mina; Padon, Oded; Aiken, Alex; Liang, Percy

arXiv:1906.04908 (cs)

[Submitted on 12 Jun 2019]

Abstract: We consider the task of mapping pseudocode to long programs that are functionally correct. Given test cases as a mechanism to validate programs, we search over the space of possible translations of the pseudocode to find a program that passes the validation. However, without proper credit assignment to localize the sources of program failures, it is difficult to guide search toward more promising programs. We propose to perform credit assignment based on signals from compilation errors, which constitute 88.7% of program failures. Concretely, we treat the translation of each pseudocode line as a discrete portion of the program, and whenever a synthesized program fails to compile, an error localization method tries to identify the portion of the program responsible for the failure. We then focus search over alternative translations of the pseudocode for those portions. For evaluation, we collected the SPoC dataset (Search-based Pseudocode to Code) containing 18,356 programs with human-authored pseudocode and test cases. Under a budget of 100 program compilations, performing search improves the synthesis success rate over using the top-one translation of the pseudocode from 25.6% to 44.7%.
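The search loop described in the abstract can be illustrated with a small sketch. This is not the authors' implementation; it is a minimal best-first search written under stated assumptions: each pseudocode line comes with a ranked list of candidate translations and log-probabilities, and a hypothetical `check` callback stands in for "compile, run the test cases, and localize the error". The names `search_with_localization`, `check`, and `blacklist` are all invented for illustration, and the localization method itself is left abstract.

```python
import heapq
from typing import Callable, List, Optional, Tuple

Candidate = Tuple[str, float]  # (translated code line, log-probability)
CheckFn = Callable[[List[str]], Tuple[bool, Optional[int]]]


def search_with_localization(
    candidates: List[List[Candidate]],
    check: CheckFn,
    budget: int = 100,
) -> Tuple[Optional[List[str]], int]:
    """Best-first search over per-line candidate choices (illustrative only).

    candidates[i] holds the candidates for pseudocode line i, sorted by
    descending log-probability. check(program) is a hypothetical validator
    returning (passed, bad_line), where bad_line is the index of the line
    blamed for a compilation error, or None if no line could be blamed.
    Returns (program or None, number of validation attempts used).
    """
    n = len(candidates)

    def score(choice: Tuple[int, ...]) -> float:
        return sum(candidates[i][j][1] for i, j in enumerate(choice))

    start = tuple([0] * n)           # top-one translation for every line
    heap = [(-score(start), start)]  # min-heap on negated score
    seen = {start}
    blacklist = set()                # (line index, candidate index) pairs
    attempts = 0

    while heap and attempts < budget:
        _, choice = heapq.heappop(heap)

        # Successors: move one line to its next-best candidate.
        for i in range(n):
            if choice[i] + 1 < len(candidates[i]):
                nxt = choice[:i] + (choice[i] + 1,) + choice[i + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    heapq.heappush(heap, (-score(nxt), nxt))

        # Skip programs that reuse a blamed candidate; no budget is spent.
        if any((i, j) in blacklist for i, j in enumerate(choice)):
            continue

        attempts += 1
        program = [candidates[i][j][0] for i, j in enumerate(choice)]
        passed, bad_line = check(program)
        if passed:
            return program, attempts
        if bad_line is not None:
            blacklist.add((bad_line, choice[bad_line]))

    return None, attempts
```

With two pseudocode lines whose top-ranked candidates include one that fails to compile, the sketch spends one attempt on the top-one program, blames the offending line, and then tries the best remaining combination that avoids the blamed candidate. Skipping blacklisted combinations before they consume budget is the design choice that mirrors focusing the search on alternative translations for the failing portion.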

Comments:	Under submission to NeurIPS 2019
Subjects:	Machine Learning (cs.LG); Computation and Language (cs.CL); Programming Languages (cs.PL); Machine Learning (stat.ML)
Cite as:	arXiv:1906.04908 [cs.LG]
	(or arXiv:1906.04908v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1906.04908

Submission history

From: Sumith Kulal [view email]
[v1] Wed, 12 Jun 2019 03:13:18 UTC (90 KB)
