Codex - Evaluating Large Language Models Trained on Code
Citation: M. Chen et al., ‘Evaluating Large Language Models Trained on Code’. arXiv, Jul. 14, 2021. Available: http://arxiv.org/abs/2107.03374

Intro
Codex is a GPT language model fine-tuned on publicly available code from GitHub.
Task: docstring-conditional code generation.

Method
- Codex: fine-tune GPT-3 models containing up to 12B parameters on code to produce Codex.
- Codex-S: further fine-tune Codex on standalone, correctly implemented functions.

Inference
Assemble each HumanEval problem into a prompt consisting of a header, a signature, and a docstring, then sample completions from the model; a minimal sketch of this setup follows below. Nucleus sampling (Holtzman et al., 2020) with top p = 0.95 is used for all sampling evaluation in the paper ...
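Below is a minimal sketch of this docstring-conditional sampling setup, using a Hugging Face causal LM as a stand-in since the Codex weights are not public; the model name and the example prompt are placeholders, while the prompt layout (header, signature, docstring) and the top-p = 0.95 nucleus sampling follow the paper's description above.

```python
# Sketch: assemble a HumanEval-style prompt and sample a completion with
# nucleus (top-p) sampling. The model here is a placeholder, not Codex.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; Codex itself is a fine-tuned GPT-3 variant
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Example prompt in the HumanEval layout: header (imports), signature, docstring.
prompt = '''from typing import List

def has_close_elements(numbers: List[float], threshold: float) -> bool:
    """ Check if in the given list of numbers, any two numbers are closer to
    each other than the given threshold.
    """
'''

inputs = tokenizer(prompt, return_tensors="pt")

# Nucleus sampling with top_p = 0.95, as used for all sampling evaluation in the paper.
outputs = model.generate(
    **inputs,
    do_sample=True,
    top_p=0.95,
    max_new_tokens=128,
    pad_token_id=tokenizer.eos_token_id,
)

# Keep only the newly generated tokens (the candidate function body).
completion = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:])
print(completion)
```

In the paper's evaluation, multiple such samples are drawn per problem and checked against the problem's unit tests; the snippet above only shows how a single completion is produced.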