CoT on BBH - Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them

CoT on BBH: M. Suzgun et al., ‘Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them’. arXiv, Oct. 17, 2022. Available: http://arxiv.org/abs/2210.09261 Method: Apply chain-of-thought (CoT) prompting to BIG-Bench Hard tasks; evaluate few-shot performance via standard “answer-only” prompting and chain-of-thought prompting on the BIG-Bench Hard benchmark. Results/Analysis/Findings Benchmark: BIG-Bench Hard (BBH). These are the tasks on which prior language-model evaluations did not outperform the average human rater. Many tasks in BBH require multi-step reasoning...
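The contrast between the two few-shot prompting formats can be sketched as follows (the task wording and exemplar below are illustrative, not the actual BBH prompts, which are released with the paper):

```python
# Illustrative few-shot prompts. The date-understanding task text here is
# made up for illustration; the real BBH prompt files come from the paper.

task = "Q: Today is 3 Jan 2022. What is the date one week from today?"

# Standard "answer-only" prompting: the exemplar maps question -> answer directly.
answer_only_prompt = (
    "Q: Today is 1 Feb 2021. What is the date tomorrow?\n"
    "A: 02/02/2021\n\n"
    f"{task}\nA:"
)

# Chain-of-thought prompting: the exemplar shows intermediate reasoning
# before the final answer, eliciting step-by-step reasoning at test time.
cot_prompt = (
    "Q: Today is 1 Feb 2021. What is the date tomorrow?\n"
    "A: Let's think step by step. Today is February 1, 2021. "
    "Tomorrow is one day later, so the date is February 2, 2021. "
    "The answer is 02/02/2021.\n\n"
    f"{task}\nA: Let's think step by step."
)
```

The only difference between the two conditions is whether the in-context exemplars contain reasoning traces; the test question is identical.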

<span title='2022-11-13 00:00:00 +0000 UTC'>2022-11-13</span>&nbsp;·&nbsp;4 min&nbsp;·&nbsp;Cong Chan

Efficient Training of Language Models to Fill in the Middle

Bavarian, Mohammad, et al. Efficient Training of Language Models to Fill in the Middle. arXiv:2207.14255, arXiv, 28 July 2022. arXiv.org, http://arxiv.org/abs/2207.14255. Data: https://www.github.com/openai/human-eval-infilling TL;DR: Autoregressive language models can effectively learn to infill text by moving a span of text from the middle of a document to its end, without harming the original generative capability. Training models with this technique, called fill-in-the-middle (FIM), is useful, simple, and efficient, and should be used by default in future autoregressive language models....
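The core data transformation described above can be sketched in a few lines (a minimal illustration; the sentinel token names below are placeholders, not the actual vocabulary entries used in the paper):

```python
import random

# Placeholder sentinel strings marking the three segments; in practice these
# would be dedicated special tokens in the model's vocabulary.
PRE, SUF, MID = "<PRE>", "<SUF>", "<MID>"

def fim_transform(doc: str, rng: random.Random) -> str:
    """Split doc = prefix + middle + suffix at two random points, then
    rearrange to: PRE prefix SUF suffix MID middle."""
    # Pick two split points uniformly at random over character positions.
    i, j = sorted(rng.sample(range(len(doc) + 1), 2))
    prefix, middle, suffix = doc[:i], doc[i:j], doc[j:]
    # The model is trained autoregressively on this rearranged sequence,
    # so at inference time it can generate the middle given prefix and suffix.
    return f"{PRE}{prefix}{SUF}{suffix}{MID}{middle}"
```

Because the rearranged sequence is still trained with the ordinary next-token objective, infilling at inference time is just sampling after the `MID` sentinel, with the prefix and suffix supplied in the prompt.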

<span title='2022-11-11 00:00:00 +0000 UTC'>2022-11-11</span>&nbsp;·&nbsp;8 min&nbsp;·&nbsp;Cong Chan