An empirical analysis of compute-optimal large language model training

NeurIPS 2022
