Research

Efficient Transformer Knowledge Distillation: A Performance Review

Published in Empirical Methods in Natural Language Processing (EMNLP), 2023

This paper reviews the knowledge distillation of long-context, efficient-attention BERT-based models, yielding models that are smaller, faster, and cheaper to deploy.

Recommended citation: Brown, Nathan; Williamson, Ashton; Anderson, Tahj; and Lawrence, Logan. (2023). "Efficient Transformer Knowledge Distillation: A Performance Review." Empirical Methods in Natural Language Processing (EMNLP). https://arxiv.org/pdf/2311.13657.pdf