Research

Pula: Training Large Language Models for Setswana

Published in NAACL, 2025

Developed in partnership with the DSFSI group at the University of Pretoria, this work introduces three resources: Pula, the first suite of LLMs built for Setswana; Marothodi, the largest Setswana pre-training corpus; and Medupi, the first extensive Setswana instruction-tuning dataset.

Recommended citation: Brown, Nathan and Marivate, Vukosi. (2025). "Pula: Training Large Language Models for Setswana." NAACL 2025. https://aclanthology.org/2025.naacl-long.338/

Efficient Transformer Knowledge Distillation: A Performance Review

Published in Empirical Methods in Natural Language Processing (EMNLP), 2023

This paper discusses the distillation of long-context, efficient-attention BERT-based models to yield models that are smaller, faster, and cheaper to deploy.

Recommended citation: Brown, Nathan and Williamson, Ashton and Anderson, Tahj and Lawrence, Logan. (2023). "Efficient Transformer Knowledge Distillation: A Performance Review." Empirical Methods in Natural Language Processing. https://arxiv.org/pdf/2311.13657.pdf

Hospital Event Reports

Unpublished, 2022

This project fine-tuned embeddings and BERT-style models for tasks such as text clustering, sentiment analysis, and named entity recognition on hospital event reports.