
Colin Raffel

I am an Assistant Professor in the Department of Computer Science at the University of North Carolina at Chapel Hill. I also spend one day a week as a Faculty Researcher at Hugging Face. Much of my recent research focuses on machine learning algorithms for learning from limited labeled data, including semi-supervised, unsupervised, and transfer learning.

Group members


Nikhil Kandpal, PhD student at UNC
Derek Tam, PhD student at UNC (co-advised with Mohit Bansal)
Michael Matena, PhD student at UNC
Zhenlin Xu, PhD student at UNC (co-advised with Marc Niethammer)
Haokun Liu, PhD student at UNC
Anisha Mascarenhas, Master's student at UNC
Muqeeth Mohammed, Master's student at UNC
Vishal Baskaran, Master's student at UNC
Mansi Sakarvadia, Undergraduate at UNC
Tenghao Huang, Undergraduate at UNC
Ellie Evans, Undergraduate at UNC
Monty Evans, Undergraduate at UNC

Recent publications

(full list)

What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization?
Thomas Wang*, Adam Roberts*, Daniel Hesslow, Teven Le Scao, Hyung Won Chung, Iz Beltagy, Julien Launay, and Colin Raffel
39th International Conference on Machine Learning (ICML), 2022 (to appear).

Deduplicating Training Data Mitigates Privacy Risks in Language Models
Nikhil Kandpal, Eric Wallace, and Colin Raffel
39th International Conference on Machine Learning (ICML), 2022 (to appear).

PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts
Stephen H. Bach*, Victor Sanh*, Zheng-Xin Yong, Albert Webson, Colin Raffel, Nihal V. Nayak, Abheesht Sharma, Taewoon Kim, M Saiful Bari, Thibault Fevry, Zaid Alyafeai, Manan Dey, Andrea Santilli, Zhiqing Sun, Srulik Ben-David, Canwen Xu, Gunjan Chhablani, Han Wang, Jason Alan Fries, Maged S. Al-shaibani, Shanya Sharma, Urmish Thakker, Khalid Almubarak, Xiangru Tang, Dragomir Radev, Mike Tian-Jian Jiang, and Alexander M. Rush
60th Annual Meeting of the Association for Computational Linguistics (ACL), Demo Track, 2022.

Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning
Haokun Liu, Derek Tam, Mohammed Muqeeth, Jay Mohta, Tenghao Huang, Mohit Bansal, and Colin Raffel
arXiv preprint arXiv:2205.05638, 2022.

Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh*, Albert Webson*, Colin Raffel*, Stephen H. Bach*, Lintang Sutawika, Zaid Alyafeai, Antoine Chaffin, Arnaud Stiegler, Teven Le Scao, Arun Raja, Manan Dey, M Saiful Bari, Canwen Xu, Urmish Thakker, Shanya Sharma Sharma, Eliza Szczechla, Taewoon Kim, Gunjan Chhablani, Nihal Nayak, Debajyoti Datta, Jonathan Chang, Mike Tian-Jian Jiang, Han Wang, Matteo Manica, Sheng Shen, Zheng Xin Yong, Harshit Pandey, Rachel Bawden, Thomas Wang, Trishala Neeraj, Jos Rozen, Abheesht Sharma, Andrea Santilli, Thibault Fevry, Jason Alan Fries, Ryan Teehan, Stella Biderman, Leo Gao, Tali Bers, Thomas Wolf, and Alexander M. Rush
10th International Conference on Learning Representations (ICLR), 2022.

Scaling Up Models and Data with t5x and seqio
Adam Roberts, Hyung Won Chung, Anselm Levskaya, Gaurav Mishra, James Bradbury, Daniel Andor, Sharan Narang, Brian Lester, Colin Gaffney, Afroz Mohiuddin, Curtis Hawthorne, Aitor Lewkowycz, Alex Salcianu, Marc van Zee, Jacob Austin, Sebastian Goodman, Livio Baldini Soares, Haitang Hu, Sasha Tsvyashchenko, Aakanksha Chowdhery, Jasmijn Bastings, Jannis Bulian, Xavier Garcia, Jianmo Ni, Andrew Chen, Kathleen Kenealy, Jonathan H. Clark, Stephan Lee, Dan Garrette, James Lee-Thorp, Colin Raffel, Noam Shazeer, Marvin Ritter, Maarten Bosma, Alexandre Passos, Jeremy Maitin-Shepard, Noah Fiedel, Mark Omernick, Brennan Saeta, Ryan Sepassi, Alexander Spiridonov, Joshua Newlan, and Andrea Gesmundo
arXiv preprint arXiv:2203.17189, 2022.

ByT5: Towards a token-free future with pre-trained byte-to-byte models
Linting Xue*, Aditya Barua*, Noah Constant*, Rami Al-Rfou*, Sharan Narang, Mihir Kale, Adam Roberts, and Colin Raffel
Transactions of the Association for Computational Linguistics (TACL), 2022.

Between words and characters: A Brief History of Open-Vocabulary Modeling and Tokenization in NLP
Sabrina J. Mielke, Zaid Alyafeai, Elizabeth Salesky, Colin Raffel, Manan Dey, Matthias Gallé, Arun Raja, Chenglei Si, Wilson Y. Lee, Benoît Sagot, and Samson Tan
arXiv preprint arXiv:2112.10508, 2021.

Training Neural Networks with Fixed Sparse Masks
Yi-Lin Sung*, Varun Nair*, and Colin Raffel
35th Conference on Neural Information Processing Systems (NeurIPS), 2021.

Merging Models with Fisher-Weighted Averaging
Michael Matena and Colin Raffel
arXiv preprint arXiv:2111.09832, 2021.

Improving and Simplifying Pattern Exploiting Training
Derek Tam*, Rakesh R Menon*, Mohit Bansal, Shashank Srivastava, and Colin Raffel
2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.

Do Transformer Modifications Transfer Across Implementations and Applications?
Sharan Narang, Hyung Won Chung, Yi Tay, William Fedus, Thibault Fevry, Michael Matena, Karishma Malkan, Noah Fiedel, Noam Shazeer, Zhenzhong Lan, Yanqi Zhou, Wei Li, Nan Ding, Jake Marcus, Adam Roberts, and Colin Raffel
2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.

On Training Sample Memorization: Lessons from Benchmarking Generative Modeling with a Large-scale Competition
Ching-Yuan Bai, Hsuan-Tien Lin, Colin Raffel, and Wendy Chih-wen Kan
27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2021.

Extracting Training Data from Large Language Models
Nicholas Carlini, Florian Tramer, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom Brown, Dawn Song, Ulfar Erlingsson, Alina Oprea, and Colin Raffel
30th USENIX Security Symposium, 2021.

Talks

Less Data, More ___? Data Augmentation and Semi-Supervised Learning for Natural Language Processing at the 60th Annual Meeting of the Association for Computational Linguistics (ACL) Tutorials, 2022.

A call to build models like we build open-source software at Cornell University Artificial Intelligence Seminar, Georgia Tech NLP Seminar, UMass Amherst Machine Learning & Friends Lunch, UC Santa Barbara NLP Seminar, 2021.

A few possibly controversial opinions about large language models at Carnegie Mellon University Language Technologies Topical Seminar, 2021.

The Sweet Lesson at SustaiNLP Workshop, 2021.

What do language models learn from language modeling? at Stanford University CS 330 Lecture, 2021.

How and why should(n't) we scale machine learning? at IBM AI Hardware Forum Keynote, 2021.

A better way to get language models to do what you ask at AKBC 2021 Unstructured and Structured Knowledge Bases Workshop and Cohere.ai, 2021.

Scaling up Models and Data at CIFAR Deep Learning and Reinforcement Learning Summer School, Nepal Winter School in AI, and Advanced Language Processing Winter School, 2021.

Explicit and Implicit Entropy Minimization in Proxy-Label-Based Semi-Supervised Learning at CVPR Workshop on Learning with Limited and Imperfect Data, 2021.

The benefits of unified frameworks for language understanding at Conceptual Understanding of Deep Learning Workshop, 2021.

T5 and large language models: The good, the bad, and the ugly at Stanford University CS 224n Lecture, CU Boulder Applied Mathematics Colloquium, Twitter Machine Learning Seminar, Google Graduate Symposium & TTIC NLP Seminar, 2020.

Responsible publication: NLP case study at Navigating the Broader Impacts of AI Research Workshop Panel, 2020.

What Can MIR Learn From Transfer Learning in NLP? at NLP for Music and Audio Workshop Keynote, 2020.

Transfer Learning for NLP: T5 and Beyond at Montreal Institute for Learning Algorithms Tea Talk & Spotify Research Seminar, 2020.

Answering Questions by Querying the Implicit Knowledge Base Inside T5 at AKBC 2020 Unstructured and Structured Knowledge Bases Workshop, 2020.

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer at Allen Institute for Artificial Intelligence & New York University CILVR Seminar, 2019.

Outskirts of Deep Generative Modeling at Faculty Job Talk, 2019.

Why are GANs Interesting? at New York University CILVR Seminar, 2018.

A Few Unusual Autoencoders at Vector Institute, New York University & San Francisco State University, 2018.

Leveraging MIDI Files for Music Information Retrieval at 18th International Society for Music Information Retrieval Conference Tutorials, 2017.

Doing Strange Things with Attention at AI With The Best & 1st USF Data Institute Conference, 2017.

The Lakh MIDI Dataset: How It Was Made, and How to Use It at BISH Bash Meetup, Centre for Digital Music Seminar & Jukedeck Lunch and Learn, 2016.

Learning-Based Methods for Comparing Sequences, with Applications to Audio-to-MIDI Alignment and Matching at 2nd ICML Machine Learning for Music Discovery Workshop, 2016.

Accelerating Large-Scale Sequence Retrieval with Convolutional Networks at IIT Bombay Electrical Engineering Seminar, 2015.

Learning Efficient Representations for Sequence Retrieval at Boston Data Festival, 2015.

Using Convolutional Networks (with Attention) for Orders-of-Magnitude Speedup of DTW-Based Sequence Retrieval at Spotify Machine Learning Seminar, 2015.

Recurrent Networks in Lasagne at Mount Sinai Hammer Lab Seminar, 2015.

Lasagne Tutorial at Next.ml Boston, 2015.

Theano Tutorial at Next.ml Boston, 2015.

mir_eval at Objective Evaluation in Semantic Audio Analysis and Processing Panel at the 138th Convention of the Audio Engineering Society, 2015.

Large-Scale Content-Based Matching of Audio and MIDI Data at Stanford University DSP Seminar, 2015.

Advances and Challenges in Large-Scale Music Information Retrieval at Digital Music Research Network One-day Workshop (DMRN+8), 2013.

Quantifying Rhythmic Synchrony at Midwestern Music Cognition Symposium, 2013.

A Sequential Approach to Musical Event Detection at Carnegie Mellon University Music and Technology Seminar, 2011.

ROW-mp3: An Enhanced MP3-Compatible Audio Codec at Stanford University DSP Seminar, 2010.

An Effective Model of Bucket-Brigade Device-Based Audio Circuits at Stanford University DSP Seminar, 2010.

Voltage-Controlled Resistance: Modulate Anything at Circuitastrophe Circuit Bending Music Festival, 2008.