Talk Info: Zhiting Hu

Title: Text Generation with No (Good) Data: New Reinforcement Learning and Causal Frameworks

Abstract: Text generation systems, especially powered by massive pretrained language models (LMs), are increasingly used in real applications. However, the usual maximum-likelihood (MLE) based training or finetuning relies on clean text examples for direct supervision. The approach is not applicable to many emerging problems where we have access only to noisy or weak-supervision data, or biased data with spurious correlations, or no data at all. Such problems include generating text prompts for massive LMs, generating adversarial attacks, and various controllable generation tasks, etc. In this talk, I will introduce new modeling and learning frameworks for text generation, including: (1) A new reinforcement learning (RL) formulation for training with arbitrary reward functions. Building upon the latest advances in soft Q-learning, the approach alleviates the previous fundamental challenges of sparse reward and large action space, resulting in a simple and efficient algorithm and strong results on various problems; (2) A causal framework for controllable generation that offers a new lens for text modeling from the principled causal perspective. It allows us to eliminate generation biases inherited from training data using rich causality tools (e.g., intervention, counterfactuals). We show its significant improvement for both learning unbiased controllable generation models and de-biasing existing pretrained LMs.

Bio: Zhiting Hu is an incoming Assistant Professor in Halicioglu Data Science Institute at UC San Diego and a visiting research scientist at Amazon. He received his Bachelor’s degree in Computer Science from Peking University, and his Ph.D. in Machine Learning from Carnegie Mellon University. His research interests lie in the broad area of machine learning, natural language processing, ML systems, healthcare and other application domains. In particular, He is interested in principles, methodologies, and systems of training AI agents with all types of experiences (data, knowledge, rewards, adversaries, lifelong interplay, etc). His research was recognized with best demo nomination at ACL2019 and outstanding paper award at ACL2016.

Video:

Slides: