Title: Building and Evaluating Controllable Models for Text Simplification

 

Date: Friday, August 4, 2023

Time: 2:30pm - 4:30pm ET

Location: https://gatech.zoom.us/j/92877588273

 

Mounica Maddela

PhD Student in Computer Science

School of Interactive Computing

College of Computing

Georgia Institute of Technology

 

Committee

Dr. Wei Xu (Advisor), School of Interactive Computing, Georgia Tech

Dr. Alan Ritter, School of Interactive Computing, Georgia Tech

Dr. Mark Riedl, School of Interactive Computing, Georgia Tech

Dr. Colin Cherry, Google Research

Dr. Y-Lan Boureau, Meta AI Research

 

Abstract

Although existing natural language generation (NLG) systems have made great progress in generating fluent text that is indistinguishable from human-written text, they still lack the capability to adapt to specific constraints or attributes that are crucial for practical applications. An emerging trend in NLG is the development of controllable generation methods, which produce text while controlling attributes such as sentiment, formality, politeness, and topic.

 

In this dissertation, I focus on controllable text generation for Automatic Text Simplification (ATS). ATS aims to improve the readability of texts by using simpler grammar and word choices while preserving the original meaning. It is an audience-dependent task because readability constraints vary with the target population. Controllability is therefore essential for ATS systems to generate text that adheres to diverse readability constraints. An ideal simplification system should be able to control attributes of the generated text, such as syntactic structure, length, readability level, and word choice, so that the output suits the situation. However, existing simplification systems lack the capability to adapt to different readability constraints.

 

To address these issues, I develop two novel controllable approaches for ATS: a sentence simplification system that combines linguistic rules with Transformer models to generate simplified sentences at different readability levels, and a lexical simplification system that leverages human judgments of word complexity to replace complex words with simpler phrases. Finally, I propose LENS, the first supervised automatic evaluation metric for ATS, which captures multiple simplification styles and outperforms existing metrics in evaluating controllable simplification systems. To train and evaluate LENS, I create SIMPEVAL, the first metric evaluation benchmark that incorporates different types of simplification operations.