Crowdsourcing for Human and Machine Translation

Sunday, February 14, 2016: 10:00 AM-11:30 AM
Harding (Marriott Wardman Park)
Chris Callison-Burch, University of Pennsylvania, Philadelphia, PA
Modern approaches to machine translation are data-driven. Statistical models are trained using bilingual parallel text. One advantage of statistical translation models is that they are language-independent, given sufficient data. Unfortunately, most of the world's languages do not have large amounts of training data. I will detail my experiments using Amazon Mechanical Turk to create crowdsourced translations for "low resource" languages for which we have no training data.