Institute assembly on 15 June 2011 at 11:30, Room M 12.21
Statistical Machine Translation with Weighted Grammars
Talk by Matthias Büchse, PhD student in computer science, Technische Universität Dresden
Weighted grammars have a firm place in statistical machine translation (SMT) research. Recent examples of such grammars are synchronous context-free grammars (Chiang, 2007), synchronous tree-insertion grammars (Nesson, Shieber, and Rush, 2006), and synchronous tree-adjoining grammars (DeNeefe and Knight, 2009). The systems built on each of these formalisms achieve significant BLEU scores.
One benefit of grammar-based models is that many results from formal-language theory and automata theory can be transferred to SMT immediately or with little effort. Examples of this transfer are the problems of intersecting languages and finding shortest paths, both of which occur frequently in decoding. In addition, translation systems specified in such a framework can run on any platform that offers a corresponding toolkit.
In this talk, we first briefly recall the four main tasks of SMT: modeling, training, decoding, and evaluation. Then, guided by an example, we approach these tasks from the perspective of weighted grammars. If time permits, we briefly review the state of the art in this setting.
David Chiang, 2007. Hierarchical phrase-based translation. In Computational Linguistics 33(2):201–228. http://www.mitpressjournals.org/doi/pdf/10.1162/coli.2007.33.2.201
Rebecca Nesson, Stuart M. Shieber, and Alexander Rush, 2006. Induction of probabilistic synchronous tree-insertion grammars for machine translation. In Proc. AMTA 2006. http://www.eecs.harvard.edu/~shieber/Biblio/Papers/Nesson-2006-IPS.pdf
Steve DeNeefe and Kevin Knight, 2009. Synchronous tree adjoining machine translation. In Proc. EMNLP 2009. http://www.isi.edu/natural-language/mt/adjoin09.pdf