Projects

Here are a few of the projects and publications I’ve worked on.

Relambda (1 September 2019)

Relambda is a fast bytecode-compiling Unlambda interpreter I wrote to experiment with bytecode compilation and other concepts I learned in the process of writing a Scheme interpreter. I also have a blog post describing the design.

When and Why are Pre-trained Word Embeddings Useful for Neural Machine Translation? (18 April 2018)

Ye Qi, Devendra Singh Sachan, Matthieu Felix, Sarguna Padmanabhan, Graham Neubig: When and Why Are Pre-Trained Word Embeddings Useful for Neural Machine Translation? NAACL-HLT (2) 2018: 529-535

The performance of Neural Machine Translation (NMT) systems often suffers in low-resource scenarios where sufficiently large-scale parallel corpora cannot be obtained. Pre-trained word embeddings have proven to be invaluable for improving performance in natural language analysis tasks, which often suffer from paucity of data. However, their utility for NMT has not been extensively explored. In this work, we perform five sets of experiments that analyze when we can expect pre-trained word embeddings to help in NMT tasks. We show that such embeddings can be surprisingly effective in some cases – providing gains of up to 20 BLEU points in the most favorable setting.

XNMT: The eXtensible Neural Machine Translation Toolkit (1 March 2018)

Graham Neubig, Matthias Sperber, Xinyi Wang, Matthieu Felix, Austin Matthews, Sarguna Padmanabhan, Ye Qi, Devendra Singh Sachan, Philip Arthur, Pierre Godard, John Hewitt, Rachid Riad, Liming Wang: XNMT: The eXtensible Neural Machine Translation Toolkit. AMTA (1) 2018: 185-192

This paper describes XNMT, the eXtensible Neural Machine Translation toolkit. XNMT distinguishes itself from other open-source NMT toolkits by its focus on modular code design, with the purpose of enabling fast iteration in research and replicable, reliable results. In this paper we describe the design of XNMT and its experiment configuration system, and demonstrate its utility on the tasks of machine translation, speech recognition, and multi-tasked machine translation/parsing. XNMT is available open-source on GitHub.

IPAfy (27 July 2017)

IPAfy is a Firefox extension that lets you convert English-language web pages to the International Phonetic Alphabet, or IPA.

Textually Enriched Neural Module Networks for Visual Question Answering (13 June 2017)

Chandu, Khyathi Raghavi, Mary Arpita Pyreddy, Matthieu Felix, and Narendra Nath Joshi. “Textually Enriched Neural Module Networks for Visual Question Answering.” arXiv preprint arXiv:1809.08697 (2018).

Problems at the intersection of language and vision, like visual question answering, have recently been gaining a lot of attention in the field of multi-modal machine learning as computer vision research moves beyond traditional recognition tasks. There has been recent success in visual question answering using deep neural network models which use the linguistic structure of the questions to dynamically instantiate network layouts. In the process of converting the question to a network layout, the question is simplified, which results in loss of information in the model. In this paper, we enrich the image information with textual data using image captions and external knowledge bases to generate more coherent answers. We achieve 57.1% overall accuracy on the test-dev open-ended questions from the visual question answering (VQA 1.0) real image dataset.

Trade sign autocorrelation on electronic markets (28 November 2016)

This project studies how the structure of transaction sign autocorrelation on electronic stock markets evolved between 2000 and 2013. Previous work on this topic, based on data from the first half of the 2000s, describes a very long memory in these signs (with a power-law decay of the autocorrelation). It is then natural to ask whether the massive rise of high-frequency trading and automated execution strategies has significantly changed this structure.
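
As a rough illustration of the quantity being studied (a sketch, not the project's actual code), the sample autocorrelation of a sequence of trade signs, with +1 for a buy and -1 for a sell, can be computed as below. The function name `sign_autocorrelation` and the synthetic data are assumptions made for this example only.

```python
import random


def sign_autocorrelation(signs, lag):
    """Sample autocorrelation of a trade sign series (+1 buy, -1 sell) at a given lag."""
    n = len(signs)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(signs)")
    mean = sum(signs) / n
    var = sum((s - mean) ** 2 for s in signs) / n
    if var == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    # Covariance between the series and a copy of itself shifted by `lag`.
    cov = sum((signs[t] - mean) * (signs[t + lag] - mean) for t in range(n - lag)) / n
    return cov / var


# Synthetic, made-up data: each sign repeats the previous one with probability 0.8.
# This toy process has short (exponentially decaying) memory, so it does NOT
# reproduce the power-law decay reported for real markets.
random.seed(0)
signs = [1]
for _ in range(9999):
    signs.append(signs[-1] if random.random() < 0.8 else -signs[-1])

for lag in (1, 10, 100):
    print(lag, round(sign_autocorrelation(signs, lag), 3))
```

On real trade data, plotting these values against the lag on log-log axes is one way to check for a power-law decay, which would show up as an approximately straight line.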