Impressions from NAACL

2015, Denver, Colorado

The highlight of this conference was an invited talk about applying computational linguistics to pragmatics and social science. The use of annotated data to evaluate models is central to computational linguistics. The talk presented three results; what they have in common is the clever use of readily available data such as website comments, ratings, and debate transcripts. Topics in the talk included why certain movie quotes become popular, why tweets by influential people are re-tweeted, and the subtle ways in which hedging in a debate reveals the authority of the speaker.
Video recording of invited talk

The main reason to go to Denver was for Corina and me to present our paper on literariness prediction and to listen to the other presentations at the Computational Linguistics for Literature workshop. Matthew Jockers presented his ongoing work on plotting the sentiment in novels. The plots are based on the counts of positive and negative words. Using various techniques such as Fourier transforms, the plots of different novels can be compared in order to investigate whether there exist a small number of archetypical plot shapes.
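
To make the idea of a sentiment plot concrete, here is a minimal sketch of how such a curve could be computed and smoothed. It is not the method from the talk: the tiny word lists, the function names `sentiment_trajectory` and `low_pass`, and the choice of a plain discrete Fourier transform as the smoother are all assumptions made for illustration.

```python
import cmath

# Toy lexicons, purely illustrative; a real study would use a full sentiment lexicon.
POSITIVE = {"love", "joy", "happy", "hope"}
NEGATIVE = {"death", "grief", "sad", "fear"}


def sentiment_trajectory(tokens, n_windows):
    """Score each of n_windows equal-sized chunks as (#positive - #negative).

    Tokens left over after the last full window are ignored.
    """
    size = max(1, len(tokens) // n_windows)
    scores = []
    for i in range(n_windows):
        window = tokens[i * size:(i + 1) * size]
        pos = sum(tok.lower() in POSITIVE for tok in window)
        neg = sum(tok.lower() in NEGATIVE for tok in window)
        scores.append(pos - neg)
    return scores


def low_pass(values, keep):
    """Smooth a series by keeping only frequencies 0..keep of its DFT."""
    n = len(values)
    spectrum = [
        sum(v * cmath.exp(-2j * cmath.pi * k * t / n) for t, v in enumerate(values))
        for k in range(n)
    ]
    # Frequencies k and n - k mirror each other; keep both so the output stays real.
    filtered = [c if min(k, n - k) <= keep else 0 for k, c in enumerate(spectrum)]
    return [
        sum(c * cmath.exp(2j * cmath.pi * k * t / n) for k, c in enumerate(filtered)).real / n
        for t in range(n)
    ]
```

Keeping only a handful of low frequencies gives every novel a curve of the same coarse resolution, which is what makes the shapes of novels of different lengths comparable.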

The best paper presentation was on a system to analyze rhyme in poetry. The paper presents a formalism to define a wide variety of types of rhyme, as well as an open source implementation, RhymeDesign, for applying the formalism to a corpus of poems.

Two talks in the main conference caught my interest. The first is “Not All Character N-grams Are Created Equal.” This paper investigates one of the most successful features in authorship attribution: strings of characters of fixed length, without regard for word boundaries. It turns out different kinds of character n-grams can be distinguished: those at the beginning, end, and middle of words. The first two capture morphological aspects such as affixes and inflections, while the other character n-grams capture small function words and parts of longer words.
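
As a rough illustration of the distinction, the sketch below tags each character n-gram with its position in the word. The four labels and the function name `typed_char_ngrams` are my own simplification for this post, not the category scheme used in the paper, which is more fine-grained.

```python
from collections import Counter


def typed_char_ngrams(text, n=3):
    """Count character n-grams per word, labelled by position in the word.

    Simplified, hypothetical labels: "whole" (the word is exactly n long),
    "prefix", "suffix", and "mid". Words shorter than n are skipped, and
    n-grams spanning a word boundary are not generated here.
    """
    counts = Counter()
    for word in text.lower().split():
        if len(word) < n:
            continue
        if len(word) == n:
            counts[("whole", word)] += 1
            continue
        last = len(word) - n  # start index of the final n-gram in the word
        for i in range(last + 1):
            kind = "prefix" if i == 0 else "suffix" if i == last else "mid"
            counts[(kind, word[i:i + n])] += 1
    return counts


features = typed_char_ngrams("the thesis")
# The same string "the" now yields two distinct features:
# ("whole", "the") from the function word and ("prefix", "the") from "thesis".
```

With untyped n-grams both occurrences of "the" would be merged into a single count; the labels keep the function word apart from the word prefix.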

The second talk is “Hierarchic syntax improves reading time prediction”. There has been a controversy in computational psycholinguistics about whether humans use hierarchical syntax during language processing, prompted by a paper by Frank and Bod (2011), among others, which showed that human reading times can be better predicted by a simple linear model than by a model that employs hierarchical syntactic information such as that from constituency trees. This paper shows that the models they used can be improved, and that hierarchical syntax does improve over a linear model after all. Interestingly, modeling long-distance dependencies is one of the ways to improve on the simpler baseline models. An interesting point raised in the Q&A is that the line between a linear and a hierarchical model can become fuzzy, since it is not possible to rule out that a sufficiently complex linear model indirectly learns hierarchical information.
Video recording of talk

Oh, and if you’re ever in Denver, I recommend checking out the Mercury Café.