PhD research

Laura Alonso i Alemany

In my PhD research I set out to provide a model of discourse structure that is useful for automatic text summarization. This resulted in the thesis

Representing discourse for automatic text summarization via shallow NLP

where I give an account of how shallow textual cues contribute to a partial representation of discourse that is useful for assessing the relevance and coherence of texts, and thereby for improving automatic summarization. You can read the [printable version] or the [screenable version], or else take a look at the slides that I used to present my work.
Any comments will be most welcome!

Of all textual cues with discursive meaning, I have mainly focused on discourse markers. I have tried to analyze and systematize those aspects of the meaning of discourse markers that are useful for text summarization with shallow techniques. You can check, use and criticize the initial parallel lexicon of discourse markers in Catalan, Spanish and English that I collected and characterized.

We have applied the representation of discourse that I propose to hand-tag a small corpus of journalistic articles in Spanish. The corpus, annotated with discursive information, provides insightful data on the behaviour of different discursive meanings, which we began to explore in this paper.

I have implemented the proposed representation in a shallow discourse parser, which the e-mail summarizer CARPANTA uses to produce nicer summaries of e-mail. Do you want to try it?