Guillaume Jacquet's Home Page

Research Interests

Information Extraction, Word Sense Disambiguation, Named Entity Recognition, Textual Entailment, Opinion Mining, Distributional Analysis, hybrid methods.

I have been a Research Scientist at Xerox Research Centre Europe (XRCE) in Grenoble since 2006, in the Parsing & Semantics area (former manager: F. Segond, new manager: TBC). My current research focuses on semantic information extraction from text, in particular the following approaches:

Named entity processing (recognition, fine-grained annotation, disambiguation, metonymy resolution), including cross-document coreference.
Textual entailment (combining the processing of multiple linguistic phenomena to determine whether one text segment entails another).
Linguistic information extraction for opinion mining.
Development of hybrid methods (combining symbolic and statistical approaches).

Research Projects

2009-today: SynC3 project

SynC3 is a European project involving 9 partners (research labs, journalistic organizations, companies such as Xerox and Google). (official web site).

Overview: SynC3 addresses the business areas of news reporting and public opinion formation. It aims to deliver a framework that combines news events and their relations with users and their opinions, and that produces the SynC3 graph of blog users' commentaries on current news. To this end, SynC3 will develop accurate methods for connecting users to other users and to news content.

XRCE is involved in this project as a technology provider: it provides linguistic information extracted from news articles and clusters the news articles into news events, using an approach that combines statistical processing and linguistic methods applied to both low-level and high-level features.

In this project, I'm strongly involved in the linguistic information extraction part and partly involved in the news article clustering part.

2006-2009: Infom@gic project

Infom@gic is a French project from the "pôle de compétitivité Cap Digital" involving 30 partners (research labs or companies such as Xerox, EADS or Thales). (official web site).

Overview: develop new tools for information extraction, retrieval and analysis from multi-source data (text, image, sound, structured data).

XRCE was involved in this project as the leader of the text processing part and in different tasks, such as information extraction using linguistic methods and use cases such as risk detection. XRCE was also involved in the development of the UIMA platform that was used during the project.

I was strongly involved in the aforementioned tasks.

2007: SemEval-2007

This is the 4th International Workshop on Semantic Evaluations. We (Brun et al., 2007) participated in the "named entity metonymy resolution" task, and our system ranked second for the location named entity type and third for the organization named entity type.

2003-2005: ILF Project. Title: « polysemic verbs: the role of syntactic constructions ».

Project involving the LATTICE (Paris), CRISCO (Caen) and ERSS (Toulouse) laboratories.

Overview: theoretical study of verbal polysemy and its relation with syntactic constructions. Implementation of this theoretical study based on a graph approach.

I was responsible for the integration of all the modules used for this project: integrating the parsers "Syntex" (D. Bourigault) and "Wims" (E. Giguet) with the graph processing software "Visusyn", importing text data from "Frantext" (ATILF), and implementing the sense calculation model.

Teaching

2006-today: University of Grenoble 3, France

Graduate course: Semantics and NLP

2002-2005: University of Paris-Dauphine, France

Graduate course: Operational Research (using Graph Theory for Mathematical problems)
Undergraduate course: Computer programming (Java)

Education

Ecole Normale Supérieure / University of Paris VI / Ecole Polytechnique

2005: Ph.D. in Cognitive Sciences (specialized in Computational Linguistics)
Title: A continuous model for Verb Sense Disambiguation (Polysémie verbale et calcul du sens).
Supervisor: B. Victorri, Lattice lab (CNRS-ENS).

2002: Master (Graduate Degree) in Cognitive Sciences (specialized in Computational Linguistics)

University of Paris IV

2001: M.S.T. (equivalent to a B.A.) in Mathematics and Computer Science applied to Social Sciences.

Publications and patents

Boards and academic activities

2009-today: Board member of the UFR "Sciences du Langage" (Linguistics), Grenoble 3.

2009-today: Board member of ATALA (French national association for NLP).

2007-today: Member of the jury for XRCE patent proposal evaluation.

Program committee member/reviewer: EACL 2009, SemEval 2007, LTC 2007, FinTAL 2006, TALN 2008-2012 and TAL journal.

Technical skills

Systems: Windows, Unix

Programming languages: Matlab, Java, Python, Perl

Frameworks: UIMA, Hadoop

Languages: French and English