Textual Inference
In my PhD research, I have been working on Textual
Inference, which is a fundamental problem in natural language
understanding. Roughly speaking, the ultimate goal is to
enable a computer to draw inferences from natural language texts, for example
to decide whether one text is entailed by another or whether the two have the same
meaning. On the one hand, it touches the key issue of connecting
meaning representation with various linguistic expressions; on the
other hand, it also serves real applications, such as question
answering, information retrieval and extraction, and machine translation.
My earlier work, in my Master's thesis, used a subsequence-kernel-based
machine learning method to measure the similarity between dependency
paths (Wang and Neumann, 2007a). Because that method had
relatively high accuracy but low coverage, I developed
more specialized modules to handle the entailment cases that
it could not cover. For instance, my
colleagues at DFKI and I investigated entailment cases with temporal
expressions (Wang and Zhang, 2008), and also with other named entities
(NEs) such as location names, using a geographical ontology (Wang and
Neumann, 2008c). Recently, I also collaborated with a colleague in my
department on applying inference rules to this task (Dinu and
Wang, 2009).
Last year I participated in the Recognizing Textual Entailment
challenge (RTE-4) with my colleagues at DFKI, and we ranked 3rd
among the 26 groups from research institutes and industry
(Wang and Neumann, 2009). The evaluation on a common dataset shows
promising results (more than 70% accuracy), which encourages us to continue
this line of research. Furthermore, we explored using
this technology in other applications by participating in the Answer
Validation Exercise (AVE) in 2007 (Wang and Neumann, 2007b) and in 2008
(Wang and Neumann, 2008b). We also showed that the approach can be
extended to other validation tasks, such as relation validation (Wang and Neumann,
2008a).
Parsing
In order to obtain a proper meaning representation,
I also work on syntactic dependency parsing and semantic role labeling,
which are both fundamental tasks for natural language processing (NLP).
My colleagues and I participated in the CoNLL 2008 shared task
and achieved 2nd place in syntactic parsing and 7th place
in semantic role labeling among the 24 submissions (Zhang et al., 2008). In
addition, we obtained 1st place in the open challenge, in which any
external resources could be used. We were the only team to show
improvement after using a hand-crafted HPSG grammar.
Others
Apart from these, I am also interested in many other NLP tasks, such as
closed-domain question answering (Wang and Yao, 2004), Chinese
question classification (Wang, 2005), question
answering on speech transcriptions (Neumann and Wang, 2007), NE
recognition (Wang et al., 2005), and opinion mining (Yao et al.,
2008).
Concerning cross-field collaborations, I worked with friends
in the computational biology department on extracting protein mutations from
biological literature (Wang et al., 2009), and with friends in Italy on
sketch recognition (Avola et al., 2009).