|
1. Task description
2. Data set
3. Evaluation methodology
4. Copyright and license issue
5. Task schedule
6. References
Organizer
|
|
Word sense disambiguation (WSD) and semantic role labeling (SRL) are two important semantic analysis techniques for NLP applications. To improve their performance, many evaluation tasks have been designed in recent years, including Senseval-1/2/3 ([1], [2], [3]), CoNLL-2004/2005 ([4], [5]) and SemEval-2007 [6]. However, almost all of these tasks were designed to test only one of the two techniques: the WSD tasks in Senseval-1/2, the SRL tasks on PropBank corpora in CoNLL-2004/2005, and the SRL tasks on FrameNet corpora in Senseval-3 and SemEval-2007. Only a few tasks are designed to test the interaction of the two techniques, such as the frame semantic structure extraction task in SemEval-2007 [7]. Moreover, almost all of these tasks are designed for English; only a few address other languages, such as Chinese [8] and Spanish [9].
We consider event detection an important semantic analysis task for real-world sentences. We use a situation description formula to represent the content of an event. For example, the ‘buy’ event is represented by the formula DO(x, P(x,y)) CAUSE (have(x,y) & NOT have(z,y)), where P, x, y and z are situation arguments; ‘DO’, ‘CAUSE’, ‘&’ and ‘NOT’ are meta-logical predicates; and ‘have’ is a primitive predicate designed specifically to describe ownership relations.
In this task, we focus on only the following two types of event descriptions:
- Ownership relations and their changes, such as possession transfer;
- Existence states and their changes across locations and times, such as moving, living and dying.
We regard them as the basic event units for describing more complex phenomena in natural language texts.
The goal of the task is to detect and analyze these event contents in real-world Chinese news texts. It consists of finding the key verbs or verb phrases that describe these events in Chinese sentences after word segmentation and part-of-speech (POS) tagging, selecting a suitable situation description formula for each of them, and anchoring the situation arguments to suitable syntactic chunks in the sentence. The three main sub-tasks are as follows:
- Target verb WSD: to recognize whether the sentence contains key verbs or verb phrases that describe the two focused event types, and to select a suitable situation description formula for each recognized key verb (or verb phrase) from a situation network lexicon.
The input of the sub-task is a Chinese sentence annotated with correct word segmentation and POS tags. Its output is the sense selection (disambiguation) tags of the target verbs in the sentence.
- Sentence SRL: to anchor the situation arguments to suitable syntactic chunks in the sentence, and to annotate these arguments with suitable syntactic constituent and functional tags.
Its input is a Chinese sentence annotated with correct word segmentation, POS tags and the sense tags of the target verbs in the sentence. Its output is the syntactic chunk recognition and situation argument anchoring results.
- Event detection: to detect and analyze the focused event content through the interaction of target verb WSD and sentence SRL.
Its input is a Chinese sentence annotated with correct word segmentation and POS tags. Its output is a complete event description detected in the sentence (if the sentence contains a focused target verb).
The following detailed example illustrates the procedure.
Consider this Chinese sentence after word segmentation and POS tagging:
今天/n(Today) 我/r(I) 在/p(at) 书店/n(bookstore) 买/v(buy) 了/u(-ed) 三/m(three) 本/q 新/a(new) 书/n(book) 。/w (Today, I bought three new books at the bookstore.)
In the first processing stage, target verb WSD, we find the possession-transfer verb ‘买/v(buy)’ in the sentence and select the following situation description formula for it:
买/v(buy): DO(x, P(x,y)) CAUSE have(x,y) AND NOT have(z,y) [P=buy]
Then we anchor the four situation arguments to suitable syntactic chunks in the sentence and obtain the following sentence SRL result:
今天/n(Today) [S-np 我/r(I) ]x [D-pp 在/p(at) 书店/n(bookstore) ]z [P-vp 买/v(buy) 了/u(-ed) ]Tgt [O-np 三/m(three) 本/q 新/a(new) 书/n(book) ]y 。/w
Finally, we obtain the following situation description for the sentence:
DO(x, P(x,y)) CAUSE have(x,y) AND NOT have(z,y) [x=我/r(I), y=三/m(three) 本/q 新/a(new) 书/n(book), z=书店/n(bookstore), P=买/v(buy)] |
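The bracketed SRL notation in the example above can be read mechanically. The following Python sketch is only an illustration, not the organizer's tooling. It assumes every chunk has the shape "[F-c words ]label", where F is the functional tag, c the syntactic constituent tag and label the situation argument (x, y, z or Tgt); the names Chunk, CHUNK_RE and parse_chunks are hypothetical.

```python
import re
from typing import Dict, NamedTuple


class Chunk(NamedTuple):
    function: str      # functional tag, e.g. "S", "D", "P", "O"
    constituent: str   # syntactic constituent tag, e.g. "np", "pp", "vp"
    text: str          # the segmented, POS-tagged words inside the brackets


# Assumed chunk shape: "[F-c words ]label"
CHUNK_RE = re.compile(r"\[(\w+)-(\w+)\s+([^\]]*?)\s*\](\w+)")


def parse_chunks(annotation: str) -> Dict[str, Chunk]:
    """Map each situation argument label to the chunk it is anchored to."""
    return {
        label: Chunk(function, constituent, text)
        for function, constituent, text, label in CHUNK_RE.findall(annotation)
    }


srl = (
    "今天/n(Today) [S-np 我/r(I) ]x [D-pp 在/p(at) 书店/n(bookstore) ]z "
    "[P-vp 买/v(buy) 了/u(-ed) ]Tgt "
    "[O-np 三/m(three) 本/q 新/a(new) 书/n(book) ]y 。/w"
)
for label, chunk in parse_chunks(srl).items():
    print(label, chunk.function, chunk.constituent, chunk.text)
```

Note that the sketch binds each argument to the whole chunk, including function words such as 在/p and 了/u, whereas the final situation description lists only the content words for z and P.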
We will prepare about 30,000 Chinese sentences for this task. Each sentence will be manually annotated with the following event description information:
- word segmentation and POS tags;
- the target verb (or verb phrase) in the sentence, its event type tag (whether or not it expresses one of the events the task focuses on), and, if it does, its situation description formula;
- chunks annotated with suitable syntactic constituent tags, functional tags and anchored situation argument tags.
All the sentences are extracted from articles of the Chinese People's Daily or from our Tsinghua Chinese Treebank (TCT). The training, development and test sets will be drawn from this data set. The following is an annotated example:
Sent. No.= 1930
Basic Annt.= 今天/n(Today) 我/r(I) 在/p(at) 书店/n(bookstore) 买/v(buy) 了/u(-ed) 三/m(three) 本/q 新/a(new) 书/n(book) 。/w
Target Verb= 买/buy
Verb Position= 4 // starting from 0
Event type= Yes // a possession-transfer event, one of the event types the task focuses on
Situation formula= DO(x,P(x,y))_CAUSE_(have(x,y)_&_NOT_have(z,y))+[P=Buy]
Syn-Sem Annt.= 今天/n(Today) [S-np 我/r(I) ]x [D-pp 在/p(at) 书店/n(bookstore) ]z [P-vp 买/v(buy) 了/u(-ed) ]Tgt [O-np 三/m(three) 本/q 新/a(new) 书/n(book) ]y 。/w
…… |
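As a rough illustration of how the annotated record shown in the data set example might be read, the Python sketch below parses its "Key= value // comment" lines and checks that the 0-based Verb Position really points at the target verb. The line format is inferred from the single example only, and the function name parse_record is hypothetical.

```python
from typing import Dict, Iterable


def parse_record(lines: Iterable[str]) -> Dict[str, str]:
    """Parse 'Key= value // comment' lines into a dictionary."""
    record = {}
    for line in lines:
        key, sep, value = line.partition("=")
        if not sep:
            continue  # skip lines that are not 'Key= value' pairs
        # Drop the trailing '// ...' comment, if any.
        record[key.strip()] = value.split("//", 1)[0].strip()
    return record


sample = [
    "Sent. No.= 1930",
    "Basic Annt.= 今天/n(Today) 我/r(I) 在/p(at) 书店/n(bookstore) "
    "买/v(buy) 了/u(-ed) 三/m(three) 本/q 新/a(new) 书/n(book) 。/w",
    "Target Verb= 买/buy",
    "Verb Position= 4 // starting from 0",
    "Event type= Yes // is a possession-transferring event",
    "Situation formula= DO(x,P(x,y))_CAUSE_(have(x,y)_&_NOT_have(z,y))+[P=Buy]",
]

record = parse_record(sample)
tokens = record["Basic Annt."].split()
verb_word = record["Target Verb"].split("/")[0]
position = int(record["Verb Position"])
# True when the token at the stated position is the target verb.
print(tokens[position].startswith(verb_word + "/"))
```

Splitting on the first "=" only keeps the "[P=Buy]" part of the situation formula intact.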
We design the following two types of measures to evaluate analysis performance:
- Micro-measures, including Micro-Precision, Micro-Recall and Micro-F-measure, which evaluate the analysis performance on each event token (key verb or verb phrase) in the test set.
  - Micro-P = Number of correctly analyzed tokens / Number of all recognized tokens * 100%
  - Micro-R = Number of correctly analyzed tokens / Number of gold-standard tokens * 100%
  - Micro-F = 2 * Micro-P * Micro-R / (Micro-P + Micro-R)
- Macro-measures, including Macro-Precision, Macro-Recall and Macro-F-measure, which evaluate the analysis performance over all events in the test set.
  - Macro-P/R/F = Micro-P/R/F * wi, where wi = frequency of the token in the test set / total token frequency in the test set
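The measures can be sketched in Python as follows. This is an illustration under stated assumptions, not the official scorer: the F-measure is taken to be the standard harmonic mean 2PR/(P+R), and the macro score is read as the sum, over distinct target verbs, of each verb's micro score multiplied by its relative frequency wi. The function names and the verb 卖 (sell) in the usage lines are hypothetical.

```python
from typing import Dict, Tuple


def micro_prf(num_correct: int, num_recognized: int, num_gold: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) as percentages."""
    p = 100.0 * num_correct / num_recognized if num_recognized else 0.0
    r = 100.0 * num_correct / num_gold if num_gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0  # harmonic mean of P and R
    return p, r, f


def macro_score(per_verb_score: Dict[str, float], frequency: Dict[str, int]) -> float:
    """Frequency-weighted sum of per-verb scores (assumed reading of wi)."""
    total = sum(frequency[verb] for verb in per_verb_score)
    return sum(score * frequency[verb] / total for verb, score in per_verb_score.items())


# 8 correct out of 10 recognized tokens, with 16 gold-standard tokens.
p, r, f = micro_prf(num_correct=8, num_recognized=10, num_gold=16)
print(round(p, 2), round(r, 2), round(f, 2))

# Two verbs with different test-set frequencies.
print(macro_score({"买": 80.0, "卖": 50.0}, {"买": 3, "卖": 1}))
```

Under this reading, frequent target verbs contribute more to the macro score than rare ones.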
For the target verb WSD subtask, the evaluation measures are computed over the key verbs or verb phrases. A correct result must satisfy the following condition:
- The selected event type and situation description formula of the token are the same as the gold-standard ones.
For the sentence SRL subtask, the evaluation measures are computed over the anchored argument chunks. A correct result must satisfy all of the following conditions:
- The recognized chunk has the same boundaries as the gold-standard argument chunk of the key verb or verb phrase.
- The recognized chunk has the same syntactic constituent and functional tags as the gold-standard one.
- The recognized chunk has the same situation argument tag as the gold-standard one.
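The three SRL conditions amount to an exact match on every field of a chunk. The Python sketch below illustrates this under assumptions that are not part of the task definition: chunks are represented by token offsets (start inclusive, end exclusive), and the names ArgChunk and count_correct_chunks are hypothetical.

```python
from typing import Iterable, NamedTuple, Set


class ArgChunk(NamedTuple):
    start: int         # index of the first token in the chunk
    end: int           # index one past the last token in the chunk
    function: str      # functional tag, e.g. "S"
    constituent: str   # syntactic constituent tag, e.g. "np"
    argument: str      # situation argument tag, e.g. "x"


def count_correct_chunks(predicted: Iterable[ArgChunk], gold: Set[ArgChunk]) -> int:
    """A chunk is correct only if boundaries and all tags equal a gold chunk."""
    return sum(1 for chunk in predicted if chunk in gold)


# Gold argument chunks for the example sentence (token indices start at 0).
gold = {
    ArgChunk(1, 2, "S", "np", "x"),
    ArgChunk(2, 4, "D", "pp", "z"),
    ArgChunk(6, 10, "O", "np", "y"),
}
predicted = [
    ArgChunk(1, 2, "S", "np", "x"),   # exact match
    ArgChunk(3, 4, "D", "pp", "z"),   # wrong left boundary
    ArgChunk(6, 10, "O", "np", "y"),  # exact match
]
print(count_correct_chunks(predicted, gold))
```

The resulting count would serve as the numerator of the precision and recall for this subtask.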
For the event detection subtask, the evaluation measures are computed over the complete event descriptions. A correct result must satisfy all of the following conditions:
- The event type and situation description formula of the target verb are the same as the gold-standard ones.
- All the argument chunks of the event description are the same as the gold-standard ones.
- The number of recognized chunks is the same as in the gold standard.
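For the event detection subtask, a complete event description counts as correct only when every condition holds at once. The Python sketch below is a minimal illustration; the Event structure, its field names and the tuple encoding of chunks (start, end, functional tag, constituent tag, argument tag) are assumptions made for the example.

```python
from typing import List, NamedTuple, Tuple

# (start, end, functional tag, constituent tag, argument tag)
ChunkKey = Tuple[int, int, str, str, str]


class Event(NamedTuple):
    event_type: str
    formula: str
    chunks: List[ChunkKey]


def event_is_correct(predicted: Event, gold: Event) -> bool:
    """Check event type, formula, chunk count and chunk identity together."""
    return (
        predicted.event_type == gold.event_type
        and predicted.formula == gold.formula
        and len(predicted.chunks) == len(gold.chunks)
        and set(predicted.chunks) == set(gold.chunks)
    )


formula = "DO(x,P(x,y))_CAUSE_(have(x,y)_&_NOT_have(z,y))+[P=Buy]"
gold = Event("Yes", formula, [(1, 2, "S", "np", "x"), (2, 4, "D", "pp", "z"), (6, 10, "O", "np", "y")])
full = Event("Yes", formula, [(6, 10, "O", "np", "y"), (1, 2, "S", "np", "x"), (2, 4, "D", "pp", "z")])
partial = Event("Yes", formula, [(1, 2, "S", "np", "x"), (6, 10, "O", "np", "y")])

print(event_is_correct(full, gold), event_is_correct(partial, gold))
```

Chunk order is ignored in this sketch, so a system is not penalized for listing the same chunks in a different order, while a missing or extra chunk makes the whole event incorrect.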
|
The data set is developed by the organizer. The copyright of all annotated tags in the data set is owned by the organizer.
Because the data is not a freely available resource, all participants in the task must sign a license agreement with the organizer guaranteeing that the data set will be used only for this task. |
2009.6. Distribute trial/sample data of about 3,000 annotated sentences, together with a working scorer.
2010.2. Provide full training and development set. |
[1] http://www.itri.brighton.ac.uk/events/senseval/ARCHIVE/index.html
[2] http://193.133.140.102/senseval2/
[3] http://www.senseval.org/senseval3/
[4] Carreras, X. and Màrquez, L. (2004). Introduction to the CoNLL-2004 Shared Task: Semantic Role Labeling. In Proc. of CoNLL-2004.
[5] Carreras, X. and Màrquez, L. (2005). Introduction to the CoNLL-2005 Shared Task: Semantic Role Labeling. In Proc. of CoNLL-2005.
[6] http://nlp.cs.swarthmore.edu/semeval/index.php
[7] Collin Baker, Michael Ellsworth and Katrin Erk (2007) Frame Semantic Structure Extraction. SemEval task #19. http://nlp.cs.swarthmore.edu/semeval/tasks/task19/summary.shtml.
[8] Peng Jin, Yunfang Wu, Shiwen Yu. (2007) Multilingual Chinese-English Lexical Sample Task. SemEval task #5. http://nlp.cs.swarthmore.edu/semeval/tasks/task05/summary.shtml
[9] Lluís Màrquez, Maria Antònia Martí, et al. (2007) Multilevel Semantic Annotation of Catalan and Spanish. SemEval task #09. http://nlp.cs.swarthmore.edu/semeval/tasks/task09/summary.shtml
[10] Dong Z. D., Dong Q. HowNet. http://www.keenage.com
[11] Mei Jiaju, et al. (1983) TongYiCi CiLin. Shanghai Dictionary Press, Shanghai, China.
[12] Xiandai Hanyu Cidian (Contemporary Chinese Dictionary). (1991). Business Press, Beijing.
[13] Nianwen Xue and Martha Palmer. (2003). Annotating Propositions in the Penn Chinese Treebank. In Proceedings of the 2nd SIGHAN Workshop on Chinese Language Processing, in conjunction with ACL'03. Sapporo, Japan.
[14] N.W. Xue. (2006) Annotating the predicate-argument structure of Chinese nominalizations. In Proc. of the 5th International Conference on Language Resources and Evaluation, pp. 1382-1387. Genoa, Italy. |
Qiang ZHOU
Centre for Speech and Language Technologies
Research Institute of Information Technology
Tsinghua University
zq-lxd@mail.tsinghua.edu.cn |
|