Propbank Instance Annotation Guidelines Using a Dedicated Editor, Jubilee



This paper gives guidelines on how to annotate Propbank instances using a dedicated editor, Jubilee. Propbank is a corpus in which the arguments of each verb predicate are annotated with their semantic roles in relation to the predicate. Propbank annotation also requires the choice of a sense ID for each predicate. Jubilee facilitates this annotation process by displaying several sources of syntactic and semantic information simultaneously: the syntactic structure of a sentence is displayed in the main frame, the available senses with their corresponding argument structures are displayed in another frame, all available Propbank arguments are displayed for the annotator's choice, and example annotations of each sense of the predicate are available for viewing. Easy access to each of these resources allows the annotator to quickly absorb and apply the syntactic and semantic information pertinent to each predicate, for consistent and efficient annotation. Jubilee has been successfully adapted to many Propbank projects at several universities. The tool is platform independent, is light enough to run as an X11 application, and supports multiple languages, including Arabic, Chinese, English, Hindi and Korean.


Jinho Choi, Claire Bonial, Martha Palmer
Institute of Cognitive Science, University of Colorado at Boulder

Propbank

• A corpus in which the arguments of each verb predicate are annotated with their semantic roles.
• Each predicate is also annotated with its sense ID.
• Annotations are done over syntactic trees.

Propbank Annotation Procedure

• Each task is claimed, double-annotated, and adjudicated.
• In the past, three different tools were used:
1. to claim tasks.
2. to annotate arguments.
3. to annotate verb senses.

[Figure: example annotation. Sense: open.01, with arguments ARG0 (agent), ARG1 (theme), ARG2 (instrument).]
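The open.01 example above can be pictured as a small record: one sense ID plus a set of labeled arguments. The following is only a minimal sketch. The class name `PropbankInstance`, its fields, and the sentence "John opened the door with a key" are illustrative assumptions, not Jubilee's actual data model, and real annotations point at syntactic tree nodes rather than at word strings.

```python
from dataclasses import dataclass, field


@dataclass
class PropbankInstance:
    """One annotated predicate: a sense ID (roleset) plus labeled arguments.

    Hypothetical structure for illustration; not Jubilee's internal format.
    """
    predicate: str
    roleset: str
    # Maps an argument label to the text it covers. In a real annotation the
    # value would identify a node of the syntactic tree instead.
    arguments: dict = field(default_factory=dict)


# Invented sentence: "John opened the door with a key."
instance = PropbankInstance(
    predicate="opened",
    roleset="open.01",
    arguments={
        "ARG0": "John",        # agent
        "ARG1": "the door",    # theme
        "ARG2": "with a key",  # instrument
    },
)


def describe(inst):
    """Render an instance as 'roleset: LABEL=span, ...'."""
    parts = [f"{label}={span!r}" for label, span in inst.arguments.items()]
    return f"{inst.roleset}: " + ", ".join(parts)


print(describe(instance))
```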

[Figure: annotation workflow. Tasks (Task 1, Task 2) are claimed; a claimed task is double-annotated (Annotation 2-1, Annotation 2-2) and then adjudicated (Adjudication 2).]

Jubilee

Jubilee main window

Frameset view

• Displays and allows annotators to choose the sense (roleset) of the predicate with respect to the current tree.

Lemma of the predicate for the selected roleset

List of roleset IDs for the predicate

View examples of the selected roleset

A definition and a generalized argument structure of the selected roleset

Argument view

• Contains buttons representing Propbank argument labels.

Claiming tasks

• Choose a Propbank project.
• Choose a task from either:
- New tasks: claimed by at most one annotator.
- My tasks: claimed by the current annotator.
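The two task lists can be sketched as simple filters over a record of who has claimed what. This is a hedged illustration only: the `claims` dictionary, the function names, and the annotator IDs are hypothetical, and the choice to hide a task from "New tasks" once the current annotator already holds it is an assumption rather than documented Jubilee behavior.

```python
def new_tasks(claims, annotator):
    """Tasks claimed by at most one annotator.

    Assumption: tasks the annotator already holds appear only under My tasks.
    """
    return sorted(
        task for task, who in claims.items()
        if len(who) <= 1 and annotator not in who
    )


def my_tasks(claims, annotator):
    """Tasks already claimed by the given annotator."""
    return sorted(task for task, who in claims.items() if annotator in who)


def claim(claims, task, annotator):
    """Claim a task, refusing if it is not in the annotator's New tasks list."""
    if task not in new_tasks(claims, annotator):
        raise ValueError(f"{task} is not available to {annotator}")
    claims[task].append(annotator)


# Hypothetical project state: task name -> annotators who claimed it.
claims = {
    "task1": [],
    "task2": ["ann_a"],
    "task3": ["ann_a", "ann_b"],
    "task4": ["ann_c"],
}

print(new_tasks(claims, "ann_a"))
print(my_tasks(claims, "ann_a"))
```

Capping each task at two claims is what yields the double annotation that adjudication later relies on.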

Treebank view in adjudication mode

• Displays and allows adjudicators to choose or edit from multiple annotations.

Adjudicator ID

Multiple annotations
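Choosing among multiple annotations amounts to finding where the annotators agree and where a decision is needed. The sketch below illustrates that idea under stated assumptions: each annotation is modeled as a plain dictionary of labels (including a `roleset` key), and `compare_annotations` is a hypothetical helper, not a Jubilee function.

```python
def compare_annotations(first, second):
    """Split labels into agreed values and conflicts needing adjudication.

    A label missing from one annotation is reported as None on that side.
    """
    agreed, conflicts = {}, {}
    for label in sorted(set(first) | set(second)):
        a, b = first.get(label), second.get(label)
        if a == b:
            agreed[label] = a
        else:
            conflicts[label] = (a, b)
    return agreed, conflicts


# Two invented annotations of the same predicate.
annotation_1 = {"roleset": "open.01", "ARG0": "John", "ARG1": "the door"}
annotation_2 = {
    "roleset": "open.01",
    "ARG0": "John",
    "ARG1": "the door",
    "ARG2": "with a key",
}

agreed, conflicts = compare_annotations(annotation_1, annotation_2)
print(agreed)
print(conflicts)
```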

Treebank view in annotation mode

• Displays syntactic trees in the selected task.

List of tree IDs

Annotator ID

Navigation buttons

Raw sentence of the tree

Annotation vs. adjudication mode

• In annotation mode, annotators are allowed to view and edit only tasks claimed by themselves or one other annotator.
• In adjudication mode, adjudicators are allowed to view and edit all tasks that have undergone at least single-annotation.
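One way to read the two access rules is as a single permission check. The sketch below is an interpretation, not Jubilee's code: the `Task` class, its `claimed_by` and `annotated_by` fields, and the mode strings are hypothetical, and treating a wholly unclaimed task as openable in annotation mode is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical task record used only for this illustration."""
    name: str
    claimed_by: list = field(default_factory=list)    # annotators holding the task
    annotated_by: list = field(default_factory=list)  # annotators who finished it


def can_open(task, user, mode):
    """Return True if the user may view and edit the task in the given mode."""
    if mode == "annotation":
        # Own tasks, or tasks held by at most one other annotator.
        return user in task.claimed_by or len(task.claimed_by) <= 1
    if mode == "adjudication":
        # Any task with at least one completed annotation.
        return len(task.annotated_by) >= 1
    raise ValueError(f"unknown mode: {mode}")


unclaimed = Task("t1")
single = Task("t2", claimed_by=["ann_a"], annotated_by=["ann_a"])
double = Task("t3", claimed_by=["ann_a", "ann_b"], annotated_by=["ann_a", "ann_b"])

print(can_open(double, "ann_c", "annotation"))
print(can_open(single, "adjudicator", "adjudication"))
```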

Advantages and Features

• Speed up: argument and sense annotations are simultaneous.

• Unified format: the use of one tool simplifies data maintenance.

• Syntax visualization: syntax is easily understandable to annotators.

• Semantic supply: frameset information is provided for annotators to consult.

• Multilingual: accommodates Arabic, Chinese, English, Hindi and Korean.

• Platform independent: runs on any platform with a JVM (Java 6.0).

• Runs on X11: annotators can make updates remotely.

Operators

• In the absence of Treebank co-indexing, annotators can provide semantic information about a null element by manually linking it to its overt referent using the ‘*’ operator.
• In cases where an argument is discontinuous, such that it cannot be captured in the annotation of one node, the ‘,’ operator is used.
• The ‘&’ operator is used to link the object trace after a passive verb to its referent in the subject position in reduced relative clauses.
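To make the operators concrete, the sketch below splits an argument made of several linked nodes into its node references and the operators joining them. It is an illustration under assumptions the poster does not state: node references are written as `terminal:height` strings, the example strings are invented, and `split_argument` is a hypothetical helper rather than part of Jubilee.

```python
import re

# The three operators described above, with a short gloss for each.
OPERATORS = {
    "*": "links a null element to its overt referent",
    ",": "joins the pieces of a discontinuous argument",
    "&": "links an object trace to its subject referent in a reduced relative clause",
}


def split_argument(location):
    """Return (nodes, operators) for a string such as '3:1*7:0'.

    The capturing group makes re.split keep the operators, so nodes and
    operators alternate in the result.
    """
    tokens = re.split(r"([*,&])", location)
    return tokens[0::2], tokens[1::2]


for example in ["4:1", "3:1*7:0", "2:0,5:1"]:
    nodes, ops = split_argument(example)
    print(example, nodes, [OPERATORS[op] for op in ops])
```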

How to obtain Jubilee

• Available as an open source project on Google Code (http://code.google.com/p/propbank).
• Contact: [email protected]

More about Jubilee

Acknowledgements

• Special thanks are due to Professor Nianwen Xue of Brandeis University for his very helpful insights, as well as Scott Cotton, the developer of RATS, and Tom Morton, the developer of WordFreak, both previously used for Propbank annotation.
• We also gratefully acknowledge the support of the National Science Foundation Grants CISE-CRI-0551615, Towards a Comprehensive Linguistic Annotation, and CISE-CRI 0709167, Collaborative: A Multi-Representational and Multi-Layered Treebank for Hindi/Urdu, and a grant from the Defense Advanced Research Projects Agency (DARPA/IPTO) under the GALE program, DARPA/CMO Contract No. HR0011-06-C-0022, subcontract from BBN, Inc. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.