... different from binary relations, in that multiple protein-taking events take more than one argument and event-taking events allow nested event structures. Thus, event extraction poses challenges beyond those of binary relation extraction, which has been extensively studied in the biomedical information extraction community.

Methods

Following Björne and colleagues [5], we viewed the event extraction task as the construction of directed graphs, in which event triggers and event-argument relations are encoded as labeled nodes and labeled edges, respectively. We constructed these directed graphs with the help of various resources, including syntactic analyses. In this section, we first describe the resources used in our event extraction system and then present the graph representations, statistical models, and learning algorithms, in this order.

Resources

As a case study, we addressed the event extraction task defined in the 2009 BioNLP shared task 1 [3], which was later renamed GENIA Event Task 1 and extended to cover full papers in the 2011 BioNLP shared task [7]. In this task, biological events refer to changes in the state of one or more biological macromolecules. The task is to extract structured information on events from sentences in the biological literature. This information consists of the event type and participants of each event, encoded with a controlled vocabulary of nine event type terms (e.g., Gene Expression) and two role type terms (i.e., THEME and CAUSE). The nine event types are divided into three groups according to their participants. The first group is plain protein-taking events, which must take a single protein as THEME (e.g., Gene Expression). The second group is multiple protein-taking events, which take one or more proteins as THEME (e.g., Binding events).
The third group is event-taking events, which must take a single protein or event as THEME and may take a single protein or event as CAUSE.

We used lexical and syntactic analyses to encode tokens and the relations between tokens into statistical models. For lexical analyses, we used the base forms and part-of-speech (POS) tags of the tokens included in the analyses by the Enju parser, which are available on the official website of the BioNLP shared tasks (http://weaver.nlplab.org/bionlp-st/BioNLP-ST/downloads/supportdownloads.html). For syntactic analyses, we used basic Stanford dependency analyses by the Enju parser with the GENIA model [8], together with those by the Charniak-Johnson parser [9] with a self-trained biomedical parsing model [10], since the Enju parser fails to generate analyses for a few sentences. These syntactic analyses are also available on the official website of the BioNLP shared tasks.

For protein mentions, we used the gold-standard annotations available on the official website of the BioNLP shared tasks, which were given to the participants in the shared tasks. The annotations contain multi-word protein mentions. Since most of them correspond to syntactic units (i.e., single words and phrases), we can easily combine the tokens in a multi-word protein mention into a single token and redirect their dependency relations.

Following Miwa and colleagues [1] and Kilicoglu and Bergler [11], we developed an event trigger lexicon for each event type in order to identify apparently incorrect candidates for event triggers, as follows. Constituent words within annotated event triggers in the training corpus are scanned one by one. Each scanned constituent wor.