To the most complex one, Model 3. Model 3 was ranked second in the GENIA Event subtask of the BioNLP 2011 shared task, and its variant was ranked first [12]. However, we developed our model from Model 1 for convenience of experiments, since Model 3 was reported to be much slower than Model 1 in both training and prediction.

Given an assignment L, our model M first checks whether L satisfies the following two conditions. One is that every identified anchor word (i.e., a constituent word of an event trigger) has at least one outgoing edge labeled with THEME. The other is that every edge labeled with a role type starts from an identified anchor word. If the assignment satisfies these conditions, our model assigns a score M_{i,e}(L_{i,e} | x) to each pair of an event type e and a word x_i (i.e., each vertex) and a score M_{i,j}(L_{i,j} | x) to each pair of words x_i and x_j (i.e., each edge), and takes the sum of these scores as the score M(L) of the assignment L:

M(L) = \sum_{(i,e)} M_{i,e}(L_{i,e} \mid x) + \sum_{(i,j)} M_{i,j}(L_{i,j} \mid x)  (2)

where L_{i,e} takes on a value of either `positive' or `negative', while L_{i,j} takes on a value of either `THEME', `CAUSE' or `negative'. The extraction of events can now be viewed as finding the assignment with the highest score. To find the optimal assignment for a given sentence, we use a modified version of the dynamic programming algorithm proposed by Riedel and McCallum [2]. One may suppose that valid assignments should satisfy further constraints, such as the constraint that every edge labeled with a role type goes to either an anchor word or a protein mention. However, such constraints make it hard to find the optimal assignment efficiently. For this reason, the system first finds the optimal assignment without these constraints, and if the resulting assignment does not contain any cycles, we refine it so that it satisfies all the constraints.
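The validity conditions and the summed score of Eq. (2) can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the data structures (score tables keyed by vertex and edge, label dictionaries) and all function names are assumptions.

```python
def is_valid(anchor_words, edge_labels):
    """Check the two validity conditions on an assignment.

    anchor_words: set of word indices identified as anchor words
    edge_labels:  dict mapping (i, j) -> 'THEME' | 'CAUSE' | 'negative'
    """
    # Condition 1: every anchor word has at least one outgoing THEME edge.
    for a in anchor_words:
        if not any(lbl == 'THEME'
                   for (i, j), lbl in edge_labels.items() if i == a):
            return False
    # Condition 2: every role-labeled edge starts from an anchor word.
    for (i, j), lbl in edge_labels.items():
        if lbl in ('THEME', 'CAUSE') and i not in anchor_words:
            return False
    return True


def score(vertex_scores, edge_scores, vertex_labels, edge_labels):
    """Sum of vertex and edge scores, as in Eq. (2)."""
    total = 0.0
    for (i, e), lbl in vertex_labels.items():
        total += vertex_scores[(i, e)][lbl]   # M_{i,e}(L_{i,e} | x)
    for (i, j), lbl in edge_labels.items():
        total += edge_scores[(i, j)][lbl]     # M_{i,j}(L_{i,j} | x)
    return total
```

Searching over all labelings with the highest `score` subject to `is_valid` is what the dynamic programming algorithm of Riedel and McCallum [2] makes tractable; this sketch only shows what is being optimized.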
For example, the label `negative' is reassigned to all incoming edges of any word other than the identified anchor words and protein mentions. When the resulting assignment has a cycle, the system does not generate any events for the input sentence.

We score pairs of a word x_i and an event type e using a weight vector w_e as follows:

M_{i,e}(positive \mid x) = w_e \cdot \phi(x_i),  (3a)
M_{i,e}(negative \mid x) = -w_e \cdot \phi(x_i),  (3b)

where \phi(x_i) is the feature vector of the word x_i. The features include center-marked trigrams of neighboring words in lemma:POS form (e.g., "decrease:VBN") and a special symbol `PROTEIN' for protein mentions. For example, the center-marked trigram "either:CC [decrease:VBN] or:CC" is used as a feature for the word `decreased' in sentence (1). The features also include the distance from the word to proteins (e.g., "ProteinDistance:5" for the word `decreased' and the protein `VDR' in sentence (1)) and the distance from the word to potential anchor words within the same sentence (e.g., "TriggerDistance:2" for the word `decreased' and the word `increased' in sentence (1)). The distances to protein mentions are encoded as binary features (i.e., taking either 0 or 1), whereas the features for the distances to potential anchor words take on the maximal reliability score of the corresponding entries in the lexicons. As syntactic contextual features, we encode the syntactic governors and modifiers of each word (e.g., "number:NNS-MOD(amod)-decrease:VBN" and "numbers:NNS-GOV(dobj)-express:VBP" for the word `numbers' in sentence (1)). Note that these contextual features are intended to exploit words other than anchor words in detecting the event triggers that include them. We also score.
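A small sketch of the vertex scoring in Eqs. (3a)-(3b) with sparse features: the feature names mimic those described in the text (center-marked trigrams, "ProteinDistance:N"), but the helper `extract_features` and the exact feature subset are illustrative assumptions, not the authors' feature set.

```python
def extract_features(words, i, protein_positions):
    """Sparse feature dict phi(x_i) for the word at index i.

    words: list of lemma:POS tokens; protein_positions: indices of
    protein mentions. Only a small illustrative subset of features.
    """
    feats = {}
    # Center-marked trigram of lemma:POS tokens around the word.
    left = words[i - 1] if i > 0 else 'BOS'
    right = words[i + 1] if i + 1 < len(words) else 'EOS'
    feats[f'{left} [{words[i]}] {right}'] = 1.0
    # Binary distance features to protein mentions.
    for p in protein_positions:
        feats[f'ProteinDistance:{abs(p - i)}'] = 1.0
    return feats


def vertex_score(w_e, feats, label):
    """w_e . phi(x_i) for 'positive', its negation for 'negative' (Eqs. 3a-3b)."""
    dot = sum(w_e.get(f, 0.0) * v for f, v in feats.items())
    return dot if label == 'positive' else -dot
```

Because the two scores are exact negations, the sign of the dot product alone decides which label is preferred for each (word, event type) pair.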