Link: www.coli.uni-saarland.de/projects/smile/data/data.tar

The data consists of development (dev) data and test data. Each of these
directories (dev/test) contains one sub-directory per script scenario; for
example, the scenario "preparing coffee" is in the sub-directory test/coffee.
The data comes from Regneri et al. [1] and the OMICS corpus [2]. For details
of the preprocessing steps used to extract the data, please refer to [1], [2],
[3] and [4] (see REFERENCES below).

Each scenario sub-directory contains 4 files in XML format:

event-ordering.xml
event-paraphrasing.xml
orig-data.xml
segmented.xml

**********************************************
event-ordering.xml :

This file contains the gold event pairs for the event ordering task. Each
event is split into a predicate and its arguments.

General format for each event pair is :

Label can be of two types : FOLLOWUP (first event follows the second) or
NO_FOLLOWUP (first event does not follow the second event).

**********************************************
event-paraphrasing.xml :

This file contains the gold event pairs for the event paraphrasing task. Each
event is split into a predicate and its arguments.

General format for each event pair is :

Label can be of two types : PARAPHRASE (the two events are paraphrases) or
NO_PARAPHRASE (the two events are not paraphrases).

**********************************************
orig-data.xml :

This file contains the original scripts for the scenario. In other words, it
is the collection of ESDs (event sequence descriptions) for that scenario.

General format is :

**********************************************
segmented.xml :

This file contains the training data for the scenario. It consists of the
same kind of event sequences, but each event is split into a predicate and
its arguments.

General format is :

**********************************************
REFERENCES:

[1] Michaela Regneri, Alexander Koller, and Manfred Pinkal. 2010. Learning
    script knowledge with web experiments. In Proceedings of ACL.
[2] Rakesh Gupta and Mykel J. Kochenderfer. 2004. Common sense data
    acquisition for indoor mobile robots. In Proceedings of AAAI.
[3] Lea Frermann, Ivan Titov, and Manfred Pinkal. 2014. A hierarchical
    Bayesian model for unsupervised induction of script knowledge. In
    Proceedings of EACL, Gothenburg, Sweden.
[4] Ashutosh Modi and Ivan Titov. 2014. Inducing neural models of script
    knowledge. In Proceedings of CoNLL.
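
**********************************************
EXAMPLE (illustrative sketch, not part of the original release):

The directory layout described above (dev/test, one sub-directory per
scenario, four XML files per scenario) could be inspected with a short Python
script along the following lines. The function names, the script name and the
path to the extracted archive are hypothetical. Because the element and
attribute names of the XML files are not listed here, the sketch assumes
nothing about the schema: it only parses each file and counts the element
tags it actually finds.

```python
import sys
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path

# The four files each scenario sub-directory is described as containing.
EXPECTED_FILES = (
    "event-ordering.xml",
    "event-paraphrasing.xml",
    "orig-data.xml",
    "segmented.xml",
)


def summarize_scenario(scenario_dir):
    """Map each expected file name to a Counter of its element tags.

    A missing file maps to None. No tag names are assumed; the counts
    reflect whatever elements the file contains.
    """
    summary = {}
    for name in EXPECTED_FILES:
        path = Path(scenario_dir) / name
        if not path.is_file():
            summary[name] = None
            continue
        root = ET.parse(str(path)).getroot()
        summary[name] = Counter(element.tag for element in root.iter())
    return summary


def summarize_split(data_root, split):
    """Summarize every scenario sub-directory of one split ("dev" or "test")."""
    split_dir = Path(data_root) / split
    return {
        scenario.name: summarize_scenario(scenario)
        for scenario in sorted(split_dir.iterdir())
        if scenario.is_dir()
    }


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python summarize_data.py <extracted-data-dir> test
    for scenario, files in summarize_split(sys.argv[1], sys.argv[2]).items():
        print(scenario)
        for file_name, tags in files.items():
            status = "missing" if tags is None else dict(tags)
            print("  {}: {}".format(file_name, status))
```

Running the script on one split prints, per scenario, which of the four files
are present and how often each element tag occurs in them. This is a quick
way to recover the actual file structure before writing a task-specific
reader.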