Collocation resources

The currently available collocation dataset is a list of about 10,000 English collocations, collected and tagged with Lexical Functions (LFs) by I. Mel’čuk. To facilitate the use of this dataset in downstream NLP applications, we disambiguated the collocation bases (or “keywords” in LF terminology) with respect to BabelNet synsets.

Please consult the Readme for a precise description of the dataset.

A subset of this dataset was used in the experiments described in L. Espinosa-Anke, L. Wanner, and S. Schockaert, “Collocation Classification with Unsupervised Relation Vectors”, ACL 2019, Short paper track, Florence, Italy.