We propose a strategy to reduce the impact of the sparse data problem in lexical information acquisition tasks that rely on the observation of linguistic cues. We argue that the uncertainty created by missing values, i.e. non-observed cues, can be handled by estimating the likelihood that each such cue is observable. Because words follow a Zipfian distribution, we do not estimate this likelihood directly from the data; instead, we exploit the correlations among cues that arise because a lexical class is defined by the observation of several different cues. Our experimental results show a clear benefit of the proposed approach.
Bel Rafecas, Núria. Handling of missing values in lexical acquisition. In: Proceedings of LREC 2010. 1 ed. Valletta: ELRA; 2010. p. 2728-2735.