Necsulescu, Silvia (2016)

Automatic acquisition of lexical-semantic relations: gathering information in a dense representation

Dir.: Núria Bel

Lexical-semantic relationships between words are key information for many NLP tasks, which require this knowledge in the form of lexical resources. This thesis addresses the acquisition of lexical-semantic relation instances. State-of-the-art systems detect the relation between two words using word pair representations based on patterns of the contexts in which the two related words co-occur. This approach is hindered by data sparsity: even when mining very large corpora, not every semantically related word pair co-occurs, or co-occurs frequently enough. In this work, we investigate novel representations to predict whether two words hold a lexical-semantic relation. Our intuition was that these representations should combine information about word co-occurrences with information about the meaning of the words involved in the relation. These two sources of information have to be the basis of a generalization strategy that can provide information even for words that do not co-occur.

http://www.tdx.cat/handle/10803/374234

Zaytseva, Victoria (2016)

Vocabulary acquisition in study abroad and formal instruction: an investigation on oral and written lexical development

Dirs.: Carmen Pérez & Immaculada Miralpeix

The present study investigates the impact of two different consecutive learning contexts, formal instruction (FI) at home and a 3-month stay abroad (SA), on second language (L2) vocabulary acquisition in oral and written production. Data were obtained from a group of 30 Catalan/Spanish advanced learners of English before and after each learning period by means of an oral interview and a written composition. These samples were analyzed in terms of quantitative lexical proficiency measures in the domains of fluency, density, diversity, sophistication and accuracy, and through qualitative native-like selections. Baseline data from 29 native speakers of English, elicited through the same tasks, were also used for comparison purposes. Results reveal that SA is particularly beneficial for written productive vocabulary, and less so for oral productive vocabulary, and that progress occurs especially in lexical fluency and diversity. FI, in contrast, shows a modest effect on the improvement of oral productive vocabulary and mainly affects lexical sophistication. Furthermore, the initial level of vocabulary knowledge is found to be a significant predictor of gains.

http://www.tdx.cat/handle/10803/387120

Arias, Blanca (2017)

Television dialogue and subtitling: a corpus-driven study of police procedurals

Dirs.: Sergi Torner & Jenny Brumme

The specialized literature has suggested that television dialogue and subtitling are intermediate genres on the orality-writing continuum (e.g. Díaz-Cintas 2003; Quaglio 2009; Forchini 2012). This thesis adopts a corpus-driven methodology to address this question from a descriptive and contrastive point of view, based on the analysis of the Corpus of Police Procedurals (CoPP), a corpus compiled for the purposes of this research that contains the aligned dialogue (EN) and DVD subtitles (ES) of fifteen episodes from three contemporary police procedural series: Dexter (Showtime, 2006), The Mentalist (Warner Bros, 2008) and Castle (ABC, 2009). A selection of syntactic and lexical features prototypically attributed to the two poles of the continuum has been examined both quantitatively and qualitatively. The statistical basis of the quantitative analyses reveals patterns of behaviour (norms) among the creators of fictional dialogue and among their translators. The qualitative analysis of the lexicon adapts the lexicographic methodology of Corpus Pattern Analysis (CPA) proposed by Hanks (esp. 2004, 2013a) to the study of lexical exploitation (creativity) in this type of text.

http://www.tdx.cat/handle/10803/404733

Vigo, Eugenio (2016)

Copular inversion and non-subject agreement

Dir.: Àlex Alsina

In this thesis I propose an explanation for the facts of copular inversion in Spanish, Catalan, and other Romance languages, as well as in German. Copular inversion is a phenomenon found in some languages in which, at least superficially, the copula may agree with the postverbal DP instead of the preverbal DP. At first sight it appears that the agreeing postverbal DP is the subject of the sentence, but in this work I provide evidence that this is not the case: the agreeing postverbal DP is, in fact, the complement of the copula. This yields a singular case of non-subject agreement in Spanish, the other Romance languages and the rest of the copular inversion languages that is not found elsewhere in the grammar of these very same languages (e.g. they never show object agreement in transitive sentences). This requires an explanation that is integrated with the rest of the grammars of these languages. I claim that coreference is the driving force behind the presence of copular inversion: in copular inversion languages, all verbs actually seek agreement with the subject and with all those grammatical functions that are coreferential with it. In intransitive and transitive sentences, the only possible candidate is the subject, but in copular sentences the complement is usually coreferential with the subject. The choice of the agreeing function among the possible candidates is decided with respect to a Person-Number Hierarchy: the copula will always agree with the function that has the most marked person and number agreement features with respect to that hierarchy. This requires challenging the standard view in LFG by which the lexical entries of verbs determine the person and number features of the subject: the solution requires accepting that the person and number features of the verb must be represented in a function-independent “bundle” that is unified with the right grammatical function according to syntactic well-formedness constraints in an OT setting.
In addition to explaining the facts of copular inversion languages, the proposed OT-LFG hypothesis predicts why other languages do not have copular inversion. Moreover, the proposed hypothesis can easily be extended to other phenomena of non-subject agreement, e.g. Catalan cleft sentences, Icelandic non-subject agreement in “quirky case” constructions, English locative inversion and agreement phenomena in the Dargwa family of languages.

http://www.tdx.cat/handle/10803/397778