nuria.bel youknow upf.edu

Institut Universitari de Lingüística Aplicada

Universitat Pompeu Fabra

+34 935422307 & +34 935422322

Bio

Tenured Associate Professor (Professora Agregada) at the Departament de Traducció i Ciències del Llenguatge of the Universitat Pompeu Fabra (UPF) and senior researcher at the Institut Universitari de Lingüística Aplicada (IULA), also at the UPF. My area of research is Natural Language Processing (NLP), and my current interests are mainly related to the automatic acquisition and exploitation of Language Resources. I currently lead a research group that works on projects mostly related to Technologies of Language Resources.


I started working in NLP in 1986, when I received a grant to collaborate with the group that was developing the Spanish modules for the METAL machine translation system at Siemens-Barcelona. In 1987, I moved to the EU project EUROTRA, also in Machine Translation, hosted by the Universitat de Barcelona (UB). In 1993, I took over the technical direction of a small group of researchers at the UB and continued working in NLP on different research projects under the name Grup d'Investigació en Lingüística Computacional - Universitat de Barcelona. Projects from that ten-year period are listed here: gilcUB. These projects were mainly related to the development of Language Resources and NLP applications (MULTEXT, PAROLE, SIMPLE, TRADE, PEKING). In 2003 I was awarded a five-year grant by the Ramón y Cajal Program of the Spanish Ministerio de Educación y Ciencia, attached to the IULA at the Universitat Pompeu Fabra. My research in that period was mostly devoted to developing systems for the Automatic Acquisition of Lexical Information, in order to build lexica for deep linguistic parsing and Machine Translation.

 

Complete CV  

 

 

Current projects:

PANACEA: Platform for the Automatic, Normalized Acquisition of Language Resources for Human Language Technologies (2010-2012), funded by the Language Technologies Area, Information and Communication Technologies, of the 7th Framework Programme (7FP-ITC-248064). PANACEA will develop technologies to automate all stages involved in the acquisition, production, updating, validation and maintenance of Linguistic Technologies and Resources. The project, coordinated by our group, includes Cambridge University; the Istituto di Linguistica Computazionale, Italy; the Institute for Language and Speech Processing, Greece; Dublin City University, Ireland; and two companies, the German Linguatec and the French ELDA, Evaluation and Language Resources Distribution Agency (http://www.panacea-lr.eu).

CLARA: Initial Training Network for Common Language Resources and their Applications. CLARA started its activities on 1 December within the Marie Curie programme of the EU 7th Framework Programme (Marie Curie Initial Training Network 7FP-ITN-238405). The objective of the CLARA network is to train a new generation of experts in linguistics who can develop research methods for the construction, use and application of language resources. Its scientific objectives are to go deeper into the creation of linguistic models based on real data analysed with statistical and machine-learning tools, and into the hybridisation of techniques and methods of analysis. CLARA will fund a total of 17 training grants in different areas related to the creation, use and application of language resources. The calls will be made public on the project web page, http://clara.uibo.no, and in Euraxess, http://ec.europa.eu/euraxess.

Flarenet: Fostering Language Resources Network, funded by the e-contentplus program of the European Union. Flarenet is a networking organization that aims to devise and promote consensual recommendations concerning the future development, deployment and use of Language Resources (LRs). Flarenet will indicate best practices and best policies for coordinating future actions and projects. The major activities of the Network will be to survey, analyse and classify LRs and relevant standards, together with their organisational and economic models, and to discuss with major stakeholders and players new common strategies for the widespread deployment and use of LRs in real-world products (http://www.flarenet.eu).

Clarin: Common Language Resources and Technologies Infrastructure, co-funded in Spain by the 7FP of the EU (FP7-INFRASTRUCTURES-2007-1-212230), the Spanish Ministerio de Educación y Ciencia (CAC-2007-23) and the Ministerio de Ciencia e Innovación (ICTS-2008-11 and ACI2009-0995). CLARIN is committed to establishing an integrated and interoperable research infrastructure of language resources and technology. It aims to overcome the current fragmentation by offering a stable, persistent, accessible and extendable infrastructure, thereby enabling eHumanities. See clarin-es.iula.upf.edu and www.clarin.eu.

Clarin-CAT, funded by the Departament d'Innovació, Universitats i Empresa of the Generalitat de Catalunya. This project is committed to integrating the Catalan language into CLARIN by developing a demonstrator, which will integrate resources in Catalan as well as exploitation tools into the European infrastructure CLARIN.

Past research projects (2004 - 2008):

Adquisición automática de información léxica (AAILE and AAILE2), funded by the Ministry of Education and Culture (HUM2004-05111-C02-01/FILO and HUM2007-61067/FILO). The goal of our research is to study the feasibility of automatically acquiring, from corpora, the information contained in computational lexicons. The methodology uses syntactic restrictions to bias the data and checks the lexical representation against experimental observations. Ultimately, our interest in this area is to understand the role of the syntactic and semantic constraints that operate in texts, and the feasibility of acquiring related information. Finding out how Machine Learning methods can capture these constraints will allow us to improve both the applications that aim at the automatic acquisition of lexical information and the representation of the lexicon itself. AAILE web page

Linguistic Infrastructure for Interoperable Resources and Systems (LIRICS), funded by the e-content program of the European Union (EDC-22236). Duration of the project: 2004-2006. The key objective of LIRICS is to provide the European content and language industries with a common and stable set of formats, in the form of ISO standards, enabling interoperability and reuse of multilingual language resources, digital content and language engineering software.

Traducció automàtica de codi obert per al català (TACOC), funded by the Catalan government. Its goal is the development of machine translation modules for Catalan-French, Catalan-Aranese and Catalan-English for the open-source shallow-transfer machine translation platform Apertium, developed by the Transducens group, University of Alicante. [finished 02-07] DEMO at http://xixona.dlsi.ua.es/apertium/

Traducció automàtica de codi obert per a l'esperanto, funded by ABC ENCIKLOPEDIOJ S.L. with the support of Xavier Batlle, Associació Catalana d'Esperanto, Miguel Gutiérrez Adúriz and Prompsit (Transducens group, Universitat d'Alacant). It developed machine translation modules for Catalan > Esperanto and Spanish > Esperanto. DEMO at http://xixona.dlsi.ua.es/apertium-unstable/

Current research topics:

 

Automatic acquisition of Language Resources: AAILE web page www.flarenet.eu

Technologies of language resources

Machine Translation

Modelling of linguistic information contained in lexica for deep linguistic analysis

Grammars for deep linguistic analysis

Teaching 2009-2010

Degree/Master in Linguistics and Technological Applications: 13305 - Fonaments de Processament del Llenguatge Natural

Degree in Translation:

 

  

 

 

Publications

2009

Resnik, Gabriela; Bel, Núria (2009). "Automatic Detection of Non-deverbal Event Nouns in Spanish" in Proceedings of the 5th International Conference on Generative Approaches to the Lexicon. Pisa: Istituto di Linguistica Computazionale (pdf).

Villegas, Marta; Bel, Núria; Bel, Santiago; Alemany, Francesca; Martínez, Hèctor (2009). "Lexicography in the grid environment" in Proceedings of e-lex 2009. Louvain: Cahiers du Cental (in press).

Bel, Núria; Calzolari, Nicoletta (2009). "FLaReNet: una red para fomentar los recursos lingüísticos" in Procesamiento del Lenguaje Natural 43, pp. 383-384. (ISSN 1135-5948) (http://www.sepln.org/revistaSEPLN/revista/43/articulos/art51.pdf)

2008

Bel, Núria (2008). "Review of Dybkjær, Laila; Hemsen, Holmer; Minker, Wolfgang, eds. Evaluation of Text and Speech Systems, Springer, 2007". Machine Translation 21(1). The Netherlands: Springer Netherlands, pp. 73-76 (ISSN 0922-6567).

Francopoulo, Gil; Bel, Núria; George, Monte; Calzolari, Nicoletta; Monachini, Monica; Pet, Mandy; Soria, Claudia (2008). Multilingual Resources for NLP in the Lexical Markup Framework (LMF). In Witt, Andreas; Sérasset, Gilles (eds.). Multilingual Language Resources and Interoperability. Journal of Language Resources and Evaluation (ISSN 1572-8412).

Bel, Núria; Espeja, Sergio; Marimon, Montserrat (2008). "The Structure of the Lexicon in the Task of Automatic Lexical Acquisition" in Bernal, E.; DeCesaris, J. (eds.) Proceedings of the XIII EURALEX International Congress (Barcelona, 15-19 July 2008). Barcelona: Institut Universitari de Lingüística Aplicada, Universitat Pompeu Fabra; DOCUMENTA UNIVERSITARIA, pp. 285-290. ISBN 978-84-96742-67-3

Bel, Núria; Bel, Santiago (2008). Measuring standards in Lexical Resources. Witt, Andreas et al. (eds.) Proceedings of the LREC 2008 Workshop: Uses and Usage of Language Resource-related Standards. 6th International Conference on Language Resources and Evaluation. Paris: European Language Resources Association (ISBN 2-9517408-4-0) (pdf)

Bel, Núria; Espeja, Sergio; Marimon, Montserrat (2008): COLDIC, a Lexicographic Platform for LMF compliant lexica. Proceedings of the 6th International Conference on Language Resources and Evaluation. Paris: European Language Resources Association (ISBN 2-9517408-4-0) (pdf)

Bel, Núria; Espeja, Sergio; Marimon, Montserrat (2008): Automatic acquisition for low frequency lexical items. Proceedings of the 6th International Conference on Language Resources and Evaluation. Paris: European Language Resources Association (ISBN 2-9517408-4-0) (pdf)

Bel, Nuria; Bel, Santiago; Espeja, Sergio; Marimon, Montserrat; Villegas, Marta. (2008). "El Proyecto CLARIN: Una infraestructura de investigación científica para las Humanidades y las Ciencias Sociales". Digithum 10. Barcelona: Universitat Oberta de Catalunya, pp. 1-9 (ISSN 1134-7724) (http://digithum.uoc.edu/esp/index.html)

Bel, Núria; Marimon, Montserrat (2008). "CLARIN, Common Language Resources and Technology Infrastructure" in Procesamiento del Lenguaje Natural 41. Alicante: Sociedad Española para el Procesamiento del Lenguaje Natural, Univ. de Alicante, Dep. Lenguajes y Sistemas Informáticos, pp. 309-311. (ISSN 1135-5948) (http://www.sepln.org/revistaSEPLN/revista/41/proy1.pdf)

2007

Marimon, Montserrat; Núria Bel and Natalia Seghezzi (2007). Test Suite Construction for a Spanish Grammar. In T. Holloway King and E. M. Bender (eds.) Proceedings of the Workshop on Grammar Engineering across Frameworks. CSLI's series "Studies in Computational Linguistics ONLINE", pp. 250-264. ISSN 1557-5772. (http://csli-publications.stanford.edu/GEAF/2007/geaf07-toc.html)

Marimon, Montserrat; Natalia Seghezzi and Núria Bel. An Open-source Lexicon for Spanish. Procesamiento del Lenguaje Natural, n. 39, pp. 131-137. September, 2007. ISSN 1135-5948 (pdf)

Marimon, Montserrat; Bel, Núria; Espeja, Sergio; and Seghezzi, Natalia (2007). "The Spanish Resource Grammar: Pre-processing Strategy and Lexical Acquisition" in Baldwin, Timothy et al. (eds.) Proceedings of the ACL2007 Workshop on Deep Linguistic Processing. Stroudsburg, PA 18360: Association for Computational Linguistics, pp. 105-111. 2007. ISBN 978-1-932432-88-6 (pdf)

Bel, Núria; Espeja, Sergio; Marimon, Montserrat. "Automatic Acquisition of Grammatical Types for Nouns" in Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers. Rochester, New York: Association for Computational Linguistics. 2007, pp. 5-8. ISBN 1-932432-94-9 (pdf).

2006

Bel, Núria; Espeja, Sergio; Marimon, Montserrat. "New tools for the encoding of lexical data extracted from corpus". Proceedings of the 5th International Conference on Language Resources and Evaluation. Paris: European Language Resources Association, pp. 1362-1367. Genoa, Italy. 2006

Francopoulo, G.; George, M.; Calzolari, N.; Monachini, M.; Bel, N.; Pet, M.; Soria, C. "Lexical Markup Framework, LMF". Proceedings of the 5th International Conference on Language Resources and Evaluation. Paris: European Language Resources Association, pp. 233-236. Genoa, Italy. 2006

Francopoulo, Gil; Bel, Núria; George, Monte; Calzolari, Nicoletta; Monachini, Monica. "Lexical Markup Framework (LMF) for NLP Multilingual Resources". Proceedings of the Workshop on Multilingual Language Resources and Interoperability. Sydney, Australia: Association for Computational Linguistics, pp. 1-8. 2006. http://www.aclweb.org/anthology/W/W06/W06-1001

2004

Montserrat Marimon, Núria Bel. Lexical Entry Templates for Robust Deep Parsing. LREC 2004, Fourth International Conference on Language Resources and Evaluation. LREC 2004 Proceedings, ISBN 2-9517408-1-6, pp. 2209-2212. Lisbon, Portugal, 2004

Núria Bel, Cornelis H.A. Koster, Marta Villegas. Cost-effective cross-lingual document classification. LREC 2004, Fourth International Conference on Language Resources and Evaluation. LREC 2004 Proceedings, ISBN 2-9517408-1-6, pp. 1915-1918. Lisbon, Portugal, 2004

Núria Bel. Corpus representativeness for syntactic information acquisition. 42nd Annual Meeting of the Association for Computational Linguistics. ACL Companion Volume to the Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, ISBN 1-932432-33-7, pp. 138-141. Barcelona, Spain, 2004.

2003

Núria Bel, Cornelis H.A. Koster, Marta Villegas. Cross-Lingual Text Categorization. 7th European Conference on Research and Advanced Technology for Digital Libraries (ECDL 2003). Traugott Koch, Ingeborg Sølvberg (Eds.): Research and Advanced Technology for Digital Libraries, 7th European Conference, ECDL 2003, Trondheim, Norway, August 17-22, 2003, Proceedings. Lecture Notes in Computer Science 2769, Springer 2003, ISBN 3-540-40726-X, pp. 126-139.

Montserrat Marimon, Núria Bel. A Hybrid NLP System for Natural Language Interfaces. International Conference Recent Advances in Natural Language Processing. G. Angelova, K. Bontcheva, R. Mitkov, N. Nicolov, N. Nicolov (Eds.): RANLP Proceedings, ISBN 954-90906-6-3, pp. 250-254. Borovets, Bulgaria. 2003

 

 

2002

Atkins, S., Bel, N., Bertagna, F., Bouillon, P., Calzolari, N., Fellbaum, C., Grishman, R., Lenci, A., MacLeod, C., Palmer, M., Thurmair, G., Villegas, M. & Zampolli, A. (2002). 'From Resources to Applications. Designing the Multilingual ISLE Lexical Entry' . In Proceedings Third International Conference on Language Resources and Evaluation. Las Palmas de Gran Canaria, 2002.

Bel, N., Caminero, J., Hernández, L., Marimon, M., Morlesín, J.M., Otero, J.M., Relaño, J., Rodríguez, M.C., Ruz, P.M. & Tapias, D. (2002). 'Design and Evaluation of a SLDS for E-Mail Access through the Telephone'. In Proceedings Third International Conference on Language Resources and Evaluation. Las Palmas de Gran Canaria, 2002.

Villegas, M. & Bel, N. (2002). 'From DTD to relational dB. An automatic generation of a lexicographical station out of guidelines'. In Proceedings of the Third International Conference on Language Resources and Evaluation. Las Palmas de Gran Canaria, 2002.

2001

Calzolari, N., Lenci, A., Zampolli, A., Bel, N., Villegas, M., & Thurmair, G. (2001). 'The ISLE in the Ocean. Transatlantic Standards for Multilingual Lexicons (with an Eye to Machine Translation)'. In Proceedings of the MT Summit VIII. Santiago de Compostela, 2001.

Marimon, M., Porta, J., Bel, N. (2001). 'On Distributing the Analysis Process of a Broad-Coverage Unification-Based Grammar'. In the 3rd EuroConference Recent Advances in NLP (RANLP 2001). Tzigov Chark, Bulgaria.

2000

Lenci, A., Bel, N., Busa, F., Calzolari, N., Gola, E., Monachini, M., Ogonowski, A., Peters, I., Peters, W., Ruimy, N., Villegas, M., Zampolli, A. (2000). 'SIMPLE: A General Framework for the Development of Multilingual Lexicons'. In International Journal of Lexicography, Vol 13.

Villegas, M., Bel, N. et al. (2000). 'Multilingual Linguistics Resources: From Monolingual Lexicons to Bilingual Interrelated Lexicons'. In Proceedings of the Second International Conference on Language Resources and Evaluation. Athens, 2000.

 

1987-1999

 

Maegaard, B., N. Bel, B. Dorr, E. Hovy, K. Knight, H. Iida, Ch. Boitet, B. Maegaard, Y. Wilks (1999). 'Machine Translation'. In Hovy, E. & N. Ide (eds.): Multilingual Information Management: Current Levels and Future Abilities, Linguistica Computazionale, vol. XIV-XV, 81-103, Pisa, 1999. http://www.cs.cmu.edu/~ref/mlim/index.html

Villegas, M., Brosa, I. & Bel, N. (1998). 'El léxico PAROLE del español'. Actas del XIV Congreso de la SEPLN, September 1998.

Bel, N.; Marimon, M.; Porta, J.: 1996, 'Etiquetado morfosintáctico de corpus en el proyecto MULTEXT'. Actas del XXVI Simposio de la Sociedad Española de Lingüística, Madrid.

Bel, N. & Villegas, M. (1996). 'Clíticos, un Análisis en HPSG'. In Actas del II Congreso Nacional de Lingüística General. Granada 1996

Bel, N. & Villegas, M. (1996). 'Bounded Dependencies in Spanish'. In II International Conference on Mathematical Linguistics (ICML'96). Tarragona 1996.

Bel, N. & Melero, M. (1995). "TRADE (MLAP93/003)". Boletín de la Sociedad Española para el Procesamiento del Lenguaje Natural (no. 16). Donostia, April 1995.

Bel, N. & Villegas, M. (1994). 'El Fenómeno del Control y la Complementación Verbal en HPSG para el Español'. In Actas del I Congreso de Lingüística General. Valencia 1994.

BEL, N. (1993): El procesamiento de la información semántica en las oraciones con "se" en castellano, doctoral thesis, Universidad de Barcelona.

BADIA, T., BEL, N. & VIDAL, J. (1991): Nuevas perspectivas en el proyecto EUROTRA, Boletín de la Sociedad Española para el Procesamiento del Lenguaje Natural, 10, 7-21.

BEL, N. (1991): El sistema de traducción automática EUROTRA, in Actas del Simposio de la Lengua Española. Lengua y Tecnología, Pabellón de España, Madrid.

BALARI, S. & BEL, N. (1990): Se impersonal: por qué no es un clítico de sujeto, Actas del XX Congreso de la Sociedad Española de Lingüística, Ed. Gredos, Madrid.

BALARI, S., BEL, N. & GILBOY, E. (1989): Unbounded dependencies in machine translation, in BUSEMANN, S., HAUENSCHILD, C. & UMBACH, C. (eds.) Views of the Syntax/Semantics Interface, KIT Report 74, TU, Berlin.

BEL, N. (1989a): Desambiguación monolingüe y traducción automática, Boletín de la Sociedad Española para el Procesamiento del Lenguaje Natural, 7, 175-180.

BEL, N. (1989b): Pronominal constructions and argument structure, in The Eurotra Reference Manual 6.0, Commission of the European Communities B. P., Luxembourg, pp. 20-39.