Scientific Text Mining and Summarization Services

The Scientific Text Mining Platform (ScienTMin) is a set of integrated tools that offers services for the analysis, annotation, extraction, and summarization of information from scientific texts.

Researchers, publishers, funding institutions, and other interested parties are overwhelmed by the amount of scientific content available online. To take effective advantage of this information, automated approaches to mine, interlink, and visualize scientific documents are fundamental. These tools should go beyond basic keyword-based searches over bibliographic records and aim to extract and interlink fine-grained, semantically rich data from the full text of scientific articles.

To address this challenge we propose ScienTMin, a Web framework that enables a wide range of services for analysing and aggregating the content of scientific publications using Text Mining, Summarization, Retrieval, and Data Visualization techniques. By continuously ingesting articles in several data formats, including PDF, ScienTMin will extract and enrich their content both through deep semantic and linguistic analyses and by retrieving complementary information from online sources (academic social networks, social media accounts, Web portals of universities, personal Web pages of researchers, etc.).

ScienTMin relies on the technology implemented in the Dr Inventor Text Mining Library (EU project Dr Inventor and the María de Maeztu project “Mining the Knowledge of Scientific Publications”) and the SUMMA summarization software (implemented by Dr Saggion and currently licensed by UPF). ScienTMin will provide services to extract, condense, and navigate scientific content. It will be useful for organizations and individual researchers who need to explore huge volumes of scientific data and stay aware of current and past developments in a given field, offering services that produce state-of-the-art reports, assess the value of scientific publications, recommend bibliography, and answer scientific questions. By analysing the content associated with authors, ScienTMin will be able to recommend experts in given areas and support decision making (e.g. hiring). These are only some of the services that are possible.