HIPERGRAPH

High Performance processing for large data sets represented as Graphs (HIPERGRAPH)

Every day we generate more data, from Web content to enterprise databases. Coping with this growing volume of data is the main motivation for this project. Although there are good techniques for analyzing data stored in relational databases or plain files, the real potential lies in studying the relations within and among data objects.

The most general (and natural) representation of these relations is a graph in which the nodes are data objects and the edges are the relations among them. This graph structure may be explicit (e.g., pages and hyperlinks on the Web) or implicit (e.g., co-authorship in bibliographic networks).
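As an illustration of an implicit graph, the following minimal Python sketch derives a co-authorship graph from a list of papers. It is not code from the project; the function name coauthorship_graph and the placeholder author names are assumptions made only for this example. Each author becomes a node, and an edge links two authors who share at least one paper.

```python
from itertools import combinations


def coauthorship_graph(papers):
    """Build an undirected co-authorship graph as an adjacency map.

    `papers` is a list of author lists, one per paper (hypothetical input
    format). Returns a dict mapping each author to the set of co-authors.
    """
    graph = {}
    for authors in papers:
        unique_authors = sorted(set(authors))
        # Every author is a node, even on a single-author paper.
        for author in unique_authors:
            graph.setdefault(author, set())
        # Every pair of authors on the same paper is joined by an edge.
        for a, b in combinations(unique_authors, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


# Placeholder data: two papers that share one author.
papers = [
    ["author_a", "author_b", "author_c"],
    ["author_a", "author_d"],
]
graph = coauthorship_graph(papers)
for author in sorted(graph):
    print(author, "->", sorted(graph[author]))
```

The edges here never appear in the input as such; they are inferred from shared authorship, which is what makes the graph implicit. An adjacency map of sets is only one possible representation and would not by itself address the compression and indexing concerns that arise for large graphs.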

The main goal of this project is to develop new techniques for representing, visualizing, compressing, indexing and mining graphs. To achieve this goal we focus on two dimensions: (1) the functional needs of data processing, and (2) the practical requirements for applying such processing to large volumes of data. The functional dimension covers the development of novel techniques for the integration, search, mining, and visualization of data from different sources, as well as the preservation of privacy. The practical dimension covers the study of approaches to compression and indexing, parallelization, distributed processing, memory caching, and multi-core processing, as well as the role of storage.

Ultimately, the results of this project can be applied to data from a wide range of applications, including the Web, (implicit) social networks, geographical databases, text collections, and medical data.

Partners

The HIPERGRAPH project is a collaboration between

and is funded by the Ministry of Science and Innovation of Spain (MICINN) (grant number TIN2009-14560-C03-01/02/03).

Publications related to the project

High correlation between Incoming and Outgoing Activity: a distinctive property of OSN?
Diego Saez-Trumper, David Nettleton and Ricardo Baeza-Yates
AAAI Conference on Weblogs and Social Media (2011)

From Total Hits to Unique Visitors Model for Election's Forecasting
Diego Saez-Trumper, Wagner Meira Jr. and Virgilio Almeida
ACM Conference on Web Science (2011)

Microblogging without Borders: Differences and Similarities
Ruth Garcia, Barbara Poblete, Marcelo Mendoza and Alejandro Jaimes
ACM Conference on Web Science (2011)

Social Based Layouts for the Increase of Locality in Graph Operations
Arnau Prat-Pérez, David Dominguez-Sal and Josep L. Larriba-Pey
Conference on Database Systems for Advanced Applications (DASFAA), Hong Kong (2011)

Do All Birds Tweet the Same? Characterizing Twitter Around the World
Ruth Garcia, Marcelo Mendoza, Barbara Poblete and Alejandro Jaimes
ACM Conference on Information and Knowledge Management (2011)

A model for automatic generation of multi-partite graphs from arbitrary data
Ricardo Baeza-Yates, Nieves Brisaboa and Josep L. Larriba-Pey
WAIM'10 Proceedings of the 2010 International Conference on Web-Age Information Management (2010)

Web Science and Social Computing Research Group
