High Performance processing for large data sets represented as Graphs (HIPERGRAPH)

Each day we generate more and more data, from the Web to enterprise databases. Handling this ever-growing volume of data is the main motivation for this project. Although there are good techniques for analyzing data stored in relational databases or plain files, the real potential lies in studying the relations within or among different data sets.

The most general (and natural) representation of these relations is a graph in which the nodes are data objects and the edges are the relations among them. This graph structure can appear in different forms, ranging from explicit (e.g. pages and hyperlinks on the Web) to implicit (e.g. co-authorship in bibliographic networks).
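As an illustration of an implicit graph, the following minimal sketch derives a co-authorship graph from a list of papers. It is not code from the project: the function name `coauthorship_graph`, the input format (one list of author names per paper), and the sample names are all hypothetical and chosen only for this example.

```python
from collections import defaultdict
from itertools import combinations


def coauthorship_graph(papers):
    """Build an undirected graph as {author: set of co-authors}.

    `papers` is assumed to be an iterable of author-name lists,
    one list per paper (a hypothetical input format).
    """
    graph = defaultdict(set)
    for authors in papers:
        unique_authors = sorted(set(authors))
        for author in unique_authors:
            # Touch the key so single-author papers still produce a node.
            graph[author]
        # Every pair of authors on the same paper is linked by an edge.
        for a, b in combinations(unique_authors, 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


# Hypothetical bibliography: the edges are never stored explicitly,
# they are implied by shared authorship.
papers = [
    ["Ana", "Ben", "Carla"],
    ["Ben", "Dana"],
    ["Eli"],
]
graph = coauthorship_graph(papers)
```

Here the nodes are authors and the edges exist only because two authors share a paper. For an explicit graph such as Web pages and hyperlinks, the same adjacency structure could be filled directly from the links themselves.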

The main goal of this project is to develop new techniques for representing, visualizing, compressing, indexing, and mining graphs. To achieve this goal we focus on two dimensions: (1) the functional needs of data processing, and (2) the practical requirements for applying such processing to large volumes of data. The functional aspect includes the development of novel techniques for the integration, search, mining, and visualization of data coming from different sources, as well as the preservation of privacy. The practical aspect involves the study of approaches to compression and indexing, parallelization, distributed processing, memory caching, multi-core processing, and the role of storage.

Ultimately, the results of this project can be applied to data from a wide range of applications, including the Web, (implicit) social networks, geographical databases, text collections, and medical data.

Partners

The HIPERGRAPH project is a collaboration between

and is funded by the Ministry of Science and Innovation of Spain (MICINN) (grant number TIN2009-14560-C03-01/02/03).

 

Publications related to the project