HIPERGRAPH
High Performance processing for large data sets represented as Graphs (HIPERGRAPH)
Each day we generate more and more data, in sources ranging from the Web to enterprise databases. Handling this ever-growing volume of data is the main motivation for this project. Although there are good techniques for analyzing data stored in relational databases or plain files, the real potential lies in studying the relations within and among different data sets.
The most general (and natural) representation of these relations is a graph where the nodes are data objects and the edges are the relations among them. This graph structure can be present in different ways ranging from explicit (e.g. as pages and hyperlinks on the web) to implicit (e.g. co-authorship in bibliographical networks).
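As an illustration of an implicit graph, the sketch below derives a co-authorship graph from a list of papers. It is a minimal example and not part of the project's software: the function name `coauthorship_graph` and the sample author names are hypothetical, and a plain in-memory adjacency map is assumed rather than any of the compressed or indexed representations the project studies.

```python
from collections import defaultdict
from itertools import combinations


def coauthorship_graph(papers):
    """Build an undirected graph: author -> set of co-authors.

    `papers` is an iterable of author lists, one list per paper.
    Nodes are authors (the data objects); an edge links two authors
    who share at least one paper (the relation).
    """
    graph = defaultdict(set)
    for authors in papers:
        unique_authors = sorted(set(authors))
        for author in unique_authors:
            graph[author]  # touch the key so solo authors still become nodes
        for a, b in combinations(unique_authors, 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


# Hypothetical bibliographic records, one author list per paper.
papers = [
    ["Ana", "Bruno"],
    ["Bruno", "Carla", "Ana"],
    ["Dario"],
]
graph = coauthorship_graph(papers)
```

The edges here are never stored in the input; they only appear once the records are combined, which is what makes the graph implicit. On the Web, by contrast, hyperlinks already are the edges.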
The main goal of this project is to develop new techniques for representing, visualizing, compressing, indexing and mining graphs. To achieve this goal we focus on two dimensions: (1) the functional needs of data processing, and (2) the practical requirements for applying such processing on large volumes of data. The functional aspect includes the development of novel techniques for integration, search, mining, and visualization of data coming from different data sources as well as the preservation of privacy. The practical aspect involves the study of approaches for compression and indexing, parallelization, distributed processing, memory caching, multi-core processing, and the importance of storage.
Ultimately, the results of this project can be applied to data from a wide range of applications, including the Web, (implicit) social networks, geographical databases, text collections, and medical data.
Partners
The HIPERGRAPH project is a collaboration between
- Web Research Group (WRG) at Universitat Pompeu Fabra (UPF)
- Data Management Group (DAMA) at Universitat Politècnica de Catalunya (UPC)
- Laboratorio de Bases de Datos (LBD) at Universidade da Coruña (UdC)
and is funded by the Ministry of Science and Innovation of Spain (MICINN) (grant number TIN2009-14560-C03-01/02/03).
Publications related to the project
- High correlation between Incoming and Outgoing Activity: a distinctive property of OSN?
AAAI Conference on Weblogs and Social Media (2011)
- From Total Hits to Unique Visitors Model for Election's Forecasting
ACM Conference on Web Science (2011)
- Microblogging without Borders: Differences and Similarities
ACM Conference on Web Science (2011)
- Social Based Layouts for the Increase of Locality in Graph Operations
Conference on Database Systems for Advanced Applications (DASFAA), Hong Kong (2011)
- Do All Birds Tweet the Same? Characterizing Twitter Around the World
ACM Conference on Information and Knowledge Management (2011)
- A model for automatic generation of multi-partite graphs from arbitrary data
WAIM'10 Proceedings of the 2010 International Conference of Web-Age Information Management (2010)