The Online Conversation Threads Repository (Slashdot, Barrapunto, Wikipedia talk)
List of results published directly linked with the projects co-funded by the Spanish Ministry of Economy and Competitiveness under the María de Maeztu Units of Excellence Program (MDM-2015-0502).
The record for each publication will include access to postprints (following the Open Access policy of the program), as well as datasets and software used. Ongoing work with UPF Library and Informatics will improve the interface and automation of the retrieval of this information soon.
The MdM Strategic Research Program has its own community in Zenodo for material available in this repository as well as at the UPF e-repository.
The Online Conversations Threads Repository. V. Gómez, A. Kaltenbrunner and D. Laniado
http://repositori.upf.edu/handle/10230/26270
This repository contains datasets of online conversation threads collected and analyzed by different researchers. It currently holds datasets from two news aggregators (Slashdot and Barrapunto) and from the talk pages of the English Wikipedia.

Slashdot conversations (Aug 2005 - Aug 2006)
Online conversations generated at Slashdot over one year: posts and comments published between August 26th, 2005 and August 31st, 2006. For each discussion thread: sub-domains, title, topics and hierarchical relations between comments. For each comment: user, date, score and textual content. This dataset is different from the Slashdot Zoo social network contained in the SNAP repository (it is not a signed network of users). It is the full version of the dataset used in the CAW 2.0 (Content Analysis for the WEB 2.0) workshop at the WWW 2009 conference, which can be found in several repositories such as Konect.

Barrapunto conversations (Jan 2005 - Dec 2008)
Online conversations generated at Barrapunto (a Spanish clone of Slashdot) over three years. For each discussion thread: sub-domains, title, topics and hierarchical relations between comments. For each comment: user, date, score and textual content.

Wikipedia (2001 - Mar 2010)
Data from the article discussion (talk) pages of the English Wikipedia as of March 2010. It contains about 9.4 million comments on about 870,000 articles (i.e. all articles that had a corresponding talk page with at least one comment). The oldest comments date back to 2001.
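The fields listed above (user, date, score and textual content for each comment, plus the hierarchical relations between comments) map naturally onto a tree. The following is a minimal sketch in Python of how a thread could be held in memory once loaded. The on-disk format of the datasets is not described here, so the `Comment` class, its field names (`comment_id`, `parent_id`, and so on), the helper functions and the toy records are all hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Comment:
    # Hypothetical record: field names are illustrative, not the dataset's schema.
    comment_id: str
    parent_id: Optional[str]  # None when the comment replies directly to the post
    user: str
    date: str
    score: int
    text: str
    replies: List["Comment"] = field(default_factory=list)


def build_thread(comments: List[Comment]) -> List[Comment]:
    """Attach each comment to its parent and return the top-level comments."""
    by_id: Dict[str, Comment] = {c.comment_id: c for c in comments}
    roots: List[Comment] = []
    for c in comments:
        parent = by_id.get(c.parent_id) if c.parent_id is not None else None
        if parent is None:
            # No parent, or parent missing from the input: treat as top-level.
            roots.append(c)
        else:
            parent.replies.append(c)
    return roots


def depth(comment: Comment) -> int:
    """Number of levels in the subtree rooted at this comment (a leaf has depth 1)."""
    return 1 + max((depth(r) for r in comment.replies), default=0)


def size(comment: Comment) -> int:
    """Number of comments in the subtree rooted at this comment."""
    return 1 + sum(size(r) for r in comment.replies)


# Toy records invented for illustration; they are not taken from the datasets.
toy = [
    Comment("c1", None, "user_a", "2005-08-26", 3, "first comment"),
    Comment("c2", "c1", "user_b", "2005-08-26", 2, "reply to c1"),
    Comment("c3", "c2", "user_a", "2005-08-27", 1, "reply to c2"),
    Comment("c4", None, "user_c", "2005-08-27", 5, "another top-level comment"),
]
roots = build_thread(toy)
```

With the thread stored this way, structural quantities such as the depth or the size of a discussion follow from simple recursive traversals, for example `depth(roots[0])` and `size(roots[0])`.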
When using these data, please use the following references.
The datasets were first analyzed in:
[1] V. Gómez, A. Kaltenbrunner, V. López (2008). Statistical analysis of the social network and discussion threads in Slashdot. Proceedings of the 17th International World Wide Web Conference.
[2] V. Gómez, H. J. Kappen, N. Litvak, A. Kaltenbrunner (2013). A likelihood-based framework for the analysis of discussion threads. World Wide Web 16(5-6):645-675 (also available on arXiv).
[3] D. Laniado, R. Tasso, Y. Volkovich, A. Kaltenbrunner (2011). When the Wikipedians Talk: Network and Tree Structure of Wikipedia Discussion Pages. Fifth International AAAI Conference on Weblogs and Social Media.