Urbano J, Marrero M. Toward Estimating the Rank Correlation between the Test Collection Results and the True System Performance. International ACM SIGIR Conference on Research and Development in Information Retrieval, 2016

The Kendall τ and AP (τ_AP) rank correlation coefficients have become mainstream in Information Retrieval research for comparing the rankings of systems produced by two different evaluation conditions, such as different effectiveness measures or pool depths. However, in this paper we focus on the expected rank correlation between the mean scores observed with a test collection and the true unobservable means under the same conditions. In particular, we propose statistical estimators of the τ and τ_AP correlations following both parametric and non-parametric approaches, with special emphasis on small topic sets. Through large-scale simulation with TREC data, we study the error and bias of the estimators. In general, such estimates of expected correlation with the true ranking may accompany the results reported from an evaluation experiment, as an easy-to-understand figure of reliability. All the results in this paper are fully reproducible with data and code available online.
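
As a rough illustration of the two coefficients named in the abstract, the sketch below computes Kendall τ and τ_AP between two sets of mean system scores. It is not taken from the paper's released code: the function names and the toy scores are hypothetical, ties are assumed absent, and τ_AP is written in its commonly used top-weighted form, where each system is compared only against the systems ranked above it in the observed ranking.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall tau-a between two score lists over the same systems (no ties assumed)."""
    n = len(x)
    s = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        if prod > 0:
            s += 1   # concordant pair
        elif prod < 0:
            s -= 1   # discordant pair
    return s / (n * (n - 1) / 2)


def tau_ap(observed, truth):
    """tau_AP of the observed ranking with respect to the reference ranking."""
    n = len(observed)
    # System indices sorted by observed score, best first.
    order = sorted(range(n), key=lambda s: observed[s], reverse=True)
    total = 0.0
    for i in range(1, n):
        current = order[i]
        # Systems placed above `current` by the observed scores that the
        # reference scores also place above it.
        correct = sum(1 for above in order[:i] if truth[above] > truth[current])
        total += correct / i
    return 2 * total / (n - 1) - 1


# Hypothetical mean scores for four systems; the two best systems are swapped.
observed = [0.30, 0.25, 0.20, 0.10]
truth = [0.28, 0.29, 0.15, 0.12]

print(round(kendall_tau(observed, truth), 3))  # 0.667
print(round(tau_ap(observed, truth), 3))       # 0.333
```

In this toy case a single swap at the top of the ranking costs more under τ_AP than under τ, which is the behaviour that makes τ_AP attractive when the best systems matter most.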

Keywords: Evaluation; Test Collection; Correlation; Kendall; Average Precision; Estimation

Downloads:

Full text: PDF
Citation: BibTEX
Code and data: GitHub

Link: http://julian-urbano.info/publications/066-toward-estimating-rank-correlation-test-collection-results-true-system-performance.html

DTIC MdM Strategic Program: Artificial and Natural Intelligence for ICT and beyond
