
Refereed full papers (journals, book chapters, international conferences)

2003

  • Henning Müller, Wolfgang Müller, Stéphane Marchand-Maillet, David McG. Squire and Thierry Pun, A Framework for Benchmarking in CBIR, Multimedia Tools and Applications, vol. 21, no. 1, pp. 55-73, September 2003. (Special Issue: Best Papers of the ACM Multimedia 2001 Workshop on Multimedia Information Retrieval)

    Content-based image retrieval (CBIR) has been a very active research area for more than ten years. In the last few years the number of publications and retrieval systems produced has grown considerably. Despite this, there is still no agreed, objective way to compare the performance of any two of these systems. This is blocking further development of the field, since good or promising techniques cannot be identified objectively, and the potential commercial success of CBIR systems is hindered because it is hard to establish the quality of an application.

    We are thus in the position in which other research areas, such as text retrieval or database systems, found themselves several years ago. To have serious applications, as well as commercial success, objective proof of system quality is needed: in text retrieval the TREC benchmark is a widely accepted performance measure; in the transaction processing field for databases it is the TPC benchmark that has wide support.

    This paper describes a framework that enables the creation of a benchmark for CBIR. Parts of this framework have already been developed, and systems can be evaluated against a small, freely available database via a web interface. Much work remains to be done with respect to making available large, diverse image databases and obtaining relevance judgments for them. We also need to establish an independent body, accepted by the entire community, that would organize a benchmarking event, publish official results, and update the benchmark regularly. The Benchathlon could take on this role if it manages to gain the confidence of the field. This should also prevent the negative effects, e.g. "benchmarketing", experienced with other benchmarks, such as the TPC predecessors.

    This paper sets out our ideas for an open framework for performance evaluation. We hope to stimulate discussion on evaluation in image retrieval so that systems can be compared on the same grounds. We also identify query paradigms beyond query by example (QBE) that may be integrated into a benchmarking framework, and we give examples of application-based benchmarking areas.