Benchmarking has always been a crucial problem in content-based image retrieval systems (CBIRSs). A key issue is the lack of a common access method to retrieval systems, comparable to what SQL provides for relational databases. The Multimedia Retrieval Mark-up Language (MRML), described in this article, solves this problem by standardizing access to CBIRSs. Other difficult problems are also addressed, such as obtaining relevance judgments and choosing a database for performance comparison. We present a fully automated benchmark for CBIRSs based on MRML, which can be adapted to any image database and almost any kind of relevance judgment. The benchmark evaluates the performance of positive and negative relevance feedback, which can be generated automatically from the relevance judgments. As an illustration, we evaluate our own CBIRS, Viper, on a freely available, non-copyrighted image collection. All scripts described here are freely available for download.