The goal of the feature extraction method presented in this paper is to obtain a concise, robust, and invariant description of image content for image retrieval. The problem is addressed with a visual attention operator that measures the saliency of image fragments and selects a set of the most salient image objects, each represented by a feature vector, to form a concise image description. The proposed operator, called the image relevance function, is a multi-scale non-linear matched filter whose local maxima give the most salient image locations. At each salient maximum of the relevance function, a feature vector containing both local shape features and intensity features is extracted and normalized. Test results for the retrieval of synthetic and real database images are provided.
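
The pipeline summarized above (a multi-scale saliency measure followed by selection of its local maxima) can be illustrated with a minimal sketch. The sketch does not implement the image relevance function itself, whose definition is not given here. As a stand-in, it assumes a simple center-surround contrast: the absolute difference between the mean intensity in a square window and the mean in the surrounding ring, maximized over a few scales. The function names, the scale set, and the threshold are hypothetical choices made for illustration only.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column at the top/left."""
    h, w = len(img), len(img[0])
    sat = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0.0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_sum(sat, y, x, r):
    """Sum and pixel count of the (2r+1)-square window at (y, x), clipped to the image."""
    h, w = len(sat) - 1, len(sat[0]) - 1
    y0, y1 = max(0, y - r), min(h, y + r + 1)
    x0, x1 = max(0, x - r), min(w, x + r + 1)
    total = sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]
    return total, (y1 - y0) * (x1 - x0)


def relevance(img, scales=(1, 2, 3)):
    """Stand-in saliency map (assumption): max over scales of center-surround contrast.

    Returns the map and, per pixel, the scale at which the maximum was reached.
    """
    sat = integral_image(img)
    h, w = len(img), len(img[0])
    rel = [[0.0] * w for _ in range(h)]
    best = [[scales[0]] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for r in scales:
                s_in, n_in = box_sum(sat, y, x, r)
                s_out, n_out = box_sum(sat, y, x, 2 * r)
                n_ring = n_out - n_in
                if n_ring == 0:
                    continue  # window and surround coincide (tiny image)
                contrast = abs(s_in / n_in - (s_out - s_in) / n_ring)
                if contrast > rel[y][x]:
                    rel[y][x], best[y][x] = contrast, r
    return rel, best


def salient_points(rel, best, threshold):
    """Strict 8-neighbour local maxima of the map with value >= threshold.

    Each point is returned as (row, column, scale, saliency).
    """
    h, w = len(rel), len(rel[0])
    points = []
    for y in range(h):
        for x in range(w):
            v = rel[y][x]
            if v < threshold:
                continue
            neighbours = [
                rel[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
                if (j, i) != (y, x)
            ]
            if all(v > n for n in neighbours):
                points.append((y, x, best[y][x], v))
    return points
```

Calling `relevance(img)` on an image stored as a list of rows of floats, and then `salient_points(rel, best, threshold)`, yields candidate locations with an associated scale and saliency value. In the method described above, a normalized shape and intensity feature vector would be extracted at each such location; that step is not sketched here.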