Content-based retrieval (CBR) algorithms have been regarded as a promising access method for digital photo collections, expected sooner or later to replace traditional text-based methods. Unfortunately, there is very little evidence about the usefulness of these algorithms in real user needs and contexts. One problem is that appropriately designed test collections are not available even for basic testing of CBR algorithms. This paper proposes a task-oriented evaluation framework for CBR algorithms and discusses the concept of similarity as a key to well-established evaluation measures. The empirical part of the paper focuses on the analysis of user-perceived similarity of photos in a realistic, but simulated, search and work context. The results show that the selection of potential photos while browsing thumbnail images is based on fairly concrete criteria perceivable at a glance. This is good news for the developers of CBR algorithms. Attributes such as the number of persons in the photo, shooting distance, and colors, but also composition, cropping, and background, were exploited at this stage. When examining enlarged photos, the test persons focused on the facial expressions and gestures of the persons depicted, on the actions taking place, and on the atmosphere of the photo. The experiences gained from the empirical study suggest that work-task-derived, user-perceived similarity assessments can be applied in the proposed evaluation framework for CBR algorithms.