Because a content-based image retrieval (CBIR) system serves people, its image characterization and similarity measures must closely follow human perceptual characteristics. In this study, we enumerate a few psychological and physiological invariants and show how a CBIR system can take them into account. We propose distance functions that measure perceptual similarity for color, shape, and spatial distribution. In addition, we believe that an image search engine should be modeled after the human visual system, which adjusts to the environment and adapts to visual goals. We show that the visual front-end can be decomposed into filters of different functions and resolutions. A pipeline of such filters can then be constructed dynamically to meet the requirements of a search task and to adapt to an individual's search objectives.
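
To make the idea of a dynamically constructed filter pipeline concrete, the following is a minimal sketch and not the system described in this study. It assumes a grayscale image stored as nested lists of intensities in [0, 1]. The filter names (`downsample`, `quantize`), the task names in `TASK_RECIPES`, and the helpers `build_pipeline` and `run_pipeline` are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# Assumption: a grayscale image is a list of rows of intensities in [0, 1].
Image = List[List[float]]


@dataclass(frozen=True)
class Filter:
    """One stage of the pipeline: a name plus an image-to-image function."""
    name: str
    apply: Callable[[Image], Image]


def downsample(factor: int) -> Filter:
    """Hypothetical resolution filter: keep every `factor`-th row and column."""
    def run(img: Image) -> Image:
        return [row[::factor] for row in img[::factor]]
    return Filter(f"downsample_x{factor}", run)


def quantize(levels: int) -> Filter:
    """Hypothetical function filter: snap intensities to `levels` values (levels >= 2)."""
    def run(img: Image) -> Image:
        return [
            [min(int(v * levels), levels - 1) / (levels - 1) for v in row]
            for row in img
        ]
    return Filter(f"quantize_{levels}", run)


# Assumption: each search task maps to an ordered list of filters.
TASK_RECIPES: Dict[str, List[Filter]] = {
    "coarse_layout": [downsample(2), quantize(2)],
    "fine_detail": [quantize(4)],
}


def build_pipeline(task: str, extra: Sequence[Filter] = ()) -> List[Filter]:
    """Assemble the filters for a task, then append user-specific filters."""
    return list(TASK_RECIPES[task]) + list(extra)


def run_pipeline(pipeline: Sequence[Filter], img: Image) -> Image:
    """Apply each filter in order, feeding each output to the next stage."""
    for stage in pipeline:
        img = stage.apply(img)
    return img
```

In this sketch, a call such as `run_pipeline(build_pipeline("coarse_layout"), img)` selects the filters for one task, while the `extra` argument stands in for adapting the pipeline to an individual user's objective.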