Data mining techniques that are successful in transaction and text data may not be simply applied to image data that contain high-dimensional features and have spatial structures. It is not a trivial task to discover meaningful visual patterns in image databases, because the content variations and spatial dependency in the visual data greatly challenge most existing methods. This paper presents a novel approach to coping with these difficulties for mining meaningful visual patterns. Specifically, the novelty of this work lies in the following new contributions: (1) a principled solution to the discovery of meaningful itemsets based on frequent itemset mining; (2) a self-supervised clustering scheme of the high-dimensional visual features by feeding back discovered patterns to tune the similarity measure through metric learning; and (3) a pattern summarization method that deals with the measurement noises brought by the image data. The experimental results in the real images show that our method can discover semantically meaningful patterns efficiently and effectively.